23 commits

Author SHA1 Message Date
Andreas Kling
21db2b7b90 Everywhere: Remove NonnullOwnPtr.h includes 2023-03-06 23:46:35 +01:00
Sam Atkins
6c66fd5ffb LibRegex: Remove declarations for non-existent methods 2023-01-27 20:33:18 +00:00
Linus Groh
57dc179b1f Everywhere: Rename to_{string => deprecated_string}() where applicable
This will make it easier to support both string types at the same time
while we convert code, and to track down remaining uses.

One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
2022-12-06 08:54:33 +01:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
Timothy Flynn
3729fd06fa LibRegex: Do not return an Optional from Regex::Matcher::execute
The code path that could return an optional no longer exists as of
commit: a962ee020a
2022-02-05 19:06:50 +03:30
Ali Mohammad Pur
d2e51fafa9 LibRegex: Merge alternations based on blocks and not instructions
The instructions can have dependencies (e.g. Repeat), so only unify
equal blocks instead of consecutive instructions.
Fixes #11247.

Also adds the minimal test case(s) from that issue.
2021-12-15 19:36:45 +03:30
Andreas Kling
8b1108e485 Everywhere: Pass AK::StringView by value 2021-11-11 01:27:46 +01:00
Brian Gianforcaro
fdea5e1628 LibRegex: Pass RegexStringView and Vector<RegexStringView> by reference
Flagged by pvs-studio, it looks like these were intended to be passed by
reference originally, but it was missed. This avoids excessive argument
copying when searching / matching in the regex API.

Before:

    Command: /usr/Tests/LibRegex/Regex --bench
    Average time: 5998.29 ms (median: 5991, stddev: 102.18)

After:

    Command: /usr/Tests/LibRegex/Regex --bench
    Average time: 5623.2 ms (median: 5623, stddev: 86.25)
2021-09-16 17:17:13 +02:00
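To illustrate the cost this commit removes, here is a minimal sketch that counts element copies when a vector is passed by value versus by const reference. `CountingView`, `g_copies` and both functions are hypothetical stand-ins invented for this example; they are not LibRegex types, and the numbers say nothing about the benchmark above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical global counter: incremented every time a CountingView is copied.
std::size_t g_copies = 0;

// Hypothetical stand-in for a view-like argument type (not RegexStringView).
struct CountingView {
    std::string text;

    explicit CountingView(std::string s) : text(std::move(s)) {}
    CountingView(CountingView const& other) : text(other.text) { ++g_copies; }
    CountingView& operator=(CountingView const&) = default;
};

// Taking the vector by value copies every element at each call.
std::size_t total_length_by_value(std::vector<CountingView> views)
{
    std::size_t total = 0;
    for (auto const& view : views)
        total += view.text.size();
    return total;
}

// Taking the vector by const reference copies nothing.
std::size_t total_length_by_reference(std::vector<CountingView> const& views)
{
    std::size_t total = 0;
    for (auto const& view : views)
        total += view.text.size();
    return total;
}
```

Calling the by-value variant with a three-element vector bumps the counter by three, while the by-reference variant leaves it unchanged.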
Ali Mohammad Pur
246ab432ff LibRegex: Add a basic optimization pass
This currently tries to convert forking loops to atomic groups, and
unify the left side of alternations.
2021-09-13 14:38:53 +04:30
Timothy Flynn
02e3633b7f AK: Move FormatParser definition from header to implementation file
This is primarily to be able to remove the GenericLexer include out of
Format.h as well. A subsequent commit will add AK::Result to
GenericLexer, which will cause naming conflicts with other structures
named Result. This can be avoided (for now) by preventing nearly every
file in the system from implicitly including GenericLexer.

Other changes in this commit are to add the GenericLexer include to
files where it is missing.
2021-08-19 23:49:25 +02:00
Timothy Flynn
a0b72f5ad3 LibRegex: Remove (mostly) unused regex::MatchOutput
This struct holds a counter for the number of executed operations, and
vectors for matches, capture groups, and named capture groups. Each of
the vectors is unused. Remove the struct and just keep a separate
counter for the executed operations.
2021-08-15 11:43:45 +01:00
Timothy Flynn
f1ce998d73 LibRegex+LibJS: Combine named and unnamed capture groups in MatchState
Combining these into one list helps reduce the size of MatchState, and
as a result, reduces the amount of memory consumed during execution of
very large regex matches.

Doing this also allows us to remove a few regex byte code instructions:
ClearNamedCaptureGroup, SaveLeftNamedCaptureGroup, and NamedReference.
Named groups now behave the same as unnamed groups for these operations.
Note that SaveRightNamedCaptureGroup still exists to cache the matched
group name.

This also removes the recursion level from the MatchState, as it can
exist as a local variable in Matcher::execute instead.
2021-08-15 11:43:45 +01:00
Ali Mohammad Pur
d5984d296f LibRegex: Make Matcher<>::match(Vector<>) take a reference to the vector
It was previously copying the entire vector every time, which is not a
nice thing to do. :^)
2021-08-02 17:22:50 +04:30
Ali Mohammad Pur
5f342e4fa9 LibRegex: Make Fork{Jump,Stay} non-recursive
This makes very fork-heavy expressions (like `(aa)*`) not run out of
stack space when matching very long strings.
2021-08-02 17:22:50 +04:30
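The idea of a non-recursive fork can be sketched with a toy backtracking machine that keeps pending alternatives on a heap-allocated stack instead of the call stack. Everything below (`Op`, `Instr`, `build_pairs_program`, `run_program`) is hypothetical and much simpler than LibRegex's real bytecode; it only models a single fork instruction and the pattern `^(aa)*$`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy instruction set (not LibRegex's bytecode).
enum class Op { Char, Fork, Jump, MatchEnd };

struct Instr {
    Op op;
    char c;             // used by Char
    std::size_t target; // used by Fork and Jump
};

// Program for ^(aa)*$ : try the loop body first, fall back to the exit.
std::vector<Instr> build_pairs_program()
{
    return {
        { Op::Fork, '\0', 4 },    // 0: continue at 1, remember 4 as the alternative
        { Op::Char, 'a', 0 },     // 1
        { Op::Char, 'a', 0 },     // 2
        { Op::Jump, '\0', 0 },    // 3: loop back to the fork
        { Op::MatchEnd, '\0', 0 } // 4: succeed only at end of input
    };
}

bool run_program(std::vector<Instr> const& program, std::string const& input)
{
    struct State { std::size_t pc; std::size_t pos; };
    // Pending alternatives live in a vector, so deep backtracking grows the
    // heap rather than the call stack.
    std::vector<State> pending;
    pending.push_back({ 0, 0 });

    while (!pending.empty()) {
        State state = pending.back();
        pending.pop_back();
        bool alive = true;
        while (alive) {
            Instr const& instr = program[state.pc];
            switch (instr.op) {
            case Op::Char:
                if (state.pos < input.size() && input[state.pos] == instr.c) {
                    ++state.pos;
                    ++state.pc;
                } else {
                    alive = false; // this path failed; try the next alternative
                }
                break;
            case Op::Fork:
                pending.push_back({ instr.target, state.pos });
                ++state.pc;
                break;
            case Op::Jump:
                state.pc = instr.target;
                break;
            case Op::MatchEnd:
                if (state.pos == input.size())
                    return true;
                alive = false;
                break;
            }
        }
    }
    return false;
}
```

With this shape, an input of a few hundred thousand characters only grows the `pending` vector; a recursive implementation would add one call frame per fork instead.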
Timothy Flynn
1400e3cf58 LibRegex: Allow separately parsing patterns and creating Regex objects
Adds a static method to parse a regex pattern and return the result, and
a constructor to accept a parse result. This is to allow LibJS to parse
the pattern string of a RegExpLiteral once and hand off regex objects
any number of times thereafter.
2021-07-30 21:26:31 +01:00
Timothy Flynn
b162517065 LibRegex: Take ownership of pattern string and fix move operations
The Regex object created a copy of the pattern string anyways, so tweak
the constructor to allow callers to move() pattern strings into the
regex.

The Regex move constructor and assignment operator currently result in
memory corruption. The Regex object stores a Matcher object, which holds
a reference to the Regex object. So when the Regex object is moved, that
reference is no longer valid. To fix this, the reference stored in the
Matcher must be updated when the Regex is moved.
2021-07-30 21:26:31 +01:00
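The move problem described in this commit can be sketched as follows. `Pattern` and `BackPointerMatcher` are hypothetical names invented for this example, and a raw pointer is assumed in place of the reference the commit mentions, since a C++ reference cannot be reseated; the actual fix in LibRegex may look different.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Pattern;

// Hypothetical matcher that points back at the object owning it.
struct BackPointerMatcher {
    Pattern const* owner { nullptr };
};

// Hypothetical owner type standing in for the regex object.
struct Pattern {
    std::string source;
    BackPointerMatcher matcher;

    // Takes the pattern string by value so callers can move() into it.
    explicit Pattern(std::string s)
        : source(std::move(s))
    {
        matcher.owner = this;
    }

    // A defaulted move would copy the old back-pointer, leaving it pointing
    // at the moved-from object. Re-point it at the new owner instead.
    Pattern(Pattern&& other)
        : source(std::move(other.source))
        , matcher(other.matcher)
    {
        matcher.owner = this;
    }

    Pattern& operator=(Pattern&& other)
    {
        if (this != &other) {
            source = std::move(other.source);
            matcher = other.matcher;
            matcher.owner = this;
        }
        return *this;
    }
};
```

After either move operation, `matcher.owner` refers to the destination object, so the matcher never dereferences a stale owner.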
Ali Mohammad Pur
36bfc912fc LibRegex: Switch to east-const style 2021-07-23 21:19:21 +04:30
Timothy Flynn
0f0ac37b56 LibRegex: Break from execution loop when the sticky flag is set
If the sticky flag is set, the regex execution loop should break
immediately even if the execution was a failure. The specification for
several RegExp.prototype methods (e.g. exec and @@split) relies on this
behavior.
2021-07-09 19:45:55 +01:00
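As a rough illustration of the sticky behavior, the sketch below uses a plain substring search in place of a regex engine. `find_match` and its parameters are hypothetical; the only point is that a sticky search gives up after the first failed attempt instead of advancing to later start positions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical search loop: returns the index where `needle` matches, or
// std::string::npos. With `sticky` set, only the position `start` is tried.
std::size_t find_match(std::string const& haystack, std::string const& needle,
                       std::size_t start, bool sticky)
{
    for (std::size_t pos = start; pos <= haystack.size(); ++pos) {
        if (haystack.compare(pos, needle.size(), needle) == 0)
            return pos;
        // Sticky: break immediately, even though this attempt failed.
        if (sticky)
            break;
    }
    return std::string::npos;
}
```

For example, searching `"xxab"` for `"ab"` from index 0 succeeds at index 2 without the flag, but fails with it, because only index 0 is attempted.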
Andrew Kaster
5e8a0c014e LibRegex: Make regex::Regex move-constructible and move-assignable
For some reason the default move constructor and default move-assign
operator were deleted, so we explicitly default them instead.
2021-06-30 08:18:28 +04:30
Linus Groh
d60ebbbba6 Revert "Userland: static vs non-static constexpr variables"
This reverts commit 800ea8ea96.

Booting the system no longer worked after these changes.
2021-05-21 10:30:52 +01:00
Lenny Maiorani
800ea8ea96 Userland: static vs non-static constexpr variables
Problem:
- `static` variables consume memory and sometimes are less
  optimizable.
- `static const` variables can be `constexpr`, usually.
- `static` function-local variables require an initialization check
  every time the function is run.

Solution:
- If a global `static` variable is only used in a single function then
  move it into the function and make it non-`static` and `constexpr`.
- Make all global `static` variables `constexpr` instead of `const`.
- Change function-local `static const[expr]` variables to be just
  `constexpr`.
2021-05-21 10:07:06 +01:00
Brian Gianforcaro
1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

    ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00
Andreas Kling
13d7c09125 Libraries: Move to Userland/Libraries/ 2021-01-12 12:17:46 +01:00
Renamed from Libraries/LibRegex/RegexMatcher.h