Commit graph

27 commits

Author SHA1 Message Date
Ali Mohammad Pur
17087ac4a2 LibJS: Unescape incorrectly escaped code units in regex patterns
We were translating the pattern [\⪾-\⫀] to [\\u2abe-\\u2ac0], which
is a very different pattern; as a code unit converted to the \uhhh
format has no meaning when escaped, this commit makes us simply skip
escaping it when translating the pattern.
2023-09-16 15:21:09 +02:00
Ali Mohammad Pur
bcfbe0fbf7 LibJS: Manually loop over escaped regex pattern instead of ::replace()
This makes it ever-so-slightly faster, but more importantly, it fixes
the bug where a `/\//` regex's `source` property would return `\\/`
("\\\\/") instead of `\/` due to the existing '/' -> '\/' replace()
call.
2023-02-16 21:03:19 +01:00
leeight
0d96468e9b LibJS: Implement RegExp legacy static properties
RegExp legacy static properties Spec url is https://github.com/tc39/proposal-regexp-legacy-features
2022-10-17 17:08:33 +02:00
Timothy Flynn
a803d9226f LibJS: Always access RegExp flags by its "flags" property
This is a normative change in the ECMA-262 spec. See:
https://github.com/tc39/ecma262/commit/35b7eb2

Note there is a bit of weirdness between the mainline spec and the set
notation proposal as the latter has not been updated with this change.
For now, this implements what the spec PR and other prototypes indicate
how the proposal will behave.
2022-08-25 16:39:45 +01:00
Ali Mohammad Pur
f4b26b0cea LibJS: Hook up the 'v' (unicodeSets) RegExp flag 2022-07-20 21:25:59 +01:00
Timothy Flynn
2530b6adf0 LibJS: Create the RegExpExec result's "input" field last
We move the input string into this field to avoid a string copy, so we
must do this step last to avoid using any views into it (note that
match.view here is a view into this string).
2021-11-08 01:36:29 +01:00
Timothy Flynn
6337eb52d8 LibJS: Implement RegExp.prototype.compile
This is an Annex B extension to RegExp.prototype.
2021-08-20 19:16:33 +02:00
Timothy Flynn
b3569fab7c LibJS: Add tests for Unicode property escapes
LibJS gets this for free from LibRegex, but let's add test cases for it.
2021-07-30 21:26:31 +01:00
Timothy Flynn
f1dd770a8a LibJS: Parse RegExp literals at AST creation time, not execution time
The spec requires that invalid RegExp literals must cause a Syntax Error
before the JavaScript is executed. See:
https://tc39.es/ecma262/#sec-patterns-static-semantics-early-errors

This is explicitly tested in the RegExp/property-escapes test262 tests.
For example, see unsupported-property-Line_Break.js:

    $DONOTEVALUATE();
    /\p{Line_Break}/u;

That RegExp literal is invalid because Line_Break is not a supported
Unicode property. $DONOTEVALUATE() just throws an exception when it is
executed. The test expects that this file will fail to be parsed.

Note that RegExp patterns can still be parsed at execution time by way
of "new RegExp(...)".
2021-07-30 21:26:31 +01:00
Timothy Flynn
6c67de8186 LibJS: Implement RegExp.prototype.hasIndices proposal
https://tc39.es/proposal-regexp-match-indices/
2021-07-10 16:49:35 +01:00
Timothy Flynn
d1e06b00e3 LibJS: Parse the RegExp.prototype.hasIndices flag 2021-07-10 16:49:35 +01:00
Timothy Flynn
3892b6e6ec LibJS: Implement RegExp constructor according to the spec
This allows passing an existing RegExp object (or an object that is
sufficiently like a RegExp object) as the "pattern" argument of the
RegExp constructor.
2021-07-09 19:45:55 +01:00
Timothy Flynn
6c53475143 LibJS: Implement RegExp.prototype.test with RegExpExec abstraction 2021-07-08 00:01:20 +01:00
Linus Groh
d85b9fd5a0 LibJS: Bring back runtime validation of RegExp flags
This is a partial revert of commit 60064e2, which removed the validation
of RegExp flags during runtime and expected the parser to do that
exclusively - however this was not taking into account the RegExp()
constructor, which was subsequently crashing on invalid flags.

Also adds test for these constructor error cases, which were obviously
missing before.

Fixes #7042.
2021-05-11 22:47:14 +01:00
Linus Groh
60064e2049 LibJS: Make invalid RegExp flags a SyntaxError at parse time
This patch changes the validation of RegExp flags (checking for
invalid and duplicate values) from a SyntaxError at runtime to a
SyntaxError at parse time - it's not something that's supposed to be
catchable.
As a nice side effect, this simplifies the RegExpObject constructor a
bit, as it can no longer throw an exception and doesn't have to validate
the flags itself.
2021-05-10 12:01:38 +01:00
Ali Mohammad Pur
bf9c04a3da LibRegex: Implement multiline stateful matches 2021-04-23 10:05:04 +02:00
Idan Horowitz
6cd318d784 LibJS: Convert matched regex result to string in Symbol.replace
This would crash on an undefined match (no match), since the matched
result was assumed to be a string (such as on discord.com). The spec
suggests converting it to a string as well:
https://tc39.es/ecma262/#sec-regexp.prototype-@@replace (14#c)
2021-04-17 16:10:45 +02:00
AnotherTest
5a14f7ea2f LibRegex: Generate a 'Compare' op for empty character classes
Otherwise it would match zero-length strings.
Fixes #6256.
2021-04-12 08:54:58 +02:00
AnotherTest
1b071455b1 LibRegex: Treat brace quantifiers with invalid contents as literals
Fixes #6208.
2021-04-10 09:16:03 +02:00
AnotherTest
e9279d1790 LibRegex: Allow a '?' suffix for brace quantifiers
This fixes another compat point in #6042.
2021-04-10 09:16:03 +02:00
AnotherTest
ade97d4094 LibRegex: Make sure there are as many group matches as actual matches
Fixes #6131.
2021-04-05 09:02:06 +02:00
AnotherTest
1bdc1cf77e LibRegex: Consider named capture groups as normal capture groups too 2021-04-05 09:02:06 +02:00
AnotherTest
76f63c2980 LibRegex: Allocate entries for all capture groups in RegexResult
Not just the seen ones.
Fixes #6108.
2021-04-04 16:04:06 +02:00
Linus Groh
e46fa3ac8b LibJS: Keep RegExp.exec() results in correct order
By using regex::AllFlags::SkipTrimEmptyMatches we get a null string for
unmatched capture groups, which we then turn into an undefined entry in
the result array instead of putting all matches first and appending
undefined for the remaining number of capture groups - e.g. for

    /foo(ba((r)|(z)))/.exec("foobaz")

we now return

    ["foobaz", "baz", "z", undefined, "z"]

and not [

    ["foobaz", "baz", "z", "z", undefined]

Fixes part of #6042.

Also happens to fix selecting an element by ID using jQuery's $("#foo").
2021-04-03 16:34:34 +02:00
AnotherTest
6bbb26fdaf LibRegex: Allow references to capture groups that aren't parsed yet
This only applies to the ECMA262 parser.
This behaviour is an ECMA262-specific quirk, such references always
generate zero-length matches (even on subsequent passes).
Also adds a test in LibJS's test suite.

Fixes #6039.
2021-04-01 21:55:47 +02:00
Linus Groh
83c29bd8d7 LibJS: Don't assume match for each capture group in RegExp.prototype.exec()
This was not implementing the following part of the spec correctly:

    27. For each integer i such that i ≥ 1 and i ≤ n, do
        a. Let captureI be ith element of r's captures List.
        b. If captureI is undefined, let capturedValue be undefined.

Expecting a capture group match to exist for each of the RegExp's
capture groups would assert in Vector's operator[] if that's not the
case, for example:

    /(foo)(bar)?/.exec("foo")

Append undefined instead.

Fixes #5256.
2021-02-08 18:01:23 +01:00
Andreas Kling
13d7c09125 Libraries: Move to Userland/Libraries/ 2021-01-12 12:17:46 +01:00