beenull/ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2024-11-26 01:20:25 +00:00

Author	SHA1	Message	Date
Timothy Flynn	1e10d6d7ce	LibRegex: Support property escapes of Unicode General Categories This changes LibRegex to parse the property escape as a Variant of Unicode Property & General Category values. A byte code instruction is added to perform matching based on General Category values.	2021-08-02 21:02:09 +04:30
Ali Mohammad Pur	85d87cbcc8	LibRegex: Add some tests for Fork{Stay,Jump} performance Without the previous fixes, these will blow up the stack.	2021-08-02 17:22:50 +04:30
Timothy Flynn	d485cf29d7	LibRegex+LibUnicode: Begin implementing Unicode property escapes This supports some binary property matching. It does not support any properties not yet parsed by LibUnicode, nor does it support value matching (such as Script_Extensions=Latin).	2021-07-30 21:26:31 +01:00
Timothy Flynn	345ef6abba	LibRegex: Support ECMA-262 Unicode escapes of the form "\u{code_point}" When the Unicode flag is set, regular expressions may escape code points by surrounding the hexadecimal code point with curly braces, e.g. \u{41} is the character "A". When the Unicode flag is not set, this should be considered a repetition symbol - \u{41} is the character "u" repeated 41 times. This is left as a TODO for now.	2021-07-23 23:06:57 +01:00
Timothy Flynn	47f6bb38a1	LibRegex: Support UTF-16 RegexStringView and improve Unicode matching When the Unicode option is not set, regular expressions should match based on code units; when it is set, they should match based on code points. To do so, the regex parser must combine surrogate pairs when the Unicode option is set. Further, RegexStringView needs to know if the flag is set in order to return code point vs. code unit based string lengths and substrings.	2021-07-23 23:06:57 +01:00
Ali Mohammad Pur	f364fcec5d	LibRegex+Everywhere: Make LibRegex more unicode-aware This commit makes LibRegex (mostly) capable of operating on any of the three main string views: - StringView for raw strings - Utf8View for utf-8 encoded strings - Utf32View for raw unicode strings As a result, regexps with unicode strings should be able to properly handle utf-8 and not stop in the middle of a code point. A future commit will update LibJS to use the correct type of string depending on the flags.	2021-07-18 21:10:55 +04:30
Ali Mohammad Pur	e5af15a6e9	LibRegex: Don't do out-of-bound match accesses when a test fails	2021-07-18 21:10:55 +04:30
Ali Mohammad Pur	1c584e9d80	LibRegex: Correctly parse BRE bracket expressions Commonly, bracket expressions are, in fact, enclosed in brackets.	2021-07-10 22:58:24 +04:30
Ali Mohammad Pur	daa6d99e6e	LibRegex: Add support for non-extended regular expressions in regcomp() Fixes part of #8506.	2021-07-10 13:33:08 +02:00
Timothy Flynn	65003241e4	LibRegex: Allow dollar signs in ECMA262 named capture groups Fixes 1 test262 test.	2021-07-06 22:33:17 +01:00
sin-ack	9a9e7f03f2	Tests: Add test for case-insensitive matching	2021-06-16 16:30:12 +04:30
Andrew Kaster	55d338b66f	Tests: Free all memory allocated with regcomp in RegexLibC tests The C interface (POSIX interface?) for regexes has no "initialize" function, only a free function. The comment in regcomp in LibRegex/C/Regex.cpp notes that calling regcomp without a regfree is an error, and will leak memory. Every single time regcomp is called on a regex_t*, it will allocate new memory. Make sure that all the regcomp calls are paired with a regfree in the test program.	2021-05-14 08:34:00 +01:00
Brian Gianforcaro	6e918e4e02	Tests: Move LibRegex tests to Tests/LibRegex	2021-05-06 17:54:28 +02:00
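
The regcomp/regfree pairing described in commit 55d338b66f can be illustrated with a minimal sketch against the standard POSIX `<regex.h>` interface. The helper name `match_once` is hypothetical and does not come from LibRegex or its tests. The sketch assumes POSIX semantics, where `regfree` is called only after a successful `regcomp`; whether a failed `regcomp` also needs a `regfree` is implementation-specific and is not assumed here.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of LibRegex): compile a pattern, test it
 * against one string, and release the compiled pattern again.
 * Returns 1 on match, 0 on no match, -1 if the pattern does not compile. */
int match_once(const char *pattern, const char *text)
{
    regex_t re;

    /* REG_EXTENDED selects extended syntax; REG_NOSUB skips sub-match
     * bookkeeping because only a yes/no answer is needed. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);

    /* Every successful regcomp allocates memory inside re, so it is
     * paired with exactly one regfree before re goes out of scope. */
    regfree(&re);

    return rc == 0 ? 1 : 0;
}
```

Keeping the compile, match, and free steps in one function makes it hard to reuse a `regex_t` for a second `regcomp` without freeing it first, which is the leak the commit message warns about.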


113 commits