0ct0pu5/ladybird

Author	SHA1	Message	Date
Andreas Kling	0ad4be3d78	LibJS: Skip redundant UTF-8 validation in rope string resolution When resolving a rope, we've already taken care to resolve it to a UTF-8 byte stream. There's no need to do a separate pass just for validating the data again. This was noticeable in some profiles. I made a simple microbenchmark that gets a 30% speed-up: ("x" + "y".repeat(100_000_000)).trimStart()	2023-12-30 13:49:50 +01:00
Ali Mohammad Pur	5e1499d104	Everywhere: Rename {Deprecated => Byte}String This commit un-deprecates DeprecatedString, and repurposes it as a byte string. As the null state has already been removed, there are no other particularly hairy blockers in repurposing this type as a byte string (what it _really_ is). This commit is auto-generated: $ xs=$(ack -l \bDeprecatedString\b\\|deprecated_string AK Userland \ Meta Ports Ladybird Tests Kernel) $ perl -pie 's/\bDeprecatedString\b/ByteString/g; s/deprecated_string/byte_string/g' $xs $ clang-format --style=file -i \ $(git diff --name-only \| grep \.cpp\\|\.h) $ gn format $(git ls-files '.gn' '.gni')	2023-12-17 18:25:10 +03:30
Andreas Kling	eda2a6d9f7	LibJS: Don't die when making PrimitiveString from "" DeprecatedFlyString	2023-11-29 09:48:18 +01:00
Andreas Kling	3c74dc9f4d	LibJS: Segregate GC-allocated objects by type This patch adds two macros to declare per-type allocators: - JS_DECLARE_ALLOCATOR(TypeName) - JS_DEFINE_ALLOCATOR(TypeName) When used, they add a type-specific CellAllocator that the Heap will delegate allocation requests to. The result of this is that GC objects of the same type always end up within the same HeapBlock, drastically reducing the ability to perform type confusion attacks. It also improves HeapBlock utilization, since each block now has cells sized exactly to the type used within that block. (Previously we only had a handful of block sizes available, and most GC allocations ended up with a large amount of slack in their tails.) There is a small performance hit from this, but I'm sure we can make up for it elsewhere. Note that the old size-based allocators still exist, and we fall back to them for any type that doesn't have its own CellAllocator.	2023-11-19 12:10:31 +01:00
Ali Mohammad Pur	aeee98b3a1	AK+Everywhere: Remove the null state of DeprecatedString This commit removes DeprecatedString's "null" state, and replaces all its users with one of the following: - A normal, empty DeprecatedString - Optional<DeprecatedString> Note that null states of DeprecatedFlyString/StringView/etc are not affected by this commit. However, DeprecatedString::empty() is now considered equal to a null StringView.	2023-10-13 18:33:21 +03:30
Timothy Flynn	573cbb5ca0	LibJS+LibWeb+WebContent: Stop using ThrowableStringBuilder	2023-09-09 13:03:25 -04:00
Andreas Kling	b8f78c0adc	LibJS: Make JS::number_to_string() infallible Work towards #20449.	2023-08-09 17:09:16 +02:00
Andreas Kling	09547ec975	LibJS: Make PrimitiveString::deprecated_string() infallible Work towards #20449.	2023-08-09 17:09:16 +02:00
Andreas Kling	c084269e5f	LibJS: Make PrimitiveString::utf8_string() infallible Work towards #20449.	2023-08-09 17:09:16 +02:00
Andreas Kling	7849950383	LibJS: Make Utf16String & related APIs infallible Work towards #20449.	2023-08-09 17:09:16 +02:00
Andreas Kling	9708b86d65	LibJS: Make PrimitiveString::resolve_rope_if_needed() infallible Work towards #20449.	2023-08-09 17:09:16 +02:00
Andreas Kling	1a27c525d5	LibJS: Make PrimitiveString::create() infallible Work towards #20449.	2023-08-09 17:09:16 +02:00
Andreas Kling	a3e4535f34	LibJS: Resolve rope strings directly to UTF-16 when preferable When someone calls PrimitiveString::utf16_string() on a rope string, we know for sure that the client wants a UTF-16 string and may not be interested in a UTF-8 version at all. To avoid round-tripping through UTF-8 in this scenario, callers can now inform resolve_rope_if_needed() about their preferred encoding, should rope resolution take place. The UTF-16 case is actually a lot simpler than the UTF-8 case, since we can simply ask for UTF-16 data for each fiber of the rope, and then concatenate all the fibers. Since LibJS always uses UTF-16 for regular expression matching, this avoids round-tripping through UTF-8 whenever the input to a regex test is already UTF-16. :^)	2023-07-13 20:53:54 +02:00
Hendiadyoin1	9300b9a364	LibJS: Don't lie about m_deprecated_string being a StringView	2023-06-13 01:49:02 +02:00
Matthew Olsson	82eeee2008	LibJS+LibWeb: Normalize calls to Base::visit_edges in GC objects	2023-04-30 06:04:33 +02:00
Timothy Flynn	0d0b87fd46	LibJS: Add a PrimitiveString::create overload for FlyString This is to disambiguate this type from the StringView overload.	2023-03-18 19:50:45 +01:00
Timothy Flynn	36d72a7f4c	LibJS: Convert CanonicalNumericIndexString to use NumberToString	2023-02-16 14:32:22 +01:00
Timothy Flynn	c3abb1396c	LibJS+LibWeb: Convert string view PrimitiveString instances to String First, this adds an overload of PrimitiveString::create for StringView. This overload will throw an OOM completion if creating a String fails. This is not only a bit more convenient, but it also ensures at compile time that all PrimitiveString::create(string_view) invocations will be handled as String and OOM-aware. Next, this wraps all invocations to PrimitiveString::create(string_view) with MUST_OR_THROW_OOM. A small PrimitiveString::create(DeprecatedFlyString) overload also had to be added to disambiguate between the StringView and DeprecatedString overloads.	2023-02-09 17:13:33 +00:00
Timothy Flynn	4235c59397	LibJS: Add a convenience StringView accessor to PrimitiveString	2023-01-16 10:12:37 +00:00
Timothy Flynn	46dd8c1c0b	LibJS: Resolve all UTF-8 rope strings as a String	2023-01-15 01:00:20 +00:00
Timothy Flynn	8f5bdce8e7	LibJS: Add initial support for creating PrimitiveStrings with AK::String This will temporarily bloat the size of PrimitiveString as LibJS is transitioned to use String throughout, but will make doing so piecemeal much easier.	2023-01-15 01:00:20 +00:00
Timothy Flynn	4eb5eb2080	LibJS: Rename Utf16String::to_utf8 to to_deprecated_string	2023-01-15 01:00:20 +00:00
Timothy Flynn	ca655f5e7d	LibJS: Rename VM::string_cache to deprecated_string_cache And rename the member variable from m_string_cache to m_deprecated_string_cache to match.	2023-01-15 01:00:20 +00:00
Timothy Flynn	3a004e8f1a	LibJS: Rename PrimitiveString::has_utf8_string to has_deprecated_string And rename the member variable from m_utf8_string to m_deprecated_string to match.	2023-01-15 01:00:20 +00:00
Timothy Flynn	a59ebdac2d	LibJS+Everywhere: Return strings by value from PrimitiveString It turns out returning a ThrowCompletionOr<T const&> is flawed, as the GCC expansion trick used with TRY will always make a copy. PrimitiveString is luckily the only such use case.	2023-01-13 18:50:47 -05:00
Timothy Flynn	6e1a239a62	LibJS: Use fallible methods to handle OOM when resolving rope strings	2023-01-08 12:13:15 +01:00
Timothy Flynn	115baa7e32	LibJS+Everywhere: Make PrimitiveString and Utf16String fallible This makes construction of Utf16String fallible in OOM conditions. The immediate impact is that PrimitiveString must then be fallible as well, as it may either transcode UTF-8 to UTF-16, or create a UTF-16 string from ropes. There are a couple of places where it is very non-trivial to propagate the error further. A FIXME has been added to those locations.	2023-01-08 12:13:15 +01:00
Timothy Flynn	d793262beb	AK+Everywhere: Make UTF-16 to UTF-8 converter fallible This could fail to allocate the underlying storage needed to store the UTF-8 data. Propagate this error.	2023-01-08 12:13:15 +01:00
Timothy Flynn	425c168ded	AK+LibJS+LibRegex: Define an alias for UTF-16 string data storage Instead of writing out "Vector<u16, 1>" everywhere, let's have a name for it.	2023-01-08 12:13:15 +01:00
Linus Groh	22089436ed	LibJS: Convert Heap::allocate{,_without_realm}() to NonnullGCPtr	2022-12-15 06:56:37 -05:00
Linus Groh	525f22d018	LibJS: Replace standalone js_string() with PrimitiveString::create() Note that js_rope_string() has been folded into this, the old name was misleading - it would not always create a rope string, only if both sides are not empty strings. Use a three-argument create() overload instead.	2022-12-07 16:43:06 +00:00
Linus Groh	57dc179b1f	Everywhere: Rename to_{string => deprecated_string}() where applicable This will make it easier to support both string types at the same time while we convert code, and tracking down remaining uses. One big exception is Value::to_string() in LibJS, where the name is dictated by the ToString AO.	2022-12-06 08:54:33 +01:00
Linus Groh	6e19ab2bbc	AK+Everywhere: Rename String to DeprecatedString We have a new, improved string type coming up in AK (OOM aware, no null state), and while it's going to use UTF-8, the name UTF8String is a mouthful - so let's free up the String name by renaming the existing class. Making the old one have an annoying name will hopefully also help with quick adoption :^)	2022-12-06 08:54:33 +01:00
Andreas Kling	71067cbc6c	LibJS+LibWeb: Make Runtime/AbstractOperations.h not include AST.h This led to considerable fallout and many files had to be patched with now-missing include statements.	2022-11-23 16:05:59 +00:00
Linus Groh	56b2ae5ac0	LibJS: Replace GlobalObject with VM in remaining AOs [Part 19/19]	2022-08-23 13:58:30 +01:00
Linus Groh	e992a9f469	LibJS+LibWeb: Replace GlobalObject with Realm in Heap::allocate<T>() This is a continuation of the previous three commits. Now that create() receives the allocating realm, we can simply forward that to allocate(), which accounts for the majority of these changes. Additionally, we can get rid of the realm_from_global_object() in one place, with one more remaining in VM::throw_completion().	2022-08-23 13:58:30 +01:00
Linus Groh	12edbb51bc	LibJS: Rename PrimitiveString::m_{left,right} to m_{lhs,rhs} The LHS/RHS naming is already widely used as parameter names and local variables with the same meaning, so let's also use them for the members.	2022-08-06 12:02:48 +02:00
Andreas Kling	64b29eb459	LibJS: Implement string concatenation using ropes Instead of concatenating string data every time you add two strings together in JavaScript, we now create a new PrimitiveString that points to the two concatenated strings instead. This turns concatenated strings into a tree structure that doesn't have to be serialized until someone wants the characters in the string. This dramatically reduces the peak memory footprint when running the SunSpider benchmark (from ~6G to ~1G on my machine). It's also significantly faster (1.39x) :^)	2022-08-06 00:29:15 +02:00
Andreas Kling	f4c68eb0a4	LibJS: Add PrimitiveString::is_empty() and use it If we're only interested in whether the string is empty, we can skip the conversion from UTF-16 to UTF-8.	2022-07-19 12:45:50 +02:00
davidot	da374a82bc	LibJS: Correct an include in PrimitiveString	2022-02-15 00:51:25 +00:00
Anonymous	745b998774	LibJS: Get rid of unnecessary work from canonical_numeric_index_string The spec version of canonical_numeric_index_string is absurdly complex, and ends up converting from a string to a number, and then back again, which is both slow and also requires a few allocations and a string compare. Instead this patch moves away from using Values to represent a canonical index. In most cases all we need to know is whether a PropertyKey is an integer between 0 and 2^32-2, which we already compute when we construct a PropertyKey so the existing is_number() check is sufficient. The more expensive case is handling strings containing numbers that don't roundtrip through string conversion. In most cases these turn into regular string properties, but for TypedArray access these property names are not treated as normal named properties. TypedArrays treat these numeric properties as magic indexes that are ignored on read and are not stored (but are evaluated) on assignment. For that reason there's now a mode flag on canonical_numeric_index_string so that only TypedArrays take the cost of the ToString round trip test. In order to improve the performance of this path this patch includes some early returns to avoid conversion in cases where we can quickly know whether a property can round trip.	2022-02-14 21:06:49 +00:00
Andreas Kling	4b412e8fee	Revert "LibJS: Get rid of unnecessary work from canonical_numeric_index_string" This reverts commit `3a184f7841`. This broke a number of test262 tests under "TypedArrayConstructors". The issue is that the CanonicalNumericIndexString AO should not fail for inputs like "1.1", despite them not being integral indices.	2022-02-13 16:01:32 +01:00
Anonymous	d1cc67bbe1	LibJS: Avoid unnecessary ToObject conversion when resolving references When performing GetValue on a primitive type we do not need to perform the ToObject conversion as it will resolve to a property on the prototype object. To avoid this we skip the initial ToObject conversion on the base value as it only serves to get the primitive's boxed prototype. We further specialize on PrimitiveString in order to get efficient behaviour for the direct properties. Depending on the tests anywhere from 20 to 60%, with significant loop overhead.	2022-02-13 14:44:36 +01:00
Andreas Kling	f290c59dd8	LibJS: Keep track of PrimitiveStrings and share them VM now has a string cache which tracks all live PrimitiveStrings and reuses an existing one if possible. This drastically reduces the number of GC-allocated strings in many real-world situations.	2021-10-02 16:39:28 +02:00
Timothy Flynn	c1e99fca1a	LibJS: Replace Vector<u16> usage in PrimitiveString with Utf16String This commit does not go out of its way to reduce copying of the string data yet, but is a minimum set of changes to compile LibJS after making PrimitiveString hold a Utf16String.	2021-08-10 23:07:50 +02:00
Timothy Flynn	b6ff7f4fcc	LibJS: Allow PrimitiveString to be created with a UTF-16 string PrimitiveString may currently only be created with a UTF-8 string, and it transcodes on the fly when a UTF-16 string is needed. Allow creating a PrimitiveString from a UTF-16 string to avoid unnecessary transcoding when the caller only wants UTF-16.	2021-08-04 11:18:24 +02:00
Timothy Flynn	4c2cc419f9	LibJS: Decode UTF-16 surrogate pairs during string literal construction Rather than deferring this decoding to PrimitiveString, we can decode surrogate pairs when parsing the string. This prevents a string copy when constructing the PrimitiveString.	2021-08-04 11:18:24 +02:00
Timothy Flynn	0c42aece36	LibJS: Transcode UTF-8 strings to UTF-16 and add UTF-16 accessors LibJS parses JavaScript as UTF-8, so when creating a string, we must transcode it to UTF-16 to handle encoded surrogate pairs. For example, consider the following string: "\ud83d\ude00" The UTF-8 encoding of this surrogate pair is: 0xf0 0x9f 0x98 0x80 However, LibJS will currently store the two surrogates individually as UTF-8 encoded bytes, rather than combining the pair: 0xed 0xa0 0xbd, 0xed 0xb8 0x80 These are not equivalent. So, as String.prototype becomes UTF-16 aware, this encoding will no longer work for abstractions like strict equality.	2021-07-22 09:10:44 +02:00
Brian Gianforcaro	1682f0b760	Everything: Move to SPDX license identifiers in all files. SPDX License Identifiers are a more compact / standardized way of representing file license information. See: https://spdx.dev/resources/use/#identifiers This was done with the `ambr` search and replace tool. ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *	2021-04-22 11:22:27 +02:00
Andreas Kling	13d7c09125	Libraries: Move to Userland/Libraries/	2021-01-12 12:17:46 +01:00

50 commits
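
Commit 0c42aece36 contrasts two byte sequences for the string "\ud83d\ude00": the pair combined into one code point, versus each surrogate encoded on its own. The sketch below reproduces that arithmetic in Python, not LibJS code. The helper names combine_surrogates and encode_utf8_lenient are hypothetical, and Python's "surrogatepass" error handler is assumed here as a stand-in for an encoder that does not reject lone surrogates.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def encode_utf8_lenient(code_point: int) -> bytes:
    """Encode one code point as UTF-8 without rejecting lone surrogates.

    Hypothetical helper: "surrogatepass" is Python's way of allowing this;
    it is not how LibJS implements its encoder.
    """
    return chr(code_point).encode("utf-8", errors="surrogatepass")


HIGH, LOW = 0xD83D, 0xDE00

# Each surrogate encoded individually (the behaviour the commit describes as wrong).
separate = encode_utf8_lenient(HIGH) + encode_utf8_lenient(LOW)

# The pair combined into U+1F600 first, then encoded.
combined = encode_utf8_lenient(combine_surrogates(HIGH, LOW))

print(separate.hex(" "))  # ed a0 bd ed b8 80
print(combined.hex(" "))  # f0 9f 98 80
```

The two results differ in both length and content, which is why a byte-wise comparison of the two representations fails even though they denote the same JavaScript string.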