Commit graph

11 commits

Author SHA1 Message Date
davidot
da374a82bc LibJS: Correct an include in PrimitiveString 2022-02-15 00:51:25 +00:00
Anonymous
745b998774 LibJS: Get rid of unnecessary work from canonical_numeric_index_string
The spec version of canonical_numeric_index_string is absurdly complex,
and ends up converting from a string to a number, and then back again
which is both slow and also requires a few allocations and a string
compare.

Instead this patch moves away from using Values to represent canonical
a canonical index. In most cases all we need to know is whether a
PropertyKey is an integer between 0 and 2^^32-2, which we already
compute when we construct a PropertyKey so the existing is_number()
check is sufficient.

The more expensive case is handling strings containing numbers that
don't roundtrip through string conversion. In most cases these turn
into regular string properties, but for TypedArray access these
property names are not treated as normal named properties.
TypedArrays treat these numeric properties as magic indexes that are
ignored on read and are not stored (but are evaluated) on assignment.

For that reason there's now a mode flag on canonical_numeric_index_string
so that only TypedArrays take the cost of the ToString round trip test.
In order to improve the performance of this path this patch includes
some early returns to avoid conversion in cases where we can quickly
know whether a property can round trip.
2022-02-14 21:06:49 +00:00
Andreas Kling
4b412e8fee Revert "LibJS: Get rid of unnecessary work from canonical_numeric_index_string"
This reverts commit 3a184f7841.

This broke a number of test262 tests under "TypedArrayConstructors".
The issue is that the CanonicalNumericIndexString AO should not fail
for inputs like "1.1", despite them not being integral indices.
2022-02-13 16:01:32 +01:00
Anonymous
d1cc67bbe1 LibJS: Avoid unnecessary ToObject conversion when resolving references
When performing GetValue on a primitive type we do not need to perform
the ToObject conversion as it will resolve to a property on the
prototype object.

To avoid this we skip the initial ToObject conversion on the base value
as it only serves to get the primitive's boxed prototype. We further
specialize on PrimitiveString in order to get efficient behaviour
behaviour for the direct properties.

Depending on the tests anywhere from 20 to 60%, with significant loop
overhead.
2022-02-13 14:44:36 +01:00
Andreas Kling
f290c59dd8 LibJS: Keep track of PrimitiveStrings and share them
VM now has a string cache which tracks all live PrimitiveStrings and
reuses an existing one if possible. This drastically reduces the number
of GC-allocated strings in many real-word situations.
2021-10-02 16:39:28 +02:00
Timothy Flynn
c1e99fca1a LibJS: Replace Vector<u16> usage in PrimitiveString wth Utf16String
This commit does not go out of its way to reduce copying of the string
data yet, but is a minimum set of changes to compile LibJS after making
PrimitiveString hold a Utf16String.
2021-08-10 23:07:50 +02:00
Timothy Flynn
b6ff7f4fcc LibJS: Allow PrimitiveString to be created with a UTF-16 string
PrimitiveString may currently only be created with a UTF-8 string, and
it transcodes on the fly when a UTF-16 string is needed. Allow creating
a PrimitiveString from a UTF-16 string to avoid unnecessary transcoding
when the caller only wants UTF-16.
2021-08-04 11:18:24 +02:00
Timothy Flynn
4c2cc419f9 LibJS: Decode UTF-16 surrogate pairs during string literal construction
Rather than deferring this decoding to PrimitiveString, we can decode
surrogate pairs when parsing the string. This prevents a string copy
when constructing the PrimitiveString.
2021-08-04 11:18:24 +02:00
Timothy Flynn
0c42aece36 LibJS: Transcode UTF-8 strings to UTF-16 and add UTF-16 accessors
LibJS parses JavaScript as UTF-8, so when creating a string, we must
transcode it to UTF-16 to handle encoded surrogate pairs.

For example, consider the following string:
    "\ud83d\ude00"

The UTF-8 encoding of this surrogate pair is:
    0xf0 0x9f 0x98 0x80

However, LibJS will currently store the two surrogates individually as
UTF-8 encoded bytes, rather than combining the pair:
    0xed 0xa0 0xb8, 0xed 0xb8 0x80

These are not equivalent. So, as String.prototype becomes UTF-16 aware,
this encoding will no longer work for abstractions like strict equality.
2021-07-22 09:10:44 +02:00
Brian Gianforcaro
1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

 ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00
Andreas Kling
13d7c09125 Libraries: Move to Userland/Libraries/ 2021-01-12 12:17:46 +01:00
Renamed from Libraries/LibJS/Runtime/PrimitiveString.cpp (Browse further)