Commit graph

18 commits

Author SHA1 Message Date
Linus Groh
57dc179b1f Everywhere: Rename to_{string => deprecated_string}() where applicable
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.

One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
2022-12-06 08:54:33 +01:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
Andreas Kling
35c9aa7c05 LibJS: Hide all the constructors!
Now that the GC allocator is able to invoke Cell subclass constructors
directly via friendship, we no longer need to keep them public. :^)
2022-08-29 03:24:54 +02:00
Andreas Kling
6e973ce69b LibJS: Add JS_CELL macro and use it in all JS::Cell subclasses
This is similar to what we already had with JS_OBJECT (and also
JS_ENVIRONMENT) but sits at the top of the Cell inheritance hierarchy.
2022-08-29 03:24:54 +02:00
Linus Groh
56b2ae5ac0 LibJS: Replace GlobalObject with VM in remaining AOs [Part 19/19] 2022-08-23 13:58:30 +01:00
Linus Groh
12edbb51bc LibJS: Rename PrimitiveString::m_{left,right} to m_{lhs,rhs}
The LHS/RHS naming is already widely used as parameter names and local
variables with the same meaning, so let's also use them for the members.
2022-08-06 12:02:48 +02:00
Andreas Kling
64b29eb459 LibJS: Implement string concatenation using ropes
Instead of concatenating string data every time you add two strings
together in JavaScript, we now create a new PrimitiveString that points
to the two concatenated strings instead.

This turns concatenated strings into a tree structure that doesn't have
to be serialized until someone wants the characters in the string.

This *dramatically* reduces the peak memory footprint when running
the SunSpider benchmark (from ~6G to ~1G on my machine). It's also
significantly faster (1.39x) :^)
2022-08-06 00:29:15 +02:00
Andreas Kling
f4c68eb0a4 LibJS: Add PrimitiveString::is_empty() and use it
If we're only interested in whether the string is empty, we can skip the
conversion from UTF-16 to UTF-8.
2022-07-19 12:45:50 +02:00
Lenny Maiorani
a0367aa43b DevTools+LibJS+LibWeb: Change class_name to use StringView
This helps make the overall codebase consistent. `class_name()` in
`Kernel` is always `StringView`, but not elsewhere.

Additionally, this results in the `strlen` (which needs to be done
when printing or other operations) always being computed at
compile-time.
2022-03-19 00:20:46 +00:00
Andreas Kling
88e7d44cc4 LibJS+LibLine: Run clang-format 2022-02-13 14:55:23 +01:00
Anonymous
d1cc67bbe1 LibJS: Avoid unnecessary ToObject conversion when resolving references
When performing GetValue on a primitive type we do not need to perform
the ToObject conversion as it will resolve to a property on the
prototype object.

To avoid this we skip the initial ToObject conversion on the base value
as it only serves to get the primitive's boxed prototype. We further
specialize on PrimitiveString in order to get efficient behaviour
behaviour for the direct properties.

Depending on the tests anywhere from 20 to 60%, with significant loop
overhead.
2022-02-13 14:44:36 +01:00
Timothy Flynn
b85b8ca350 LibJS: Reduce UTF-8 to UTF-16 transcoding when only UTF-16 is wanted
When appending two strings together to form a new string, if both of the
strings are already UTF-16, create the new string as UTF-16 as well.

This shaves about 0.5 seconds off the following test262 test:
  RegExp/property-escapes/generated/General_Category_-_Decimal_Number.js
2021-08-10 23:07:50 +02:00
Timothy Flynn
c1e99fca1a LibJS: Replace Vector<u16> usage in PrimitiveString wth Utf16String
This commit does not go out of its way to reduce copying of the string
data yet, but is a minimum set of changes to compile LibJS after making
PrimitiveString hold a Utf16String.
2021-08-10 23:07:50 +02:00
Timothy Flynn
b6ff7f4fcc LibJS: Allow PrimitiveString to be created with a UTF-16 string
PrimitiveString may currently only be created with a UTF-8 string, and
it transcodes on the fly when a UTF-16 string is needed. Allow creating
a PrimitiveString from a UTF-16 string to avoid unnecessary transcoding
when the caller only wants UTF-16.
2021-08-04 11:18:24 +02:00
Timothy Flynn
0c42aece36 LibJS: Transcode UTF-8 strings to UTF-16 and add UTF-16 accessors
LibJS parses JavaScript as UTF-8, so when creating a string, we must
transcode it to UTF-16 to handle encoded surrogate pairs.

For example, consider the following string:
    "\ud83d\ude00"

The UTF-8 encoding of this surrogate pair is:
    0xf0 0x9f 0x98 0x80

However, LibJS will currently store the two surrogates individually as
UTF-8 encoded bytes, rather than combining the pair:
    0xed 0xa0 0xb8, 0xed 0xb8 0x80

These are not equivalent. So, as String.prototype becomes UTF-16 aware,
this encoding will no longer work for abstractions like strict equality.
2021-07-22 09:10:44 +02:00
Andreas Kling
6714cf3631 LibJS: Move Cell.{cpp,h} from Runtime/ to Heap/ 2021-05-17 19:53:00 +02:00
Brian Gianforcaro
1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

 ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00
Andreas Kling
13d7c09125 Libraries: Move to Userland/Libraries/ 2021-01-12 12:17:46 +01:00
Renamed from Libraries/LibJS/Runtime/PrimitiveString.h (Browse further)