Three standalone Cell creation functions remain in the JS namespace:
- js_bigint()
- js_string()
- js_symbol()
All of them are leftovers from early iterations when LibJS still took
inspiration from JSC, which itself has jsString(). Nowadays, we pretty
much exclusively use static create() functions to construct types
allocated on the JS heap, and there's no reason to not do the same for
these.
Also change the return type from BigInt* to NonnullGCPtr<BigInt> while
we're here.
This is patch 1/3, replacement of js_string() and js_symbol() follow.
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.
One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
Rename it to match the name used by the spec.
Add an override mode to skip formatting numbers with an exponential sign
(e.g. 1e23). This mode is needed by Number and Intl.NumberFormat, who
don't call out a specific number-to-string method to use (they just say
to make "the String consisting of the digits of n").
Intrinsics, i.e. mostly constructor and prototype objects, but also
things like empty and new object shape now live on a new heap-allocated
JS::Intrinsics object, thus completing the long journey of taking all
the magic away from the global object.
This represents the Realm's [[Intrinsics]] slot in the spec and matches
its existing [[GlobalObject]] / [[GlobalEnv]] slots in terms of
architecture.
In the majority of cases it should now be possibly to fully allocate a
regular object without the global object existing, and in fact that's
what we do now - the realm is allocated before the global object, and
the intrinsics between both :^)
Although this is much more complicated it does not seem to impact
performance that much even when looking only at values in which the
previous casting to i32 was correct.
For the fuzzer build isnan was not usable in a constexpr context however
__builtin_isnan seems to always be.
Also while we're here add my name to the copyright since I forgot after
the Value rewrite.
This is a continuation of the previous five commits.
A first big step into the direction of no longer having to pass a realm
(or currently, a global object) trough layers upon layers of AOs!
Unlike the create() APIs we can safely assume that this is only ever
called when a running execution context and therefore current realm
exists. If not, you can always manually allocate the Error and put it in
a Completion :^)
In the spec, throw exceptions implicitly use the current realm's
intrinsics as well: https://tc39.es/ecma262/#sec-throw-an-exception
This is a continuation of the previous two commits.
As allocating a JS cell already primarily involves a realm instead of a
global object, and we'll need to pass one to the allocate() function
itself eventually (it's bridged via the global object right now), the
create() functions need to receive a realm as well.
The plan is for this to be the highest-level function that actually
receives a realm and passes it around, AOs on an even higher level will
use the "current realm" concept via VM::current_realm() as that's what
the spec assumes; passing around realms (or global objects, for that
matter) on higher AO levels is pointless and unlike for allocating
individual objects, which may happen outside of regular JS execution, we
don't need control over the specific realm that is being used there.
We use strtod to convert a string to number after checking whether the
string is [+-]Infinity, however strtod also checks for either 'inf' or
'infinity' in a case-insensitive.
There are still valid cases for strtod to return infinity like 10e100000
so we just check if the "number" contains 'i' or 'I' in which case
the strtod infinity is not valid.
Using the fact that there are 2^52-2 NaN representations we can
"NaN-box" all the Values possible. This means that Value no longer has
an explicit "Type" but that information is now stored in the bits of a
double. This is done by "tagging" the top two bytes of the double.
For a full explanation see the large comment with asserts at the top of
Value.
We can also use the exact representation of the tags to make checking
properties like nullish, or is_cell quicker. But the largest gains are
in the fact that the size of a Value is now halved.
The SunSpider and other benchmarks have been ran to confirm that there
are no regressions in performance compared to the previous
implementation. The tests never performed worse and in some cases
performed better. But the biggest differences can be seen in memory
usage when large arrays are allocated. A simple test which allocates a
1000 arrays of size 100000 has roughly half the memory usage.
There is also space in the representations for future expansions such as
tuples and records.
To ensure that Values on the stack and registers are not lost during
garbage collection we also have to add a check to the Heap to check for
any of the cell tags and extracting the canonical form of the pointer
if it matches.
Instead of concatenating string data every time you add two strings
together in JavaScript, we now create a new PrimitiveString that points
to the two concatenated strings instead.
This turns concatenated strings into a tree structure that doesn't have
to be serialized until someone wants the characters in the string.
This *dramatically* reduces the peak memory footprint when running
the SunSpider benchmark (from ~6G to ~1G on my machine). It's also
significantly faster (1.39x) :^)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).
No functional changes.
This change updates the parameter order of the is_less_than function
signature and calls to match accordingly with the spec
(https://tc39.es/ecma262/#sec-islessthan)
In many cases we already know a certain value is a number, or don't have
JS values at all and would need to wrap doubles in a value. To optimize
these cases and avoid having to pass a global object into functions that
won't ever allocate or throw, add a standalone implementation of this
function that takes and returns doubles directly.
The JS behaviour of exponentiation on two number typed values is
not a simple matter of forwarding to ::pow(double, double). So,
this factors out the Math.pow logic to allow it to be shared with
Value::exp.
The ECMA verbiage for modulus is the mathematical definition implemented
by fmod, so let's just use that rather than trying to reimplement all
the edge cases.
The spec defines a StringToBigInt AO which allows for converting binary,
octal, decimal, and hexadecimal strings to a BigInt. Our conversion was
only allowing for decimal strings.
Bitwise operators are defined on two's complement, but SignedBitInteger
uses sign-magnitude. Correctly convert between the two.
Let LibJS delegate to SignedBitInteger for bitwise_not, like it does
for all other bitwise_ operations on bigints.
No behavior change (LibJS is now the only client of
SignedBitInteger::bitwise_not()).
Performance of string concatenation regressed in a57e2f9. That commit
iterates over the LHS string to find the last code unit, to check if it
is a high surrogate. Instead, first look at the 3rd-to-last byte in the
UTF-8 encoded string to check if it is a 3-byte code point; then decode
just those bytes to check if we have a high surrogate. Similarly, check
the first 3 bytes of the RHS string to check if we have a low surrogate.
In the following use case:
"\ud834" + "\udf06"
We were previously combining these as two individual code points. When
concatenating strings, we must take care to combine the high surrogate
from the left-hand side with the low surrogate from the right-hand side.
If the Value is a non-negative Int32, create a numeric PropertyKey
instead of making a string key.
This makes "ai-astar" test from the Kraken benchmark run in 30 seconds,
down from 42 seconds. :^)
Instead of returning JS::StringOrSymbol, which is a space-optimized type
used in Shape property tables, this now returns JS::PropertyKey which is
*not* space-optimized, but has other niceties like optimized storage of
numeric ("indexed") properties.