This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.
One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
We were mistakenly trying to append UTF-16 code units to a StringBuilder
via the append(char) API. This patch fixes that by accumulating the
result in a Vector<u16> instead.
This'll be a bit worse for performance, since we're now doing additional
UTF-16 string conversions, but we're going for correctness at this stage
and can worry about performance later.
This is only visible with something like `Object.getOwnPropertyNames` on
the global object. All other declaration instantiations put the
functions on an environment making the order invisible.
Note that spec order is not quite tree order as in non-strict mode
functions which get hoisted out of blocks appear before top level
functions.
Co-authored-by: Hendiadyoin1 <leon.a@serenityos.org>
Intrinsics, i.e. mostly constructor and prototype objects, but also
things like empty and new object shape now live on a new heap-allocated
JS::Intrinsics object, thus completing the long journey of taking all
the magic away from the global object.
This represents the Realm's [[Intrinsics]] slot in the spec and matches
its existing [[GlobalObject]] / [[GlobalEnv]] slots in terms of
architecture.
In the majority of cases it should now be possibly to fully allocate a
regular object without the global object existing, and in fact that's
what we do now - the realm is allocated before the global object, and
the intrinsics between both :^)
This is needed so that the allocated NativeFunction receives the correct
realm, usually forwarded from the Object's initialize() function, rather
than using the current realm.
This is a continuation of the previous five commits.
A first big step into the direction of no longer having to pass a realm
(or currently, a global object) trough layers upon layers of AOs!
Unlike the create() APIs we can safely assume that this is only ever
called when a running execution context and therefore current realm
exists. If not, you can always manually allocate the Error and put it in
a Completion :^)
In the spec, throw exceptions implicitly use the current realm's
intrinsics as well: https://tc39.es/ecma262/#sec-throw-an-exception
This is a continuation of the previous four commits.
Passing a global object here is largely redundant, we definitely need
the interpreter but can get the VM and (later) current active realm from
there - and also the global object while we still need it, although I'd
like to remove Interpreter::global_object() in the future.
This now matches the bytecode interpreter's execute_impl() functions.
This is a continuation of the previous three commits.
Now that create() receives the allocating realm, we can simply forward
that to allocate(), which accounts for the majority of these changes.
Additionally, we can get rid of the realm_from_global_object() in one
place, with one more remaining in VM::throw_completion().
This is a continuation of the previous two commits.
As allocating a JS cell already primarily involves a realm instead of a
global object, and we'll need to pass one to the allocate() function
itself eventually (it's bridged via the global object right now), the
create() functions need to receive a realm as well.
The plan is for this to be the highest-level function that actually
receives a realm and passes it around, AOs on an even higher level will
use the "current realm" concept via VM::current_realm() as that's what
the spec assumes; passing around realms (or global objects, for that
matter) on higher AO levels is pointless and unlike for allocating
individual objects, which may happen outside of regular JS execution, we
don't need control over the specific realm that is being used there.
No functional changes - we can still very easily get to the global
object via `Realm::global_object()`. This is in preparation of moving
the intrinsics to the realm and no longer having to pass a global
object when allocating any object.
In a few (now, and many more in subsequent commits) places we get a
realm using `GlobalObject::associated_realm()`, this is intended to be
temporary. For example, create() functions will later receive the same
treatment and are passed a realm instead of a global object.
While adding spec comments to PerformEval, I noticed we were missing
multiple steps.
Namely, these were:
- Checking if the host will allow us to compile the string
(allowing LibWeb to perform CSP for eval)
- The parser's initial state depending on the environment around us
on direct eval:
- Allowing new.target via eval in functions
- Allowing super calls and super properties via eval in classes
- Disallowing the use of the arguments object in class field
initializers at eval's parse time
- Setting ScriptOrModule of eval's execution context
The spec allows us to apply the additional parsing steps in any order.
The method I have gone with is passing in a struct to the parser's
constructor, which overrides the parser's initial state to (dis)allow
the things stated above from the get-go.
The spec version of canonical_numeric_index_string is absurdly complex,
and ends up converting from a string to a number, and then back again
which is both slow and also requires a few allocations and a string
compare.
Instead this patch moves away from using Values to represent canonical
a canonical index. In most cases all we need to know is whether a
PropertyKey is an integer between 0 and 2^^32-2, which we already
compute when we construct a PropertyKey so the existing is_number()
check is sufficient.
The more expensive case is handling strings containing numbers that
don't roundtrip through string conversion. In most cases these turn
into regular string properties, but for TypedArray access these
property names are not treated as normal named properties.
TypedArrays treat these numeric properties as magic indexes that are
ignored on read and are not stored (but are evaluated) on assignment.
For that reason there's now a mode flag on canonical_numeric_index_string
so that only TypedArrays take the cost of the ToString round trip test.
In order to improve the performance of this path this patch includes
some early returns to avoid conversion in cases where we can quickly
know whether a property can round trip.
This reverts commit 3a184f7841.
This broke a number of test262 tests under "TypedArrayConstructors".
The issue is that the CanonicalNumericIndexString AO should not fail
for inputs like "1.1", despite them not being integral indices.
Instead of crashing on the spot, return a descriptive error that will
eventually continue its days as a javascript "InternalError" exception.
This should make random crashes with BC less likely.
The spec version of canonical_numeric_index_string is absurdly complex,
and ends up converting from a string to a number, and then back again
which is both slow and also requires a few allocations and a string
compare.
Instead lets use the logic we already have as that is much more
efficient.
This improves performance of all non-numeric property names.