We should pay more attention to using the well-defined abstract
operations from the spec rather than making up our own, often slightly
different rules. This is another step in that direction.
We have multiple array types now, so ArrayInvalidLength has been
replaced with a generic InvalidLength.
Also fixes a small issue in the Array constructor, it should throw
RangeError for invalid lengths, not TypeError.
- Calling without 'new' is an error
- If the first argument is an object, we need a separate code path to
initialize from TypedArray, ArrayBuffer, Iterable or Array-like
object (marked TODO for now)
- Don't insert values into array if more than one argument is present
(that's not part of the spec)
The current implementation is not entirely correct yet. Two classes have
been added:
- TypedArrayConstructor, which the various typed array constructors now
inherit from. Calling or constructing this class (from JS, that is)
directly is not possible, we might want to move this abstract class
functionality to NativeFunction at a later point.
- TypedArrayPrototype, which the various typed array prototypes now have
as their own prototype. This will be the place where most of the
functionality is being shared.
Relevant parts from the spec:
22.2.1 The %TypedArray% Intrinsic Object
The %TypedArray% intrinsic object:
- is a constructor function object that all of the TypedArray
constructor objects inherit from.
- along with its corresponding prototype object, provides common
properties that are inherited by all TypedArray constructors and their
instances.
22.2.2 Properties of the %TypedArray% Intrinsic Object
The %TypedArray% intrinsic object:
- has a [[Prototype]] internal slot whose value is %Function.prototype%.
22.2.2.3 %TypedArray%.prototype
The initial value of %TypedArray%.prototype is the %TypedArray%
prototype object.
22.2.6 Properties of the TypedArray Constructors
Each TypedArray constructor:
- has a [[Prototype]] internal slot whose value is %TypedArray%.
22.2.6.2 TypedArray.prototype
The initial value of TypedArray.prototype is the corresponding
TypedArray prototype intrinsic object (22.2.7).
22.2.7 Properties of the TypedArray Prototype Objects
Each TypedArray prototype object:
- has a [[Prototype]] internal slot whose value is %TypedArray.prototype%.
22.2.7.2 TypedArray.prototype.constructor
The initial value of a TypedArray.prototype.constructor is the
corresponding %TypedArray% intrinsic object.
This patch adds six of the standard type arrays and tries to share as
much code as possible:
- Uint8Array
- Uint16Array
- Uint32Array
- Int8Array
- Int16Array
- Int32Array
Proxy is an "exotic object" and doesn't have its own prototype. Use the
regular object prototype instead, but most stuff is happening on the
target object anyway. :^)
Instead of hiding JS exceptions raised on the web, we now print them to
the debug log. This will make it a bit easier to work out why some web
pages aren't working right. :^)
with statements evaluate an expression and put the result of it at the
"front" of the scope chain. This is implemented by creating a WithScope
object and placing it in front of the VM's current call frame's scope.
Both GlobalObject and LexicalEnvironment now inherit from ScopeObject,
and the VM's call frames point to a ScopeObject chain rather than just
a LexicalEnvironment chain.
This gives us much more flexibility to implement things like "with",
and also unifies some of the code paths that previously required
special handling of the global object.
There's a bunch of more cleanup that can be done in the wake of this
change, and there might be some oversights in the handling of the
"super" keyword, but this generally seems like a good architectural
improvement. :^)
Taking non-const cell pointers is asking for trouble, since passing e.g
a "const Object*" to Value(Object*) will actually call Value(bool),
which is most likely not what you want.
It would be nice to be able to cache some shapes globally in the VM,
but then they can't be tied to a specific global object. So let's just
get rid of the requirement that shapes are tied to a global object.
This should be using the individual flag boolean properties rather than
the [[OriginalFlags]] internal slot.
Use an enumerator macro here for brevity, this will be useful for other
things as well. :^)
- Default values should depend on arguments being undefined, not being
missing
- "(?:)" for empty pattern happens in RegExp.prototype.source, not the
constructor
This makes RegExpObject compile and store a Regex<ECMA262>, adds
all flag-related properties, and implements `RegExpPrototype.test()`
(complete with 'lastIndex' support) :^)
It should be noted that this only implements `test()' using the builtin
`exec()'.
If a receiver is given, e.g. via Reflect.get/set(), forward it to the
target object's get()/put() or use it as last argument of the trap
function. The default value is the Proxy object itself.
Passing in a plain Value and expecting it to be a native property is
error prone, let's use a more narrow type and pass a NativeProperty
reference directly.
We can't just to_string() the PropertyName, it might be a symbol.
Instead to_value() it and then use to_string_without_side_effects() as
usual.
Fixes#4062.
This prevents stack overflows when calling infinite/deep recursive
functions, e.g.:
const f = () => f(); f();
JSON.stringify({}, () => ({ foo: "bar" }));
new Proxy({}, { get: (_, __, p) => p.foo }).foo;
The VM caches a StackInfo object to not slow down function calls
considerably. VM::push_call_frame() will throw an exception if
necessary (plain Error with "RuntimeError" as its .name).
Keeping the VM call frames in a Vector could cause them to move around
underneath us due to Vector resizing. Avoid this issue by allocating
CallFrame objects on the stack and having the VM simply keep a list
of pointers to each CallFrame, instead of the CallFrames themselves.
Fixes#3830.
Fixes#3951.
As the global object is constructed and initialized in a different way
than most other objects we were not setting its prototype! This made
things like "globalThis.toString()" fail unexpectedly.
If value.to_string() throws an exception and returns a null string we
must create an invalid StringOrSymbol, not one from the null string
(which ASSERT()s).
Some things, like (the non-generic version of) Array.prototype.pop(),
check is_empty() to determine whether an action, like removing elements,
can be performed. We need to know the array-like size for that, not the
size of the underlying storage, which can be different - and is not
something IndexedProperties should expose so I removed its size().
Fixes#3948.
- We have to check if the property name is a string before calling
as_string() on it
- We can't as_number() the same property name but have to use the parsed
index number
Fixes#3950.
We can't assume that property names can be converted to strings anymore,
as we have symbols. Use name.to_value() instead.
This makes something like this possible:
new Proxy(Object, { get(t, p) { return t[p] } })[Symbol.hasInstance]
This was probably a result of search & replace, it's quite ridiculous in
some places. Let use the existing pattern of getting a reference to the
VM once at each function start consistently.
This fixes Array.prototype.{join,toString}() crashing with arrays
containing themselves, i.e. circular references.
The spec is suspiciously silent about this, and indeed engine262, a
"100% spec compliant" ECMA-262 implementation, can't handle these cases.
I had a look at some major engines instead and they all seem to keep
track or check for circular references and return an empty string for
already seen objects.
- SpiderMonkey: "AutoCycleDetector detector(cx, obj)"
- V8: "CycleProtectedArrayJoin<JSArray>(...)"
- JavaScriptCore: "StringRecursionChecker checker(globalObject, thisObject)"
- ChakraCore: "scriptContext->CheckObject(thisArg)"
To keep things simple & consistent this uses the same pattern as
JSONObject, MarkupGenerator and js: simply putting each seen object in a
HashTable<Object*>.
Fixes#3929.
This renames Object::to_primitive() to Object::ordinary_to_primitive()
for two reasons:
- No confusion with Value::to_primitive()
- To match the spec's name
Also change existing uses of Object::to_primitive() to
Value::to_primitive() when the spec uses the latter (which will still
call Object::ordinary_to_primitive()). Object::to_string() has been
removed as it's not needed anymore (and nothing the spec uses).
This makes it possible to overwrite an object's toString and valueOf and
have them provide results for anything that uses to_primitive() - e.g.:
const o = { toString: undefined, valueOf: () => 42 };
Number(o) // 42, previously NaN
["foo", o].toString(); // "foo,42", previously "foo,[object Object]"
++o // 43, previously NaN
etc.
This should not just inherit Object.prototype.toString() (and override
Object::to_string()) but be its own function, i.e.
'RegExp.prototype.toString !== Object.prototype.toString'.
When value.to_string() throws an exception it returns a null string in
which case we must not construct a valid PropertyName.
Also ASSERT in PropertyName(String) and PropertyName(FlyString) to
prevent this from happening in the future.
Fixes#3941.
We must *never* call some method that expects a non-empty value on the
result of a function call without checking for exceptions first. It
won't work reliably.
Fixes#3939.
Two issues:
- throw_exception() with ErrorType::InstanceOfOperatorBadPrototype would
receive rhs_prototype.to_string_without_side_effects(), which would
ASSERT_NOT_REACHED() as to_string_without_side_effects() must not be
called on an empty value. It should (and now does) receive the RHS
value instead as the message is "'prototype' property of {} is not an
object".
- Value::instance_of() was missing an exception check after calling
has_instance_method, to_boolean() on an empty value result would crash
as well.
Fixes#3930.
This adds a new MetaProperty AST node which will be used for
'new.target' and 'import.meta' meta properties. The parser now
distinguishes between "in function context" and "in arrow function
context" (which is required for this).
When encountering TokenType::New we will attempt to parse it as meta
property and resort to regular new expression parsing if that fails,
much like the parsing of labelled statements.
ES 5(.1) described parsing of the function body string as:
https://www.ecma-international.org/ecma-262/5.1/#sec-15.3.2.1
7. If P is not parsable as a FormalParameterList[opt] then throw a SyntaxError exception.
8. If body is not parsable as FunctionBody then throw a SyntaxError exception.
We implemented it as building the source string of a complete function
and feeding that to the parser, with the same outcome. ES 2015+ does
exactly that, but with newlines at certain positions:
https://tc39.es/ecma262/#sec-createdynamicfunction
16. Let bodyString be the string-concatenation of 0x000A (LINE FEED), ? ToString(bodyArg), and 0x000A (LINE FEED).
17. Let prefix be the prefix associated with kind in Table 49.
18. Let sourceString be the string-concatenation of prefix, " anonymous(", P, 0x000A (LINE FEED), ") {", bodyString, and "}".
This patch updates the generated source string to match these
requirements. This will make certain edge cases work, e.g.
'new Function("-->")', where the user supplied input must be placed on
its own line to be valid syntax.
A large number of JS strings are a single ASCII character. This patch
adds a 128-entry cache for those strings to the VM. The cost of the
cache is 1536 byte of GC heap (all in same block) + 2304 bytes malloc.
This avoids a lot of GC heap allocations, and packing all of these
in the same heap block is nice for fragmentation as well.
This provides a huge speed-up for objects with large numbers as property
keys in some situation. Previously we would simply iterate from 0-<max>
and check if there's a non-empty value at each index - now we're being
smarter and compute a list of non-empty indices upfront, by checking
each value in the packed elements vector and appending the sparse
elements hashmap keys (for GenericIndexedPropertyStorage).
Consider this example, an object with a single own property, which is a
number increasing by a factor of 10 each iteration:
for (let i = 0; i < 10; ++i) {
const o = {[10 ** i]: "foo"};
const start = Date.now();
Object.getOwnPropertyNames(o); // <-- IndexedPropertyIterator
const end = Date.now();
console.log(`${10 ** i} -> ${(end - start) / 1000}s`);
}
Before this change:
1 -> 0.0000s
10 -> 0.0000s
100 -> 0.0000s
1000 -> 0.0000s
10000 -> 0.0005s
100000 -> 0.0039s
1000000 -> 0.0295s
10000000 -> 0.2489s
100000000 -> 2.4758s
1000000000 -> 25.5669s
After this change:
1 -> 0.0000s
10 -> 0.0000s
100 -> 0.0000s
1000 -> 0.0000s
10000 -> 0.0000s
100000 -> 0.0000s
1000000 -> 0.0000s
10000000 -> 0.0000s
100000000 -> 0.0000s
1000000000 -> 0.0000s
Fixes#3805.
Instead of performing a prototype transition for every new object we
create via {}, prebake the object returned by Object::create_empty()
with a shape with ObjectPrototype as the prototype.
We also prebake the shape for the object assigned to the "prototype"
property of new ScriptFunction objects, since those are extremely
common and that code broke from this change anyway.
This avoid a large number of transitions and is a small speed-up on
test-js.
This is not actually necessary, since no GC allocations are made during
this process. If we ever make property tables into heap cells, we'd
have to rethink this.
This assumption only works for the m_packed_elements Vector where a
missing value at a certain index still returns an empty value, but not
for the m_sparse_elements HashMap, which is being used for indices
>= 200 - in that case the Optional<ValueAndAttributes> result will not
have a value.
This fixes a crash in the js REPL where printing an array with a hole at
any index >= 200 would crash.
When we're initializing objects, we're just adding a bunch of new
properties, without transition, and without overlap (we never add
the same property twice.)
Take advantage of this by skipping lookups entirely (no need to see
if we're overwriting an existing property) during initialization.
Another nice test-js speedup :^)
Roughly 7% of test-js runtime was spent creating FlyStrings from string
literals. This patch frontloads that work and caches all the commonly
used names in LibJS on a CommonPropertyNames struct that hangs off VM.
When changing the attributes of an existing property of an object with
unique shape we must not change the PropertyMetadata offset.
Doing so without resizing the underlying storage vector caused an OOB
write crash.
Fixes#3735.
Previously, when a loop detected an unwind of type ScopeType::Function
(which means a return statement was executed inside of the loop), it
would just return undefined. This set the VM's last_value to undefined,
when it should have been the returned value. This patch makes all loop
statements return the appropriate value in the above case.
While initialization common runtime objects like functions, prototypes,
etc, we don't really care about tracking transitions for each and every
property added to them.
This patch puts objects into a "disable transitions" mode while we call
initialize() on them. After that, adding more properties will cause new
transitions to be generated and added to the chain.
This gives a ~10% speed-up on test-js. :^)
When reifying a shape transition chain, look for the nearest previous
shape in the transition chain that has a property table already, and
use that as the starting point.
This achieves two things:
1. We do less work when reifying property tables that already have
partial property tables earlier in the chain.
2. This enables adding properties to a shape without performing a
transition. This will be useful for initializing runtime objects
with way fewer allocations. See next patch. :^)
There's no point in trying to achieve shape sharing for global objects,
so we can simply make the shape unique from the start and avoid making
a transition chain.
Previously whenever you would ask a Shape how many properties it had,
it would reify the property table into a HashMap and use HashMap::size()
to answer the question.
This can be a huge waste of time if we don't need the property table for
anything else, so this patch implements property count tracking in a
separate integer member of Shape. :^)
Since blocks can't be strict by themselves, it makes no sense for them
to store whether or not they are strict. Strict-ness is now stored in
the Program and FunctionNode ASTNodes. Fixes issue #3641
Each JS global object has its own "console", so it makes more sense to
store it in GlobalObject.
We'll need some smartness later to bundle up console messages from all
the different frames that make up a page later, but this works for now.
More work on decoupling the general runtime from Interpreter. The goal
is becoming clearer. Interpreter should be one possible way to execute
code inside a VM. In the future we might have other ways :^)
Okay, my vision here is improving. Interpreter should be a thing that
executes an AST. The scope stack is irrelevant to the VM proper,
so we can move that to the Interpreter. Same with execute_statement().
This patch moves the exception state, call stack and scope stack from
Interpreter to VM. I'm doing this to help myself discover what the
split between Interpreter and VM should be, by shuffling things around
and seeing what falls where.
With these changes, we no longer have a persistent lexical environment
for the current global object on the Interpreter's call stack. Instead,
we push/pop that environment on Interpreter::run() enter/exit.
Since it should only be used to find the global "this", and not for
variable storage (that goes directly into the global object instead!),
I had to insert some short-circuiting when walking the environment
parent chain during variable lookup.
Note that this is a "stepping stone" commit, not a final design.
Empty string is extremely common and we can avoid a lot of heap churn
by simply caching one in the VM. Primitive strings are immutable anyway
so there is no observable behavior change outside of fewer collections.
To make it a little clearer what this is for. (This is an RAII helper
class for adding and removing an Interpreter to a VM's list of the
currently active (executing code) Interpreters.)
Taking a big step towards a world of multiple global object, this patch
adds a new JS::VM object that houses the JS::Heap.
This means that the Heap moves out of Interpreter, and the same Heap
can now be used by multiple Interpreters, and can also outlive them.
The VM keeps a stack of Interpreter pointers. We push/pop on this
stack when entering/exiting execution with a given Interpreter.
This allows us to make this change without disturbing too much of
the existing code.
There is still a 1-to-1 relationship between Interpreter and the
global object. This will change in the future.
Ultimately, the goal here is to make Interpreter a transient object
that only needs to exist while you execute some code. Getting there
will take a lot more work though. :^)
Note that in LibWeb, the global JS::VM is called main_thread_vm(),
to distinguish it from future worker VM's.
Shape was allocating property tables inside visit_children(), which
could cause garbage collection to happen. It's not very good to start
a new garbage collection while you are in the middle of one already.
In the case of an exception in a property getter function we would not
return early, and a subsequent attempt to call the replacer function
would crash the interpreter due to call_internal() asserting.
Fixes#3548.
Interpreter::run() was so far being used both as the "public API entry
point" for running a JS::Program as well as internally to execute
JS::Statement|s of all kinds - this is now more distinctly separated.
A program as returned by the parser is still going through run(), which
is responsible for creating the initial global call frame, but all other
statements are executed via execute_statement() directly.
Fixes#3437, a regression introduced by adding ASSERT(!exception()) to
run() without considering the effects that would have on internal usage.
Looks like an oversight to me - we were not actually setting a new value
for m_array_size, which would cause arrays created with generic storage
to report a .length of 0.
This is already considered in put()/insert()/append_all() but not
set_array_like_size(), which crashed the interpreter with an assertion
when creating an array with more than SPARSE_ARRAY_THRESHOLD (200)
initial elements as the simple storage was being resized beyond its
limit.
Fixes#3382.