By doing that all instructions required for instantiation are emitted
once in compilation and then reused for subsequent calls, instead of
running generic instantiation process for each call.
Instead of keeping bytecode as a set of disjoint basic blocks on the
malloc heap, bytecode is now a contiguous sequence of bytes(!)
The transformation happens at the end of Bytecode::Generator::generate()
and the only really hairy part is rerouting jump labels.
This required solving a few problems:
- The interpreter execution loop had to change quite a bit, since we
were storing BasicBlock pointers all over the place, and control
transfer was done by redirecting the interpreter's current block.
- Exception handlers & finalizers are now stored per-bytecode-range
in a side table in Executable.
- The interpreter now has a plain program counter instead of a stream
iterator. This actually makes error stack generation a bit nicer
since we just have to deal with a number instead of reaching into
the iterator.
This yields a 25% performance improvement on this microbenchmark:
for (let i = 0; i < 1_000_000; ++i) { }
But basically everything gets faster. :^)
If a function has the following properties:
- uses only local variables and registers
- does not use `this`
- does not use `new.target`
- does not use `super`
- does not use direct eval() calls
then it is possible to entirely skip function environment allocation
because it will never be used
This change adds gathering of information whether a function needs to
access `this` from environment and updates `prepare_for_ordinary_call()`
to skip allocation when possible.
For now, this optimisation is too aggressively blocked; e.g. if `this`
is used in a function scope, then all functions in outer scopes have to
allocate an environment. It could be improved in the future, although
this implementation already allows skipping >80% of environment
allocations on Discord, GitHub and Twitter.
Before this change both ExecutionContext and CallFrame were created
before executing function/module/script with a couple exceptions:
- executable created for default function argument evaluation has to
run in function's execution context.
- `execute_ast_node()` where executable compiled for ASTNode has to be
executed in running execution context.
This change moves all members previously owned by CallFrame into
ExecutionContext, and makes two exceptions where an executable that does
not have a corresponding execution context saves and restores registers
before running.
Now, all execution state lives in a single entity, which makes it a bit
easier to reason about and opens opportunities for optimizations, such
as moving registers and local variables into a single array.
This patch moves us away from the accumulator-based bytecode format to
one with explicit source and destination registers.
The new format has multiple benefits:
- ~25% faster on the Kraken and Octane benchmarks :^)
- Fewer instructions to accomplish the same thing
- Much easier for humans to read(!)
Because this change requires a fundamental shift in how bytecode is
generated, it is quite comprehensive.
Main implementation mechanism: generate_bytecode() virtual function now
takes an optional "preferred dst" operand, which allows callers to
communicate when they have an operand that would be optimal for the
result to go into. It also returns an optional "actual dst" operand,
which is where the completion value (if any) of the AST node is stored
after the node has "executed".
One thing of note that's new: because instructions can now take locals
as operands, this means we got rid of the GetLocal instruction.
A side-effect of that is we have to think about the temporal deadzone
(TDZ) a bit differently for locals (GetLocal would previously check
for empty values and interpret that as a TDZ access and throw).
We now insert special ThrowIfTDZ instructions in places where a local
access may be in the TDZ, to maintain the correct behavior.
There are a number of progressions and regressions from this test:
A number of async generator tests have been accidentally fixed while
converting the implementation to the new bytecode format. It didn't
seem useful to preserve bugs in the original code when converting it.
Some "does eval() return the correct completion value" tests have
regressed, in particular ones related to propagating the appropriate
completion after control flow statements like continue and break.
These are all fairly obscure issues, and I believe we can continue
working on them separately.
The net test262 result is a progression though. :^)
There was recently a normative change to this AO in ECMA-262. See:
https://github.com/tc39/ecma262/commit/5eaee2f
It turns out we already implemented this to align with web-reality
before it was codified in the spec. This was a bit difficult to reason
without spec text and with a somewhat ad-hoc implementation. So this
patch aligns our implementation with the spec. There should not be any
behavior change.
In a bunch of cases, this actually ends up simplifying the code as
to_number will handle something such as:
```
Optional<I> opt;
if constexpr (IsSigned<I>)
opt = view.to_int<I>();
else
opt = view.to_uint<I>();
```
For us.
The main goal here however is to have a single generic number conversion
API between all of the String classes.
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).
This commit is auto-generated:
$ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
Meta Ports Ladybird Tests Kernel)
$ perl -pie 's/\bDeprecatedString\b/ByteString/g;
s/deprecated_string/byte_string/g' $xs
$ clang-format --style=file -i \
$(git diff --name-only | grep \.cpp\|\.h)
$ gn format $(git ls-files '*.gn' '*.gni')
In particular, this patch focuses on:
- Updating the old "import assertions" to the new "import attributes"
- Allowing realms as module import referrer
The number of registers in a call frame never changes, so we can
allocate it at the end of the CallFrame object and save ourselves the
cost of allocating separate Vector storage for every call frame.
Instead of allocating these in a mixture of ways, we now always put
them on the malloc heap, and keep an intrusive linked list of them
that we can iterate for GC marking purposes.
(Instead of MarkedVector<Value>.) This is a step towards not storing
argument lists in MarkedVector<Value> at all. Note that they still end
up in MarkedVectors since that's what ExecutionContext has.
These functions all have a very common case that can be dealt with a
very simple inline check, often avoiding the need to call an out-of-line
function. This patch moves the common case to inline functions in a new
ValueInlines.h header (necessary due to header dependency issues..)
8% speed-up on the entire Kraken benchmark :^)
When a substitution refers to a 2-digit capture group that doesn't exist
we need to check if the first digit refers to an existing capture group.
In other words, '$10' should be treated as capture group #1, followed by
the literal '0' if 1 is a valid capture group but 10 is not.
This makes the Dromaeo "dom-query" subtest run to completion.
Stop worrying about tiny OOMs. Work towards #20449.
While going through these, I also changed the function signature in many
places where returning ThrowCompletionOr<T> is no longer necessary.
This is part of an old normative change that happened soon after
Andreas made `super` closer to spec in 1270df2.
See https://github.com/tc39/ecma262/pull/2267/
This was introduced into bytecode by virtue of copy and paste :^)
Bytecode results:
Summary:
Diff Tests:
+2 ✅ -2 ❌
Saving vector of local variables names in ECMAScriptFunctionObject
will allow to get a name by index in case message of ReferenceError
needs to contain a variable name.
The JS::VM now owns the one Bytecode::Interpreter. We no longer have
multiple bytecode interpreters, and there is no concept of a "current"
bytecode interpreter.
If you ask for VM::bytecode_interpreter_if_exists(), it will return null
if we're not running the program in "bytecode enabled" mode.
If you ask for VM::bytecode_interpreter(), it will return a bytecode
interpreter in all modes. This is used for situations where even the AST
interpreter switches to bytecode mode (generators, etc.)
Some of these are allocated upon initialization of the intrinsics, and
some lazily, but in neither case the getters actually return a nullptr.
This saves us a whole bunch of pointer dereferences (as NonnullGCPtr has
an `operator T&()`), and also has the interesting side effect of forcing
us to explicitly use the FunctionObject& overload of call(), as passing
a NonnullGCPtr is ambigous - it could implicitly be turned into a Value
_or_ a FunctionObject& (so we have to dereference manually).
This includes an Error::create overload to create an Error from a UTF-8
StringView. If creating a String from that view fails, the factory will
return an OOM InternalError instead. VM::throw_completion can also make
use of this overload via its perfect forwarding.
In this patch only top level and not the more complicated for loop using
statements are supported. Also, as noted in the latest meeting of tc39
async parts of the spec are not stage 3 thus not included.
DeprecatedFlyString relies heavily on DeprecatedString's StringImpl, so
let's rename it to A) match the name of DeprecatedString, B) write a new
FlyString class that is tied to String.
This makes construction of Utf16String fallible in OOM conditions. The
immediate impact is that PrimitiveString must then be fallible as well,
as it may either transcode UTF-8 to UTF-16, or create a UTF-16 string
from ropes.
There are a couple of places where it is very non-trivial to propagate
the error further. A FIXME has been added to those locations.