1.25x speed-up on this microbenchmark:
let o = { get x() { return 1; } };
for (let i = 0; i < 10_000_000; ++i)
    o.x;
I looked into this because I noticed getter invocations showing up
while profiling long-running WPT tests. We already had the caching
mechanism for non-getter properties, and extending it to cover getters
turned out to be trivial.
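Roughly, the idea is something like this (a sketch with made-up names,
not the actual LibJS cache structures): the lookup cache remembers the
object's shape and, when the property turned out to be an accessor, the
getter itself, so repeated accesses can invoke it directly:
#include <cstddef>
// Illustrative only; Shape, Value, FunctionObject and PropertyLookupCache
// are hypothetical stand-ins for the real types.
struct Object;
struct Value { };
struct FunctionObject {
    Value call(Object& this_object);
};
struct Shape { };
struct Object {
    Shape const* shape { nullptr };
    // ... property storage ...
};
struct PropertyLookupCache {
    Shape const* cached_shape { nullptr };
    std::size_t offset { 0 };            // slot for plain data properties
    FunctionObject* getter { nullptr };  // non-null if the property is an accessor
};
Value read_data_property(Object&, std::size_t offset);
Value generic_get(Object&, PropertyLookupCache&); // slow path, repopulates the cache
Value cached_get(Object& object, PropertyLookupCache& cache)
{
    if (object.shape == cache.cached_shape) {
        // Cache hit: call the remembered getter, or read the slot directly.
        if (cache.getter)
            return cache.getter->call(object);
        return read_data_property(object, cache.offset);
    }
    return generic_get(object, cache);
}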
The typeof operator has a very small set of possible resulting strings,
so let's make it much faster by caching those strings on the VM.
~8x speed-up on this microbenchmark:
for (let i = 0; i < 10_000_000; ++i) {
    typeof i;
}
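What the caching amounts to, sketched with hypothetical names rather
than the actual VM API: the VM owns one string per possible `typeof`
result, created once, so evaluating `typeof` never has to allocate:
#include <array>
#include <cstddef>
#include <string>
enum class TypeofTag { Undefined, Object, Boolean, Number, String, Symbol, BigInt, Function };
struct VM {
    // One cached string per possible result of `typeof`; index matches TypeofTag.
    std::array<std::string, 8> cached_typeof_strings {
        "undefined", "object", "boolean", "number",
        "string", "symbol", "bigint", "function"
    };
};
// The Typeof instruction classifies its operand and hands back the cached string.
std::string const& typeof_string(VM& vm, TypeofTag tag)
{
    return vm.cached_typeof_strings[static_cast<std::size_t>(tag)];
}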
If no header declares a prototype for a function, then it cannot be
used from outside the translation unit it was defined in. In that case,
it should be marked as `static`, in order to avoid possible ODR
problems and unnecessary exported symbols, and to allow the compiler to
optimize it better.
If this warning triggers for a function defined in a header, `inline`
needs to be added; otherwise, if the header is included in more than
one TU, linking will fail with a duplicate definition error.
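For example (file and function names here are just illustrative):
// foo.cpp -- a helper only used inside this translation unit gets
// internal linkage via `static` (or an anonymous namespace).
static int helper(int x)
{
    return x * 2;
}
int public_api(int x) // declared in some header, usable from other TUs
{
    return helper(x);
}
// util.h -- a function *defined* in a header must be `inline`; otherwise
// every .cpp file including the header emits its own definition and the
// link fails with a duplicate definition error.
inline int clamp_to_byte(int x)
{
    return x < 0 ? 0 : x > 255 ? 255 : x;
}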
The reason this diff got so big is that Lagom-only code wasn't built
with this flag even in Serenity times.
We now defer looking up the various identifiers by IdentifierTableIndex
until the last possible moment. This allows us to avoid the lookup
entirely in common cases, such as when a property access hits the cache.
Knocks a ~12% item off the profile on https://ventrella.com/Clusters/
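Conceptually it looks like this (hypothetical types, not the actual
LibJS ones): the instruction only carries the table index, and the name
is fetched from the identifier table on the slow path alone:
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>
using IdentifierTableIndex = std::uint32_t;
struct IdentifierTable {
    std::vector<std::string> strings;
    std::string const& get(IdentifierTableIndex index) const { return strings[index]; }
};
struct Value { };
struct Shape { };
struct Object { Shape const* shape { nullptr }; };
struct Cache { Shape const* cached_shape { nullptr }; std::size_t offset { 0 }; };
Value read_slot(Object&, std::size_t offset);
Value generic_get(Object&, std::string const& property_name, Cache&);
Value get_by_id(Object& object, IdentifierTableIndex index, Cache& cache, IdentifierTable const& table)
{
    if (object.shape == cache.cached_shape)
        return read_slot(object, cache.offset);          // fast path: the name is never needed
    return generic_get(object, table.get(index), cache); // slow path: resolve the name now
}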
When resuming execution of a suspended function, we must not overwrite
any cached `this` value with something from the ExecutionContext.
This was causing an empty JS::Value to leak into the VM when resuming
an async arrow function, since the "this mode" for such functions is
lexical, and the ExecutionContext therefore has an empty `this`.
It became a problem due to the bytecode optimization where we allow
ourselves to assume that `this` remains cached after we've executed a
ResolveThisBinding in the current (or the first) basic block of the
executable.
Fixes https://github.com/LadybirdBrowser/ladybird/issues/138
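The fix boils down to something like this (hypothetical names, not the
real resume path):
struct Value {
    bool is_empty() const { return empty; }
    bool empty { true };
};
struct ExecutionContext {
    Value this_value; // stays empty for functions whose `this` is lexical
};
struct RegisterFile {
    Value this_register;
};
// When resuming a suspended function, only take `this` from the
// ExecutionContext if it actually has one; otherwise keep the value
// that was cached before the suspension.
void restore_this_on_resume(RegisterFile& registers, ExecutionContext const& context)
{
    if (!context.this_value.is_empty())
        registers.this_register = context.this_value;
}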
Instead of displaying locals as "locN", we now show them as "name~N".
This makes it a lot easier to follow bytecode dumps, especially in
longer functions.
Note that we keep displaying the local index, to avoid confusion in case
there are multiple separate locals with the same name in one executable.
Functions that don't have a FunctionEnvironment will get their `this`
value from the ExecutionContext. This patch stops generating
ResolveThisBinding instructions at all for functions like that, and
instead pre-populates the `this` register when entering a new bytecode
executable.
We already have a dedicated register slot for `this`, so instead of
having ResolveThisBinding take a `dst` operand, just write the value
directly into the `this` register every time.
Allows us to skip function environment allocation for non-arrow
functions when the only reason it is needed is to hold the `this`
binding.
The parser is changed to do the following:
- If a function is an arrow function and uses `this`, then all functions
  in its scope chain are marked to allocate a function environment for
  the `this` binding.
- If a function uses `new.target`, then all functions in its scope chain
  are marked to allocate a function environment.
`ordinary_call_bind_this()` is changed to put the `this` value in the
execution context when function environment allocation is skipped.
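Sketched with made-up types (not the real LibJS signatures), the
`ordinary_call_bind_this()` change amounts to this:
struct Value { };
struct FunctionEnvironment {
    Value this_value;
};
struct ExecutionContext {
    FunctionEnvironment* function_environment { nullptr };
    Value this_value;
};
struct FunctionInfo {
    bool needs_function_environment { false }; // decided by the parser
};
FunctionEnvironment* allocate_function_environment();
void bind_this(FunctionInfo const& function, ExecutionContext& context, Value this_value)
{
    if (function.needs_function_environment) {
        context.function_environment = allocate_function_environment();
        context.function_environment->this_value = this_value;
    } else {
        // Environment allocation skipped: `this` lives on the execution context.
        context.this_value = this_value;
    }
}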
35% improvement in Octane/typescript.js
50% improvement in Octane/deltablue.js
19% improvement in Octane/raytrace.js
This allows us to skip allocating a function environment in cases where
it was previously impossible because the arguments object needed a
binding.
This change does not bring a visible improvement in the Kraken or
Octane benchmarks, but it seems useful to have anyway.
By doing this, we can remove all the special-case checks for `length`
from the generic GetById and GetByIdWithThis code paths, making every
other property lookup a bit faster.
Small progressions on most benchmarks, and some larger progressions:
- 12% on Octane/crypto.js
- 6% on Kraken/ai-astar.js
It wasn't safe to use addition_would_overflow(a, -b) to check whether
the subtraction (a - b) would overflow, since negating b can itself
overflow when b is the minimum representable value, so that case wasn't
covered.
I don't know why we didn't have subtraction_would_overflow(), so this
patch adds it. :^)
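In other words (sketched with standalone helpers using the GCC/Clang
overflow builtins, not the AK versions):
#include <cstdint>
bool addition_would_overflow(std::int64_t a, std::int64_t b)
{
    std::int64_t result;
    return __builtin_add_overflow(a, b, &result);
}
bool subtraction_would_overflow(std::int64_t a, std::int64_t b)
{
    std::int64_t result;
    // Checks a - b directly, without ever negating b. The old pattern
    // addition_would_overflow(a, -b) goes wrong when b is INT64_MIN,
    // because -b itself overflows before the check even runs.
    return __builtin_sub_overflow(a, b, &result);
}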
With this, only `ContinuePendingUnwind` needs to dynamically check
whether a scheduled return needs to go through a `finally` block, making
the interpreter loop a bit nicer.
Instead of SetVariable having 2x2 modes for variable/lexical and
initialize/set, those 4 modes are now separate instructions, which
makes each instruction much less branchy.
This means that SetVariable instructions will now remember which
(relative) environment contains the targeted binding, letting them
bypass the full binding resolution machinery on subsequent accesses.
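Conceptually the per-instruction cache looks something like this
(hypothetical types; the slot index shown here is an assumption):
#include <cstddef>
#include <optional>
struct Value { };
struct EnvironmentCoordinate {
    std::size_t hops { 0 };  // how many parent environments to walk up
    std::size_t index { 0 }; // binding slot within that environment
};
struct Environment {
    Environment* parent { nullptr };
    Value& binding_at(std::size_t index);
};
EnvironmentCoordinate resolve_binding_slowly(Environment&, char const* name);
// The full resolution runs once; later executions just walk a known
// number of environments and hit a known slot.
void set_variable(Environment& current, char const* name, Value value,
                  std::optional<EnvironmentCoordinate>& cache)
{
    if (!cache.has_value())
        cache = resolve_binding_slowly(current, name);
    auto* environment = &current;
    for (std::size_t i = 0; i < cache->hops; ++i)
        environment = environment->parent;
    environment->binding_at(cache->index) = value;
}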
Merging registers, constants and locals into a single vector means:
- Better data locality
- No need to check the operand type in Interpreter::get() and
  Interpreter::set(), which are very hot functions
A performance improvement is visible in almost all Octane and Kraken
tests.
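The layout change is roughly this (hypothetical names):
#include <cstddef>
#include <vector>
struct Value { };
struct Operand {
    std::size_t index { 0 }; // already offset into the right partition at codegen time
};
struct Interpreter {
    // One contiguous vector: [registers][constants][locals].
    std::vector<Value> registers_constants_and_locals;
    // get()/set() become a plain indexed access with no per-operand type dispatch.
    Value get(Operand operand) const { return registers_constants_and_locals[operand.index]; }
    void set(Operand operand, Value value) { registers_constants_and_locals[operand.index] = value; }
};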
When comparing two numbers, we can avoid a lot of implicit type
conversion nonsense and go straight to comparison, saving time in the
most common case.
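The fast path is essentially this (with a hypothetical Value type):
struct Value {
    bool is_number() const { return has_number; }
    double as_double() const { return number; }
    bool has_number { false };
    double number { 0 };
};
bool generic_less_than(Value lhs, Value rhs); // full abstract relational comparison
bool less_than(Value lhs, Value rhs)
{
    // Both operands are already numbers in the vast majority of cases,
    // so skip the ToPrimitive/ToNumeric machinery and compare directly.
    if (lhs.is_number() && rhs.is_number())
        return lhs.as_double() < rhs.as_double();
    return generic_less_than(lhs, rhs);
}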
These were out-of-line because we had some ideas about marking
instruction streams PROT_READ only, but that seems pretty arbitrary and
there's a lot of performance to be gained by putting these inline.
Instead, generate bytecode to execute their AST nodes and save the
resulting operands inside the NewClass instruction.
Moving property expression evaluation to happen before NewClass
execution also moves the creation of the new private environment and
its population with private members ahead of it (private members need
to be visible during property evaluation).
Before:
- NewClass
After:
- CreatePrivateEnvironment
- AddPrivateName
- ...
- AddPrivateName
- NewClass
- LeavePrivateEnvironment
If the minimal number of required bindings is known in advance, it can
be used to ensure capacity up front, avoiding resizes of the internal
vector that holds the bindings.
By doing that, all instructions required for instantiation are emitted
once during compilation and then reused for subsequent calls, instead of
running the generic instantiation process on each call.
We now fuse sequences like [LessThan, JumpIf] to JumpLessThan.
This is only allowed for temporaries (i.e. VM registers) with no other
references to them.
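A rough sketch of the peephole rewrite (hypothetical instruction
representation, and only the one fused pair shown for brevity):
#include <cstddef>
#include <vector>
enum class Op { LessThan, JumpIf, JumpLessThan, Nop };
struct Instruction {
    Op op { Op::Nop };
    std::size_t dst { 0 };       // result register of a comparison
    std::size_t condition { 0 }; // register tested by JumpIf
    std::size_t lhs { 0 };
    std::size_t rhs { 0 };
};
// Assumed analysis: true only for VM temporaries with no other uses.
bool is_single_use_temporary(std::size_t register_index);
void fuse_compare_and_jump(std::vector<Instruction>& code)
{
    for (std::size_t i = 0; i + 1 < code.size(); ++i) {
        auto& compare = code[i];
        auto& jump = code[i + 1];
        if (compare.op == Op::LessThan && jump.op == Op::JumpIf
            && jump.condition == compare.dst
            && is_single_use_temporary(compare.dst)) {
            // Rewrite the pair into a single JumpLessThan and drop the compare.
            jump.op = Op::JumpLessThan;
            jump.lhs = compare.lhs;
            jump.rhs = compare.rhs;
            compare.op = Op::Nop;
        }
    }
}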
This removes a layer of indirection in the bytecode, where we had to
make sure all the initializer elements were laid out in sequential
registers. Array expressions no longer clobber registers permanently,
and the registers can be reused immediately afterwards.