Instead of emitting a NewString instruction to construct a primitive
string from a parsed literal, we now instantiate the PrimitiveString on
the heap during codegen.
The JIT compiler was an interesting experiment, but ultimately the
security & complexity cost of doing arbitrary code generation at runtime
is far too high.
In subsequent commits, the bytecode format will change drastically, and
instead of rewriting the JIT to fit the new bytecode, this patch simply
removes the JIT instead.
Other engines, JavaScriptCore in particular, have already proven that
it's possible to handle the vast majority of contemporary web content
with an interpreter. They are currently ~5x faster than us on benchmarks
when running without a JIT. We need to catch up to them before
considering performance techniques with a heavy security cost.
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).
This commit is auto-generated:
$ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
Meta Ports Ladybird Tests Kernel)
$ perl -pie 's/\bDeprecatedString\b/ByteString/g;
s/deprecated_string/byte_string/g' $xs
$ clang-format --style=file -i \
$(git diff --name-only | grep \.cpp\|\.h)
$ gn format $(git ls-files '*.gn' '*.gni')
We previously had a concept of unique shapes, which meant that they
couldn't be shared between multiple objects.
Object shapes became unique in three situations:
- They were the shape of the global object.
- They had more than 100 properties added to them.
- They had one or more properties deleted from them.
Unfortunately, unique shapes presented an annoying problem for inline
caches, and we added a "unique shape serial number" for being able to
tell that a unique shape had been mutated.
This patch gets rid of the concept of unique shapes, simplifying all
the caching code, since inline caches can now simply perform a shape
check and then we're good.
To make this possible, we now have the concept of delete transitions,
which occur when a property is deleted from a shape.
Note that this patch by itself introduces a performance regression in
some situtations, since we now create a lot more shapes, and marking
their property keys can be very heavy. This will be addressed in a
subsequent patch.
This works by walking a backtrace until the currently executing
native executable is found, and then mapping the native address
to its bytecode instruction.
This is currently only used in the bytecode dump to annotate to where
unwinds lead per block, but will be hooked up to the virtual machine in
the next commit.
Previously every file that included Executable.h (which is pretty much
most LibJS and LibHTML files, given that VM.h needs it) had the whole
definition of LibRegex, which was slowing down source parsing.
This works by adding source start/end offset to every bytecode
instruction. In the future we can make this more efficient by keeping
a map of bytecode ranges to source ranges in the Executable instead,
but let's just get traces working first.
Co-Authored-By: Andrew Kaster <akaster@serenityos.org>
The RegExpLiteral AST node already has the parsed regex::Parser::Result
so let's plumb that over to the bytecode executable instead of reparsing
the regex every time NewRegExp is executed.
~12% speed-up on language/literals/regexp/S7.8.5_A2.1_T2.js in test262.
Using a special instruction to access global variables allows skipping
the environment chain traversal for them and going directly to the
module/global environment. Currently, this instruction only caches the
offset for bindings that belong to the global object environment.
However, there is also an opportunity to cache the offset in the global
declarative record.
This change results in a 57% increase in speed for
imaging-gaussian-blur.js in Kraken.
Since we can't rely on shape identity (i.e its pointer address) for
unique shapes, give them a serial number that increments whenever a
mutation occurs.
Inline caches can then compare this serial number against what they
have seen before.
The instructions GetById and GetByIdWithThis now remember the last-seen
Shape, and if we see the same object again, we reuse the property offset
from last time without doing a new lookup.
This allows us to use Object::get_direct(), bypassing the entire lookup
machinery and saving lots of time.
~23% speed-up on Kraken/ai-astar.js :^)
DeprecatedFlyString relies heavily on DeprecatedString's StringImpl, so
let's rename it to A) match the name of DeprecatedString, B) write a new
FlyString class that is tied to String.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
An executable is generated for the top-level script and for each
function. Strict mode can only be changed with the first statement of
the top-level script and each function, which corresponds directly to
Executable.
This is a specialized string table for storing identifiers only.
Identifiers are always FlyStrings, which makes many common operations
faster by allowing O(1) comparison.