These were out-of-line because we had some ideas about marking
instruction streams PROT_READ only, but that seems pretty arbitrary and
there's a lot of performance to be gained by putting these inline.
Instead of storing source offsets with each instruction, we now keep
them in a side table in Executable.
This shrinks each instruction by 8 bytes, further improving locality.
Instead of keeping bytecode as a set of disjoint basic blocks on the
malloc heap, bytecode is now a contiguous sequence of bytes(!)
The transformation happens at the end of Bytecode::Generator::generate()
and the only really hairy part is rerouting jump labels.
This required solving a few problems:
- The interpreter execution loop had to change quite a bit, since we
were storing BasicBlock pointers all over the place, and control
transfer was done by redirecting the interpreter's current block.
- Exception handlers & finalizers are now stored per-bytecode-range
in a side table in Executable.
- The interpreter now has a plain program counter instead of a stream
iterator. This actually makes error stack generation a bit nicer
since we just have to deal with a number instead of reaching into
the iterator.
This yields a 25% performance improvement on this microbenchmark:
for (let i = 0; i < 1_000_000; ++i) { }
But basically everything gets faster. :^)
Instead of emitting a NewString instruction to construct a primitive
string from a parsed literal, we now instantiate the PrimitiveString on
the heap during codegen.
The JIT compiler was an interesting experiment, but ultimately the
security & complexity cost of doing arbitrary code generation at runtime
is far too high.
In subsequent commits, the bytecode format will change drastically, and
instead of rewriting the JIT to fit the new bytecode, this patch simply
removes the JIT instead.
Other engines, JavaScriptCore in particular, have already proven that
it's possible to handle the vast majority of contemporary web content
with an interpreter. They are currently ~5x faster than us on benchmarks
when running without a JIT. We need to catch up to them before
considering performance techniques with a heavy security cost.
This works by walking a backtrace until the currently executing
native executable is found, and then mapping the native address
to its bytecode instruction.
Previously every file that included Executable.h (which is pretty much
most LibJS and LibHTML files, given that VM.h needs it) had the whole
definition of LibRegex, which was slowing down source parsing.
This is a specialized string table for storing identifiers only.
Identifiers are always FlyStrings, which makes many common operations
faster by allowing O(1) comparison.