Previously, throw and return completions were not performed inside
the generator. This was incorrect, as throw and return need to perform
unwinds which can potentially execute more code inside the generator,
such as finally blocks.
This is done by also passing the completion type alongside the passed-in
value. The continuation block immediately extracts the type and value and
performs the appropriate operation for the given type.
For normal completions, execution continues as normal.
For throw completions, it will perform `throw <value>`.
For return completions, it will perform `return <value>`, which is a
`Yield return` in this case due to being inside a generator.
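A rough sketch of that dispatch, with made-up names (this is not the actual LibJS code):
```
// Illustrative only: CompletionType and resume_generator() are assumptions,
// not the real GeneratorObject machinery.
#include <cstdio>

enum class CompletionType { Normal, Throw, Return };

struct Completion {
    CompletionType type;
    int value; // stand-in for a JS value
};

void resume_generator(Completion const& completion)
{
    // The continuation block first unpacks the type and value...
    switch (completion.type) {
    case CompletionType::Normal:
        // ...and for normal completions simply continues executing.
        std::printf("continue with %d\n", completion.value);
        break;
    case CompletionType::Throw:
        // For throw completions it performs `throw <value>`, which runs any
        // enclosing finally blocks while unwinding.
        std::printf("throw %d\n", completion.value);
        break;
    case CompletionType::Return:
        // For return completions it performs `return <value>`; inside a
        // generator this lowers to a `Yield return`.
        std::printf("yield-return %d\n", completion.value);
        break;
    }
}

int main()
{
    resume_generator({ CompletionType::Throw, 42 });
}
```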
This also refactors GeneratorObject to properly send the completion type
and value across to the generator instead of trying to operate on the
completions itself.
This is a prerequisite for yield*, as it performs special iterator
operations when receiving a throw/return completion and does not
complete the generator like the regular yield would.
There's still more work to be done to make GeneratorObject::execute
be closer to the spec. It's mostly a restructuring of the existing
GeneratorObject::next_impl.
An executable is generated for the top-level script and for each
function. Strict mode can only be changed with the first statement of
the top-level script and each function, each of which corresponds
directly to an Executable.
This is done by keeping track of all the labels that apply to a given
break/continue scope alongside their bytecode target. When a
break/continue with a label is generated, we scan from the innermost
scope to the outermost scope looking for the label, performing any
necessary unwinds on the way. Once the label is found, it is then
jumped to.
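Roughly, the lookup works like this (the types and names here are illustrative, not the real generator internals):
```
// A sketch of scanning label scopes from innermost to outermost,
// emitting unwinds for every scope we leave along the way.
#include <cstdio>
#include <string>
#include <vector>

struct BreakableScope {
    std::vector<std::string> labels; // labels that apply to this scope
    int target_block;                // bytecode target to jump to
    bool needs_unwind;               // whether leaving this scope requires an unwind
};

int resolve_labelled_break(std::vector<BreakableScope> const& scopes, std::string const& label)
{
    // Scan from the innermost scope (back of the vector) to the outermost.
    for (auto it = scopes.rbegin(); it != scopes.rend(); ++it) {
        for (auto const& candidate : it->labels) {
            if (candidate == label)
                return it->target_block; // found the label: jump here
        }
        // Not in this scope: emit any unwind needed to leave it, keep looking.
        if (it->needs_unwind)
            std::puts("emit unwind for this scope");
    }
    return -1; // label not found (caught earlier as a syntax error)
}

int main()
{
    std::vector<BreakableScope> scopes = {
        { { "outer" }, 1, true },
        { {}, 2, false },
    };
    std::printf("jump to block %d\n", resolve_labelled_break(scopes, "outer"));
}
```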
`delete` has to operate directly on Reference Records, so this
introduces a new set of operations called DeleteByValue, DeleteVariable
and DeleteById. They operate similarly to their Get counterparts,
except they end by creating a (temporary) Reference and calling delete_
on it.
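The shared shape of these operations, sketched with stand-in types (not the real LibJS classes):
```
// Illustrative only: a DeleteById-style op resolves its base, builds a
// temporary Reference, and deletes through it.
#include <cstdio>
#include <map>
#include <string>

struct Reference {
    std::map<std::string, int>* base { nullptr }; // stand-in for the base object
    std::string referenced_name;

    // Mirrors the idea of deleting through a Reference: remove the named
    // property from the base and report whether the deletion succeeded.
    bool delete_()
    {
        if (!base)
            return true; // unresolvable references delete "successfully"
        return base->erase(referenced_name) > 0;
    }
};

bool delete_by_id(std::map<std::string, int>& object, std::string const& property)
{
    Reference reference { &object, property };
    return reference.delete_();
}

int main()
{
    std::map<std::string, int> object { { "x", 1 } };
    std::printf("deleted: %s\n", delete_by_id(object, "x") ? "true" : "false");
}
```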
When calling emit_load_from_reference with a MemberExpression, it is
only necessary to store the result of evaluating MemberExpression's
object when performing computed property lookup.
This allows us to skip unnecessary stores for identifier lookup.
For example, this would generate 3 unnecessary stores:
```
> Temporal.PlainDateTime.prototype.add
JS::Bytecode::Executable (REPL)
1:
[ 0] GetVariable 0 (Temporal)
[ 28] Store $2
[ 30] GetById 1 (PlainDateTime)
[ 40] Store $3
[ 48] GetById 2 (prototype)
[ 58] Store $4
[ 60] GetById 3 (add)
```
With this, it generates:
```
> Temporal.PlainDateTime.prototype.add
JS::Bytecode::Executable (REPL)
1:
[ 0] GetVariable 0 (Temporal)
[ 28] GetById 1 (PlainDateTime)
[ 38] GetById 2 (prototype)
[ 48] GetById 3 (add)
```
When we reach a block-terminating instruction (e.g. Break, Throw),
we cannot generate any more instructions after it. This previously meant
we could not emit the instructions needed to leave any lexical/variable
environments.
This uses the mechanism introduced in ba9c49 to unwind environments
when we encounter these instructions.
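As a rough illustration of the idea (hypothetical names, not the real generator API):
```
// Before emitting a block-terminating instruction, first emit the ops that
// leave any still-open environments, since nothing after it will ever run.
#include <cstdio>
#include <vector>

enum class EnvironmentKind { Lexical, Variable };

struct Generator {
    std::vector<EnvironmentKind> open_environments;

    void emit_leave_environment(EnvironmentKind kind)
    {
        std::printf("LeaveEnvironment %s\n",
            kind == EnvironmentKind::Lexical ? "lexical" : "variable");
    }

    void emit_terminator(char const* name)
    {
        // Unwind innermost-first, then emit the terminator itself.
        for (auto it = open_environments.rbegin(); it != open_environments.rend(); ++it)
            emit_leave_environment(*it);
        std::printf("%s\n", name);
    }
};

int main()
{
    Generator generator;
    generator.open_environments = { EnvironmentKind::Lexical, EnvironmentKind::Variable };
    generator.emit_terminator("Throw");
}
```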
Instead of crashing on the spot, return a descriptive error that will
eventually continue its days as a JavaScript "InternalError" exception.
This should make random crashes in the bytecode interpreter less likely.
Using an Optional was extremely wasteful for function objects that don't
even have a bytecode executable.
This allows ECMAScriptFunctionObject to fit in a smaller size class.
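The size argument, illustrated with std types as stand-ins (FakeExecutable is a made-up placeholder, not the real Bytecode::Executable):
```
// An Optional/optional embeds the whole payload plus a flag, while a smart
// pointer to a heap allocation is only one pointer wide.
#include <cstdio>
#include <memory>
#include <optional>

struct FakeExecutable {
    char bytes[256]; // pretend this is a large executable description
};

int main()
{
    std::printf("sizeof(std::optional<FakeExecutable>)   = %zu\n", sizeof(std::optional<FakeExecutable>));
    std::printf("sizeof(std::unique_ptr<FakeExecutable>) = %zu\n", sizeof(std::unique_ptr<FakeExecutable>));
    // The optional pays for the payload even when the function has no
    // bytecode executable at all; the pointer only pays when one exists.
}
```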
This is a specialized string table for storing identifiers only.
Identifiers are always FlyStrings, which makes many common operations
faster by allowing O(1) comparison.
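A tiny sketch of why interned identifiers compare in O(1); FlyString works along these lines, but this is an illustration, not its implementation:
```
// Each distinct string is stored once, and an identifier is just an index
// into that pool, so equality is an integer comparison.
#include <cstdio>
#include <string>
#include <unordered_map>
#include <vector>

struct IdentifierTable {
    std::vector<std::string> strings;
    std::unordered_map<std::string, size_t> index_of;

    size_t intern(std::string const& string)
    {
        auto it = index_of.find(string);
        if (it != index_of.end())
            return it->second; // already interned: reuse the index
        strings.push_back(string);
        index_of.emplace(string, strings.size() - 1);
        return strings.size() - 1;
    }
};

int main()
{
    IdentifierTable table;
    auto a = table.intern("prototype");
    auto b = table.intern("prototype");
    // O(1): compare indices instead of character-by-character.
    std::printf("same identifier: %s\n", a == b ? "yes" : "no");
}
```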
Instead of using Strings in the bytecode ops, this adds a global string
table to the Executable struct which individual operations can refer
to using indices. This brings bytecode ops one step closer to being
pointer free.
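Roughly what "refer to strings by index" means for an individual op (the layout here is made up for illustration):
```
// The op stores a small integer, and the executable owns the string data.
#include <cstdio>
#include <string>
#include <vector>

struct Executable {
    std::vector<std::string> string_table; // shared by all ops in this executable
};

struct GetByIdOp {
    size_t property_string_index; // index into Executable::string_table

    void dump(Executable const& executable) const
    {
        std::printf("GetById %zu (%s)\n", property_string_index,
            executable.string_table[property_string_index].c_str());
    }
};

int main()
{
    Executable executable { { "PlainDateTime", "prototype", "add" } };
    GetByIdOp op { 1 };
    op.dump(executable); // GetById 1 (prototype)
}
```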
This limits the size of each block (currently set to 1K), and gets us
closer to a canonical, more easily analysable bytecode format.
As a result of this, "Labels" are now simply entries to basic blocks.
Since there is no longer any fallthrough (every jump is always taken,
with conditional jumps naming both a true and a false target),
JumpIf{True,False} are unified into JumpConditional, and JumpIfNullish
is renamed to JumpNullish.
Also fixes #7914 as a result of reimplementing the loop logic.
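An illustrative model of the resulting block shape (not the real LibJS classes):
```
// Blocks have a bounded size, every block ends in a terminator, and a
// conditional jump always names both successor blocks.
#include <cstdio>
#include <vector>

struct BasicBlock {
    static constexpr size_t max_size = 1024; // ~1K per block, as described
    std::vector<unsigned char> buffer;
    bool is_terminated { false };

    bool has_space_for(size_t bytes) const { return buffer.size() + bytes <= max_size; }
};

// No fallthrough: control always transfers to one of the two targets, so
// "Labels" are simply entry points of basic blocks.
struct JumpConditional {
    BasicBlock* true_target;
    BasicBlock* false_target;
};

int main()
{
    BasicBlock consequent;
    BasicBlock alternate;
    std::printf("room left in block: %s\n", consequent.has_space_for(16) ? "yes" : "no");
    JumpConditional jump { &consequent, &alternate };
    std::printf("true/false targets: %p / %p\n",
        static_cast<void*>(jump.true_target), static_cast<void*>(jump.false_target));
}
```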
This change removes the mmap inside of Block in favor of a growing
vector of bytes. This is favorable for two reasons:
- We don't take more space than we need
- There is no limit to the growth of the vector (previously, if
the Block overstepped its 64kb boundary, it would just crash)
However, if that vector happens to resize, any pointer pointing into
that vector would become invalid. To avoid this, this commit adds an
InstructionHandle<Op> class which just stores a block and an offset
into that block.
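The handle idea, sketched with stand-in types:
```
// Instead of holding a raw pointer that a vector resize would invalidate,
// hold the block plus an offset and recompute the address on every access.
#include <cstdio>
#include <new>
#include <vector>

struct Block {
    std::vector<unsigned char> buffer; // grows as instructions are appended
};

template<typename Op>
struct InstructionHandle {
    Block* block { nullptr };
    size_t offset { 0 };

    // Recomputing from block + offset stays valid across reallocations.
    Op* operator->() const { return reinterpret_cast<Op*>(block->buffer.data() + offset); }
};

struct JumpOp {
    int target;
};

int main()
{
    Block block;
    block.buffer.resize(sizeof(JumpOp));
    new (block.buffer.data()) JumpOp { 7 };       // ops live inside the byte buffer
    InstructionHandle<JumpOp> handle { &block, 0 };

    block.buffer.resize(4096);                    // may reallocate the buffer
    std::printf("target = %d\n", handle->target); // still valid: recomputed each access
}
```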
This commit introduces the concept of an accumulator register to
LibJS's bytecode interpreter. The accumulator register is always
register 0, and most simple instructions use it for reading and
writing.
Not only does this slim down the AST, but it also simplifies a lot of
the code. For example, the generate_bytecode methods no longer need
to return an Optional<Register>, as any opcode which has a "return"
value will always put it into the accumulator.
This also renames the old Op::Load to Op::LoadImmediate, and uses
Op::Load to load from a register into the accumulator. There is
also an Op::Store to put the value in the accumulator into another
register.
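A miniature model of the accumulator convention (the op set here is a stand-in, not the full LibJS one):
```
// Register 0 is the accumulator: Load copies a register into it, Store
// copies it out, LoadImmediate sets it, and Add leaves its result in it.
#include <cstdio>
#include <vector>

struct Interpreter {
    std::vector<int> registers = std::vector<int>(8, 0);

    int& accumulator() { return registers[0]; } // always register 0

    void load_immediate(int value) { accumulator() = value; }
    void load(size_t reg) { accumulator() = registers[reg]; }  // reg -> acc
    void store(size_t reg) { registers[reg] = accumulator(); } // acc -> reg
    void add(size_t reg) { accumulator() += registers[reg]; }  // result lands in acc
};

int main()
{
    Interpreter interpreter;
    interpreter.load_immediate(2);
    interpreter.store(1);          // $1 = 2
    interpreter.load_immediate(3); // acc = 3
    interpreter.add(1);            // acc = acc + $1 = 5
    std::printf("accumulator = %d\n", interpreter.accumulator());
}
```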
After compiling bytecode, we should mark the memory read-only.
This currently does not work because it breaks instruction destruction.
I'm adding this anyway with a FIXME so we don't forget about it. :^)
This patch changes the LibJS bytecode to be a stream of instructions
packed one-after-the-other in contiguous memory, instead of a vector
of OwnPtr<Instruction>. This should be a lot more cache-friendly. :^)
Instructions are also devirtualized and instead have a type field
using a new Instruction::Type enum.
To iterate over a bytecode stream, one must now use
Bytecode::InstructionStreamIterator.
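A sketch of how such an iterator can work (illustrative types, not the real Bytecode classes):
```
// Each instruction knows its own length, so the iterator advances by that
// many bytes through the packed stream.
#include <cstdio>
#include <cstring>
#include <vector>

enum class Type : unsigned char { LoadImmediate, Add, Return };

struct Instruction {
    Type type;
    unsigned char length; // total size of this instruction in bytes
};

struct InstructionStreamIterator {
    unsigned char const* data;
    size_t offset { 0 };
    size_t size;

    bool at_end() const { return offset >= size; }
    Instruction const& operator*() const { return *reinterpret_cast<Instruction const*>(data + offset); }
    void operator++() { offset += operator*().length; } // skip the current instruction
};

int main()
{
    // Pack two instructions one after the other in contiguous memory.
    Instruction ops[] = { { Type::LoadImmediate, sizeof(Instruction) }, { Type::Return, sizeof(Instruction) } };
    std::vector<unsigned char> stream(sizeof(ops));
    std::memcpy(stream.data(), ops, sizeof(ops));

    for (InstructionStreamIterator it { stream.data(), 0, stream.size() }; !it.at_end(); ++it)
        std::printf("instruction type %d\n", static_cast<int>((*it).type));
}
```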
Unlike the convoluted unwind-until-scope-type mechanism in the AST
interpreter, "continue" maps to a simple Bytecode::Op::Jump here. :^)
We know where to jump based on a stack of "continuable scopes" that
we now maintain on the Bytecode::Generator as we go.
Note that this only supports bare "continue", not continue-with-label.
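Sketched with illustrative names:
```
// Entering a loop pushes the label of its continue target; a bare
// `continue` emits a Jump to whatever is currently on top of the stack.
#include <cstdio>
#include <vector>

using Label = int; // stand-in for a location in the bytecode stream

struct Generator {
    std::vector<Label> continuable_scopes;

    void begin_continuable_scope(Label target) { continuable_scopes.push_back(target); }
    void end_continuable_scope() { continuable_scopes.pop_back(); }

    void generate_continue()
    {
        // Bare `continue`: jump to the innermost continuable scope.
        std::printf("Jump @%d\n", continuable_scopes.back());
    }
};

int main()
{
    Generator generator;
    generator.begin_continuable_scope(3); // outer loop
    generator.begin_continuable_scope(7); // inner loop
    generator.generate_continue();        // Jump @7
    generator.end_continuable_scope();
    generator.end_continuable_scope();
}
```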
This introduces two new instructions: Jump and JumpIfFalse.
Jumps are made to a Bytecode::Label, which is a simple object that
represents a location in the bytecode stream.
Note that we may not always know the target of a jump when adding the
jump instruction itself, but we can simply update the instruction later
during codegen once the jump target is known.
The Bytecode::Interpreter now implements jumping via a jump slot that
gets checked after each instruction to see if a jump is pending.
If not, we just increment the PC as usual.
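A sketch of that dispatch loop (all names here are illustrative):
```
// After each instruction, either follow a pending jump or advance the PC.
#include <cstdio>
#include <functional>
#include <optional>
#include <vector>

struct Interpreter {
    std::optional<size_t> pending_jump;
    std::vector<std::function<void(Interpreter&)>> program;

    void run()
    {
        size_t pc = 0;
        while (pc < program.size()) {
            program[pc](*this); // execute one instruction
            if (pending_jump.has_value()) {
                pc = *pending_jump; // a Jump/JumpIfFalse requested a new PC
                pending_jump.reset();
            } else {
                ++pc; // no jump pending: just advance as usual
            }
        }
    }
};

int main()
{
    Interpreter interpreter;
    interpreter.program = {
        [](Interpreter&) { std::puts("instruction 0"); },
        [](Interpreter& vm) { vm.pending_jump = 3; std::puts("jump to 3"); },
        [](Interpreter&) { std::puts("never reached"); },
        [](Interpreter&) { std::puts("instruction 3"); },
    };
    interpreter.run();
}
```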
This patch begins the work of implementing JavaScript execution in a
bytecode VM instead of an AST tree-walk interpreter.
It's probably quite naive, but we have to start somewhere.
The basic idea is that you call Bytecode::Generator::generate() on an
AST node and it hands you back a Bytecode::Block filled with
instructions that can then be interpreted by a Bytecode::Interpreter.
This first version only implements two instructions: Load and Add. :^)
Each bytecode block has an unlimited number of registers, and the
interpreter resizes its register file to fit the block being executed.
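A toy end-to-end version of that flow, with simplified stand-in types:
```
// Codegen lowers an "add two literals" expression into a block of Load/Add
// instructions; the interpreter sizes its register file to the block and runs it.
#include <cstdio>
#include <vector>

struct Instruction {
    enum class Type { Load, Add } type;
    size_t dst;
    size_t src;
    int immediate;
};

struct Block {
    size_t register_count;
    std::vector<Instruction> instructions;
};

// "Generate" bytecode for the expression 1 + 2.
Block generate_add_of_literals()
{
    Block block { 2, {} };
    block.instructions.push_back({ Instruction::Type::Load, 0, 0, 1 }); // r0 = 1
    block.instructions.push_back({ Instruction::Type::Load, 1, 0, 2 }); // r1 = 2
    block.instructions.push_back({ Instruction::Type::Add, 0, 1, 0 });  // r0 = r0 + r1
    return block;
}

int run(Block const& block)
{
    std::vector<int> registers(block.register_count); // sized to fit this block
    for (auto const& instruction : block.instructions) {
        if (instruction.type == Instruction::Type::Load)
            registers[instruction.dst] = instruction.immediate;
        else
            registers[instruction.dst] += registers[instruction.src];
    }
    return registers[0];
}

int main()
{
    std::printf("result = %d\n", run(generate_add_of_literals())); // result = 3
}
```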
Two new `js` options are added in this patch as well:
`-d` will dump the generated bytecode
`-b` will execute the generated bytecode
Note that unless `-d` and/or `-b` are specified, none of the bytecode
related stuff in LibJS runs at all. This is implemented in parallel
with the existing AST interpreter. :^)