This commit implements parsing for `yield *expr`, and the multiple
ways something can or can't be parsed like that.
Also makes yield-from a TODO in the bytecode generator.
Behold, the glory of javascript syntax:
```js
// 'yield' = expression in generators.
function* foo() {
yield
*bar; // <- Syntax error here, expression can't start with *
}
// 'yield' = identifier anywhere else.
function foo() {
yield
*bar; // Perfectly fine, this is just `yield * bar`
}
```
This patch adds an "argument index" field to Identifier AST nodes.
If the Identifier refers to a function parameter in the currently
open function scope, we stash the index of the parameter here.
This will allow us to implement much faster direct access to function
argument variables.
These will be partly handled by the relevant ScopeNode due to
hoisting, same basic idea as function declarations.
VariableDeclaration needs to do some work, but let's stub it out
first and start empty.
EnterUnwindContext pushes an unwind context (exception handler and/or
finalizer) onto a stack.
LeaveUnwindContext pops the unwind context from that stack.
Upon return to the interpreter loop we check whether the VM has an
exception pending. If no unwind context is available we return from the
loop. If an exception handler is available we clear the VM's exception,
put the exception value into the accumulator register, clear the unwind
context's handler and jump to the handler. If no handler is available
but a finalizer is available we save the exception value + metadata (for
later use by ContinuePendingUnwind), clear the VM's exception, pop the
unwind context and jump to the finalizer.
ContinuePendingUnwind checks whether a saved exception is available. If
no saved exception is available it jumps to the resume label. Otherwise
it stores the exception into the VM.
The Jump after LeaveUnwindContext could be integrated into the
LeaveUnwindContext instruction. I've kept them separate for now to make
the bytecode more readable.
> try { 1; throw "x" } catch (e) { 2 } finally { 3 }; 4
1:
[ 0] EnterScope
[ 10] EnterUnwindContext handler:@4 finalizer:@3
[ 38] EnterScope
[ 48] LoadImmediate 1
[ 60] NewString 1 ("x")
[ 70] Throw
<for non-terminated blocks: insert LeaveUnwindContext + Jump @3 here>
2:
[ 0] LoadImmediate 4
3:
[ 0] EnterScope
[ 10] LoadImmediate 3
[ 28] ContinuePendingUnwind resume:@2
4:
[ 0] SetVariable 0 (e)
[ 10] EnterScope
[ 20] LoadImmediate 2
[ 38] LeaveUnwindContext
[ 3c] Jump @3
String Table:
0: e
1: x
Added Increment and Decrement bytecode ops to support this. Postfix
updates use a temporary register to preserve the original value.
Note that this patch only implements Identifier updates. Member
expression updates are a TODO.
This commit introduces the concept of an accumulator register to
LibJS's bytecode interpreter. The accumulator register is always
register 0, and most simple instructions use it for reading and
writing.
Not only does this slim down the AST, but it also simplifies a lot of
the code. For example, the generate_bytecode methods no longer need
to return an Optional<Register>, as any opcode which has a "return"
value will always put it into the accumulator.
This also renames the old Op::Load to Op::LoadImmediate, and uses
Op::Load to load from a register into the accumulator. There is
also an Op::Store to put the value in the accumulator into another
register.
Unlike the convoluted unwind-until-scope-type mechanism in the AST
interpreter, "continue" maps to a simple Bytecode::Op::Jump here. :^)
We know where to jump based on a stack of "continuable scopes" that
we now maintain on the Bytecode::Generator as we go.
Note that this only supports bare "continue", not continue-with-label.
This also required making Bytecode::Op::Jump support lazy linking
to a target label.
I left a FIXME here about having the "if" statement return the result
value from the taken branch statement. That's what the AST interpreter
does but I'm not sure if it's actually required.
If there's a current Bytecode::Interpreter in action, ScriptFunction
will now compile itself into bytecode and execute in that context.
This patch also adds the Return bytecode instruction so that we can
actually return values from called functions. :^)
Return values are propagated from callee to caller via the caller's
$0 register. Bytecode::Interpreter now keeps a stack of register
"windows". These are not very efficient, but it should be pretty
straightforward to convert them to e.g a sliding register window
architecture later on.
This is pretty dang cool! :^)
This patch adds the Call bytecode instruction which is emitted for the
CallExpression AST node.
It's pretty barebones and doesn't handle 'this' values properly, etc.
But it can perform basic function calls! :^)
Note that the called function will *not* execute as bytecode, but will
simply fall back into the old codepath and use the AST interpreter.
This was quite straightforward using the same label/jump machinery that
we added for while statements.
The main addition here is a new JumpIfTrue bytecode instruction.
This introduces two new instructions: Jump and JumpIfFalse.
Jumps are made to a Bytecode::Label, which is a simple object that
represents a location in the bytecode stream.
Note that you may not always know the target of a jump when adding the
jump instruction itself, but we can just update the instruction later
on during codegen once we know where the jump target is.
The Bytecode::Interpreter now implements jumping via a jump slot that
gets checked after each instruction to see if a jump is pending.
If not, we just increment the PC as usual.
- NewString (allocates a new PrimitiveString from the GC heap)
- GetVariable (retrieves a variable in the current scope)
- SetVariable (assigns a variable in the current scope)
This patch begins the work of implementing JavaScript execution in a
bytecode VM instead of an AST tree-walk interpreter.
It's probably quite naive, but we have to start somewhere.
The basic idea is that you call Bytecode::Generator::generate() on an
AST node and it hands you back a Bytecode::Block filled with
instructions that can then be interpreted by a Bytecode::Interpreter.
This first version only implements two instructions: Load and Add. :^)
Each bytecode block has infinity registers, and the interpreter resizes
its register file to fit the block being executed.
Two new `js` options are added in this patch as well:
`-d` will dump the generated bytecode
`-b` will execute the generated bytecode
Note that unless `-d` and/or `-b` are specified, none of the bytecode
related stuff in LibJS runs at all. This is implemented in parallel
with the existing AST interpreter. :^)
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
The auto naming of function expressions is a purely syntactic
decision, so shouldn't be decided based on the dynamic type of
an assignment. This moves the decision making into the parser.
One icky hack is that we add a field to FunctionExpression to
indicate whether we can autoname. The real solution is to actually
generate a CompoundExpression node so that the parser can make
the correct decision, however this would have a potentially
significant run time cost.
This does not correct the behaviour for class expressions.
Patch from Anonymous.
We now store 32-bit integers as 32-bit integers directly which avoids
having to convert them from doubles when they're only used as 32-bit
integers anyway. :^)
This patch feels a bit incomplete and there's a lot of opportunities
to take advantage of this information. We'll have to find and exploit
them eventually.
For various statements the spec states:
Return NormalCompletion(empty).
In those cases we have been returning undefined so far, which is
incorrect.
In other cases it states:
Return Completion(UpdateEmpty(stmtCompletion, undefined)).
Which essentially means a statement is evaluated and its completion
value returned if non-empty, and undefined otherwise.
While not actually noticeable in normal scripts as the VM's "last value"
can't be accessed from JS code directly (with the exception of eval(),
see below), it provided an inconsistent experience in the REPL:
> if (true) 42;
42
> if (true) { 42; }
undefined
This also fixes the case where eval() would return undefined if the last
executed statement is not a value-producing one:
eval("1;;;;;")
eval("1;{}")
eval("1;var a;")
As a consequence of the changes outlined above, these now all correctly
return 1.
See https://tc39.es/ecma262/#sec-block-runtime-semantics-evaluation,
"NOTE 2".
Fixes#3609.
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)
Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.
We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.