This patch implements:
- Spec compliant [[Call]] and [[Construct]] internal slots, as virtual
FunctionObject::internal_{call,construct}(). These effectively replace
the old virtual FunctionObject::{call,construct}(), but with several
advantages:
- Clear and consistent naming, following the object internal methods
- Use of completions
- internal_construct() returns an Object, and not Value! This has been
a source of confusion for a long time, since in the spec there's
always an Object returned but the Value return type in LibJS meant
that this could not be fully trusted and something could screw you
over.
- Arguments are passed explicitly in form of a MarkedValueList,
allowing manipulation (BoundFunction). We still put them on the
execution context as a lot of code depends on it (VM::arguments()),
but not from the Call() / Construct() AOs anymore, which now allows
for bypassing them and invoking [[Call]] / [[Construct]] directly.
Nothing but Call() / Construct() themselves do that at the moment,
but future additions to ECMA262 or already existing web specs might.
- Spec compliant, standalone Call() and Construct() AOs: currently the
closest we have is VM::{call,construct}(), but those try to cater to
all the different function object subclasses at once, resulting in a
horrible mess and calling AOs with functions they should never be
called with; most prominently PrepareForOrdinaryCall and
OrdinaryCallBindThis, which are only for ECMAScriptFunctionObject.
As a result this also contains an implicit optimization: we no longer
need to create a new function environment for NativeFunctions - which,
worth mentioning, is what started this whole crusade in the first place
:^)
Most switch statements don't have any lexically scoped declarations,
so let's avoid allocating an environment in the common case where we
don't have to.
This commit partially reverts "LibJS: Make accessing the current
function's arguments cheaper".
While the change passed all the currently passing test262 tests, it
seems to have _some_ flaw that silently breaks with some real-world
websites.
As the speedup with negligible at best, let's just revert it until we
can implement it more correctly.
We now propagate this flag to FunctionDeclaration, and then also into
ECMAScriptFunctionObject.
This will be used to disable optimizations that aren't safe in the
presence of direct eval().
Instead of going through an environment record, make arguments of the
currently executing function generate references via the argument index,
which can later be resolved directly through the ExecutionContext.
This patch introduces the "environment coordinate" concept, which
encodes the distance from a variable access to the binding it ends up
resolving to.
EnvironmentCoordinate has two fields:
- hops: The number of hops up the lexical environment chain we have
to make before getting to the resolved binding.
- index: The index of the resolved binding within its declarative
environment record.
Whenever a variable lookup resolves somewhere inside a declarative
environment, we now cache the coordinates and reuse them in subsequent
lookups. This is achieved via a coordinate cache in JS::Identifier.
Note that non-strict direct eval() breaks this optimization and so it
will not be performed if the resolved environment has been permanently
screwed by eval().
This makes variable access *significantly* faster. :^)
The idea here is simple: If the block statement doesn't contain any
lexical declarations, we don't need to allocate, initialize and
eventually garbage collect a new declarative environment.
This even makes lookups across nested blocks slightly faster as we don't
have to traverse a chain of empty environments anymore - instead, the
execution context just stores the outermost non-empty one.
This doesn't speed up test-js considerably, but has a noticeable effect
on test262 and real-world web content :^)
This gives FunctionNode a "might need arguments object" boolean flag and
sets it based on the simplest possible heuristic for this: if we
encounter an identifier called "arguments" or "eval" up to the next
(nested) function declaration or expression, we won't need an arguments
object. Otherwise, we *might* need one - the final decision is made in
the FunctionDeclarationInstantiation AO.
Now, this is obviously not perfect. Even if you avoid eval, something
like `foo.arguments` will still trigger a false positive - but it's a
start and already massively cuts down on needlessly allocated objects,
especially in real-world code that is often minified, and so a full
"arguments" identifier will be an actual arguments object more often
than not.
To illustrate the actual impact of this change, here's the number of
allocated arguments objects during a full test-js run:
Before:
- Unmapped arguments objects: 78765
- Mapped arguments objects: 2455
After:
- Unmapped arguments objects: 18
- Mapped arguments objects: 37
This results in a ~5% speedup of test-js on my Linux host machine, and
about 3.5% on i686 Serenity in QEMU (warm runs, average of 5).
The following microbenchmark (calling an empty function 1M times) runs
25% faster on Linux and 45% on Serenity:
function foo() {}
for (var i = 0; i < 1_000_000; ++i)
foo();
test262 reports no changes in either direction, apart from a speedup :^)
Before this we used an ad-hoc combination of references and 'variables'
stored in a hashmap. This worked in most cases but is not spec like.
Additionally hoisting, dynamically naming functions and scope analysis
was not done properly.
This patch fixes all of that by:
- Implement BindingInitialization for destructuring assignment.
- Implementing a new ScopePusher which tracks the lexical and var
scoped declarations. This hoists functions to the top level if no
lexical declaration name overlaps. Furthermore we do checking of
redeclarations in the ScopePusher now requiring less checks all over
the place.
- Add methods for parsing the directives and statement lists instead
of having that code duplicated in multiple places. This allows
declarations to pushed to the appropriate scope more easily.
- Remove the non spec way of storing 'variables' in
DeclarativeEnvironment and make Reference follow the spec instead of
checking both the bindings and 'variables'.
- Remove all scoping related things from the Interpreter. And instead
use environments as specified by the spec. This also includes fixing
that NativeFunctions did not produce a valid FunctionEnvironment
which could cause issues with callbacks and eval. All
FunctionObjects now have a valid NewFunctionEnvironment
implementation.
- Remove execute_statements from Interpreter and instead use
ASTNode::execute everywhere this simplifies AST.cpp as you no longer
need to worry about which method to call.
- Make ScopeNodes setup their own environment. This uses four
different methods specified by the spec
{Block, Function, Eval, Global}DeclarationInstantiation with the
annexB extensions.
- Implement and use NamedEvaluation where specified.
Additionally there are fixes to things exposed by these changes to eval,
{for, for-in, for-of} loops and assignment.
Finally it also fixes some tests in test-js which where passing before
but not now that we have correct behavior :^).
Since there are only a number of statements where labels can actually be
used we now also only store labels when necessary.
Also now tracks the first continue usage of a label since this might not
be valid but that can only be determined after we have parsed the
statement.
Also ensures the correct error does not get wiped by load_state.
When we encounter a default clause in a switch statement, we should not
execute it immediately, instead we need to wait until all case clauses
have been executed as a matching case clause can break from the
switch/case.
The code is nowhere close to the spec, so instead of fixing it properly
I just made it slightly worse, but correct. Needs a complete refactor at
some point.
At a later point this will indicate whether some FunctionObject "has a
[[Construct]] internal method" (separate from the current FunctionObject
call() / construct()), to help with a more spec-compliant implementation
of [[Call]] and [[Construct]].
This means that it is no longer relevant to just NativeFunction.
The old name is the result of the perhaps somewhat confusingly named
abstract operation OrdinaryFunctionCreate(), which creates an "ordinary
object" (https://tc39.es/ecma262/#ordinary-object) in contrast to an
"exotic object" (https://tc39.es/ecma262/#exotic-object).
However, the term "Ordinary Function" is not used anywhere in the spec,
instead the created object is referred to as an "ECMAScript Function
Object" (https://tc39.es/ecma262/#sec-ecmascript-function-objects), so
let's call it that.
The "ordinary" vs. "exotic" distinction is important because there are
also "Built-in Function Objects", which can be either implemented as
ordinary ECMAScript function objects, or as exotic objects (our
NativeFunction).
More work needs to be done to move a lot of infrastructure to
ECMAScriptFunctionObject in order to make FunctionObject nothing more
than an interface for objects that implement [[Call]] and optionally
[[Construct]].
In the spec, this happens in the EvaluateCall abstract operation
(https://tc39.es/ecma262/#sec-evaluatecall), and the order is defined
as:
3. Let argList be ? ArgumentListEvaluation of arguments.
4. If Type(func) is not Object, throw a TypeError exception.
5. If IsCallable(func) is false, throw a TypeError exception.
In LibJS this is handled by CallExpression::execute(), which had the
callee function check first and would therefore never evaluate the
arguments for a non-function callee.