The motivation for this change is twofold:
- Returning a JS::Value is misleading as one would expect it to carry
some meaningful information, like maybe the error object that's being
created, but in fact it is always empty. Supposedly to serve as a
shortcut for the common case of "throw and return empty value", but
that's just leading us to my second point.
- Inconsistent usage / coding style: as of this commit there are 114
uses of throw_exception() discarding its return value and 55 uses
directly returning the call result (in LibJS, not counting LibWeb);
with the first style often having a more explicit empty value (or
nullptr in some cases) return anyway.
One more line to always make the return value obvious is should be
worth it.
So now it's basically always these steps, which is already being used in
the majority of cases (as outlined above):
- Throw an exception. This mutates interpreter state by updating
m_exception and unwinding, but doesn't return anything.
- Let the caller explicitly return an empty value, nullptr or anything
else itself.
Finally use Symbol.iterator protocol in language features :) currently
only used in for-of loops and spread expressions, but will have more
uses later (Maps, Sets, Array.from, etc).
literal methods; add EnvrionmentRecord fields and methods to
LexicalEnvironment
Adding EnvrionmentRecord's fields and methods lets us throw an exception
when |this| is not initialized, which occurs when the super constructor
in a derived class has not yet been called, or when |this| has already
been initialized (the super constructor was already called).
This is a helper function based on the getter/setter definition logic from
ObjectExpression::execute() to look up an Accessor property if it already
exists, define a new Accessor property if it doesn't exist, and set the getter or
setter function on the Accessor.
This adds regex parsing/lexing, as well as a relatively empty
RegExpObject. The purpose of this patch is to allow the engine to not
get hung up on parsing regexes. This will aid in finding new syntax
errors (say, from google or twitter) without having to replace all of
their regexes first!
Includes all traps except the following: [[Call]], [[Construct]],
[[OwnPropertyKeys]].
An important implication of this commit is that any call to any virtual
Object method has the potential to throw an exception. These methods
were not checked in this commit -- a future commit will have to protect
these various method calls throughout the codebase.
When calling Object.defineProperty, there is now a difference between
omitting a descriptor attribute and specifying that it is false. For
example, "{}" and "{ configurable: false }" will have different
attribute values.
This patch adds function declaration hoisting. The mechanism
is similar to var hoisting. Hoisted function declarations are to be put
before the hoisted var declarations, hence they have to be treated
separately.
This patch adds an IndexedProperties object for storing indexed
properties within an Object. This accomplishes two goals: indexed
properties now have an associated descriptor, and objects now gracefully
handle sparse properties.
The IndexedProperties class is a wrapper around two other classes, one
for simple indexed properties storage, and one for general indexed
property storage. Simple indexed property storage is the common-case,
and is simply a vector of properties which all have attributes of
default_attributes (writable, enumerable, and configurable).
General indexed property storage is for a collection of indexed
properties where EITHER one or more properties have attributes other
than default_attributes OR there is a property with a large index (in
particular, large is '200' or higher).
Indexed properties are now treated relatively the same as storage within
the various Object methods. Additionally, there is a custom iterator
class for IndexedProperties which makes iteration easy. The iterator
skips empty values by default, but can be configured otherwise.
Likewise, it evaluates getters by default, but can be set not to.
Previously, the Object class had many different types of functions for
each action. For example: get_by_index, get(PropertyName),
get(FlyString). This is a bit verbose, so these methods have been
shortened to simply use the PropertyName structure. The methods then
internally call _by_index if necessary. Note that the _by_index
have been made private to enforce this change.
Secondly, a clear distinction has been made between "putting" and
"defining" an object property. "Putting" should mean modifying a
(potentially) already existing property. This is akin to doing "a.b =
'foo'".
This implies two things about put operations:
- They will search the prototype chain for setters and call them, if
necessary.
- If no property exists with a particular key, the put operation
should create a new property with the default attributes
(configurable, writable, and enumerable).
In contrast, "defining" a property should completely overwrite any
existing value without calling setters (if that property is
configurable, of course).
Thus, all of the many JS objects have had any "put" calls changed to
"define_property" calls. Additionally, "put_native_function" and
"put_native_property" have had their "put" replaced with "define".
Finally, "put_own_property" has been made private, as all necessary
functionality should be exposed with the put and define_property
methods.
This changes Accessor's m_{getter,setter} from Value to Function* which
seems like a better API to me - a getter/setter must either be a
function or missing, and the creation of an accessor with other values
must be prevented by the parser and Object.defineProperty() anyway.
Also add Accessor::set_{getter,setter}() so we can reuse an already
created accessor when evaluating an ObjectExpression with getter/setter
shorthand syntax.
As these parameter-less overloads don't change the value's type and
just assume Type::Number, naming them as_i32() and as_size_t() is more
appropriate.
This patch is unfortunately rather large and might make some things feel
bloated, but it is necessary to fix a few flaws in LibJS, primarily
blindly coercing values to numbers without exception checks - i.e.
interpreter.argument(0).to_i32(); // can fail!!!
Some examples where the interpreter would actually crash:
var o = { toString: () => { throw Error() } };
+o;
o - 1;
"foo".charAt(o);
"bar".repeat(o);
To fix this, we now have the following...
to_double(Interpreter&)
to_i32()
to_i32(Interpreter&)
to_size_t()
to_size_t(Interpreter&)
...and a whole lot of exception checking.
There's intentionally no to_double(), use as_double() directly instead.
This way we still can use these convenient utility functions but don't
need to check for exceptions if we are sure the value already is a
number.
Fixes#2267.
Passing a Heap& to it only to then call interpreter() on that is weird.
Let's just give it the Interpreter& directly, like some of the other
to_something() functions.
This commit adds the following classes: SymbolObject, SymbolConstructor,
SymbolPrototype, and Symbol. This commit does not introduce any
new functionality to the Object class, so they cannot be used as
property keys in objects.
There are now two API's on Value:
- Value::to_string(Interpreter&) -- may throw.
- Value::to_string_without_side_effects() -- will never throw.
These are some pretty big sweeping changes, so it's possible that I did
some part the wrong way. We'll work it out as we go. :^)
Fixes#2123.
The ECMAScript spec defines multiple equality operations which are used
all over the spec; this patch introduces them. Of course, the two
primary equality operations are AbtractEquals ('==') and StrictEquals
('==='), which have been renamed to 'abstract_eq' and 'strict_eq' in
this patch.
In support of the two operations mentioned above, the following have
also been added: SameValue, SameValueZero, and SameValueNonNumeric.
These are important to have, because they are used elsewhere in the spec
aside from the two primary equality comparisons.
This required 2 changes:
1. In the parser, create a new variable scope, so the variable is
declared in it instead of the scope in which the 'for' is found.
2. On execute, push the variable into the newly created block. Existing
code created an empty block (no variables, no arguments) which
allows Interpreter::enter_scope() to skip the creation of a new
environment, therefore when the variable initializer is executed, it
sets the variable to the outer scope. By attaching the variable to
the new block, the block gets a new environment.
This is only needed for 'let' / 'const' declarations, since 'var'
declarations are expected to leak.
Fixes: #2103
"[Function.length is] the number of formal parameters. This number
excludes the rest parameter and only includes parameters before
the first one with a default value." - MDN
To make processing tagged template literals easier, template literals
will now add one empty StringLiteral before and after each template
expression *if* there's no other string - e.g.:
`${foo}` -> "", foo, ""
`test${foo}${bar}test` -> "test", foo, "", bar, "test"
This also matches the behaviour of many other parsers.
This now matches the output of
Program
(Variables)
...
(Children)
...
or
FunctionDeclaration 'foo'
(Parameters)
...
(Body)
...
etc.
Also don't print each consequent statement index, it doesn't add any
value.
Adds fully functioning template literals. Because template literals
contain expressions, most of the work has to be done in the Lexer rather
than the Parser. And because of the complexity of template literals
(expressions, nesting, escapes, etc), the Lexer needs to have some
template-related state.
When entering a new template literal, a TemplateLiteralStart token is
emitted. When inside a literal, all text will be parsed up until a '${'
or '`' (or EOF, but that's a syntax error) is seen, and then a
TemplateLiteralExprStart token is emitted. At this point, the Lexer
proceeds as normal, however it keeps track of the number of opening
and closing curly braces it has seen in order to determine the close
of the expression. Once it finds a matching curly brace for the '${',
a TemplateLiteralExprEnd token is emitted and the state is updated
accordingly.
When the Lexer is inside of a template literal, but not an expression,
and sees a '`', this must be the closing grave: a TemplateLiteralEnd
token is emitted.
The state required to correctly parse template strings consists of a
vector (for nesting) of two pieces of information: whether or not we
are in a template expression (as opposed to a template string); and
the count of the number of unmatched open curly braces we have seen
(only applicable if the Lexer is currently in a template expression).
TODO: Add support for template literal newlines in the JS REPL (this will
cause a syntax error currently):
> `foo
> bar`
'foo
bar'
Adds the ability for function arguments to have default values. This
works for standard functions as well as arrow functions. Default values
are not printed in a <function>.toString() call, as nodes cannot print
their source string representation.
This commit introduces a way to get an object's own properties in the
correct order. The "correct order" for JS object properties is first all
array-like index properties (numeric keys) sorted by insertion order,
followed by all string properties sorted by insertion order.
Objects also now print correctly in the repl! Before this commit:
courage ~/js-tests $ js
> ({ foo: 1, bar: 2, baz: 3 })
{ bar: 2, foo: 1, baz: 3 }
After:
courage ~/js-tests $ js
> ({ foo: 1, bar: 2, baz: 3 })
{ foo: 1, bar: 2, baz: 3 }
This patch teaches UpdateExpression how to use a Reference. Some other
changes were necessary to keep tests working:
A Reference can now also refer to a local or global variable. This is
not fully aligned with the spec since we don't have a Record concept.
Expression nodes can now be asked to produce a Reference. We then use
this to implement the "delete" operator without downcasting the child
node to a MemberExpression manually.
Implement the syntax and behavor necessary to support array literals
such as [...[1, 2, 3]]. A type error is thrown if the target of the
spread operator does not evaluate to an array (though it should
eventually just check for an iterable).
Note that the spread token's name is TripleDot, since the '...' token is
used for two features: spread and rest. Calling it anything involving
'spread' or 'rest' would be a bit confusing.
It turns out "delete" is actually a unary op :)
This patch implements deletion of object properties, it doesn't yet
work for casually deleting properties from the global object.
When deleting a property from an object, we switch that object to
having a unique shape, no longer sharing shapes with others.
Once an object has a unique shape, it no longer needs to care about
shape transitions.
JS::Value already has the empty state ({} or Value() gives you one.)
Use this instead of wrapping Value in Optional in some places.
I've also added Value::value_or(Value) so you can easily provide a
fallback value when one is not present.
- Let undefined variables throw a ReferenceError by using
Identifier::execute() rather than doing variable lookup manually and
ASSERT()ing
- Coerce value to number rather than ASSERT()ing
- Make code DRY
- Add tests
A MarkedValueList is basically a Vector<JS::Value> that registers with
the Heap and makes sure that the stored values don't get GC'd.
Before this change, we were unsafely keeping Vector<JS::Value> in some
places, which is out-of-reach for the live reference finding logic
since Vector puts its elements on the heap by default.
We now pass all the JavaScript tests even when running with "js -g",
which does a GC on every heap allocation.
Everyone who constructs an Object must now pass a prototype object when
applicable. There's still a fair amount of code that passes something
fetched from the Interpreter, but this brings us closer to being able
to detach prototypes from Interpreter eventually.
Let's start moving towards native JS objects taking their prototype as
a constructor argument.
This will eventually allow us to move prototypes off of Interpreter and
into GlobalObject.
This patch replaces the old variable lookup logic with a new one based
on lexical environments.
This brings us closer to the way JavaScript is actually specced, and
also gives us some basic support for closures.
The interpreter's call stack frames now have a pointer to the lexical
environment for that frame. Each lexical environment can have a chain
of parent environments.
Before calling a Function, we first ask it to create_environment().
This gives us a new LexicalEnvironment for that function, which has the
function's lexical parent's environment as its parent. This allows
inner functions to access variables in their outer function:
function foo() { <-- LexicalEnvironment A
var x = 1;
function() { <-- LexicalEnvironment B (parent: A)
console.log(x);
}
}
If we return the result of a function expression from a function, that
new function object will keep a reference to its parent environment,
which is how we get closures. :^)
I'm pretty sure I didn't get everything right here, but it's a pretty
good start. This is quite a bit slower than before, but also correcter!
Since declarations are now hoisted and handled on scope entry, the job
of a VariableDeclaration becomes to actually initialize variables.
As such, we can remove the part where we insert variables into the
nearest relevant scope. Less work == more speed! :^)
"var" declarations are hoisted to the nearest function scope, while
"let" and "const" are hoisted to the nearest block scope.
This is done by the parser, which keeps two scope stacks, one stack
for the current var scope and one for the current let/const scope.
When the interpreter enters a scope, we walk all of the declarations
and insert them into the variable environment.
We don't support the temporal dead zone for let/const yet.
Many other parsers call it with this name.
Also Type can be confusing in this context since the DeclarationType is
not the type (number, string, etc.) of the variables that are being
declared by the VariableDeclaration.