This patch adds six of the standard type arrays and tries to share as
much code as possible:
- Uint8Array
- Uint16Array
- Uint32Array
- Int8Array
- Int16Array
- Int32Array
This should be using the individual flag boolean properties rather than
the [[OriginalFlags]] internal slot.
Use an enumerator macro here for brevity, this will be useful for other
things as well. :^)
- Default values should depend on arguments being undefined, not being
missing
- "(?:)" for empty pattern happens in RegExp.prototype.source, not the
constructor
This makes RegExpObject compile and store a Regex<ECMA262>, adds
all flag-related properties, and implements `RegExpPrototype.test()`
(complete with 'lastIndex' support) :^)
It should be noted that this only implements `test()' using the builtin
`exec()'.
If a receiver is given, e.g. via Reflect.get/set(), forward it to the
target object's get()/put() or use it as last argument of the trap
function. The default value is the Proxy object itself.
We can't just to_string() the PropertyName, it might be a symbol.
Instead to_value() it and then use to_string_without_side_effects() as
usual.
Fixes#4062.
This prevents stack overflows when calling infinite/deep recursive
functions, e.g.:
const f = () => f(); f();
JSON.stringify({}, () => ({ foo: "bar" }));
new Proxy({}, { get: (_, __, p) => p.foo }).foo;
The VM caches a StackInfo object to not slow down function calls
considerably. VM::push_call_frame() will throw an exception if
necessary (plain Error with "RuntimeError" as its .name).
As the global object is constructed and initialized in a different way
than most other objects we were not setting its prototype! This made
things like "globalThis.toString()" fail unexpectedly.
Some things, like (the non-generic version of) Array.prototype.pop(),
check is_empty() to determine whether an action, like removing elements,
can be performed. We need to know the array-like size for that, not the
size of the underlying storage, which can be different - and is not
something IndexedProperties should expose so I removed its size().
Fixes#3948.
- We have to check if the property name is a string before calling
as_string() on it
- We can't as_number() the same property name but have to use the parsed
index number
Fixes#3950.
We can't assume that property names can be converted to strings anymore,
as we have symbols. Use name.to_value() instead.
This makes something like this possible:
new Proxy(Object, { get(t, p) { return t[p] } })[Symbol.hasInstance]
This fixes Array.prototype.{join,toString}() crashing with arrays
containing themselves, i.e. circular references.
The spec is suspiciously silent about this, and indeed engine262, a
"100% spec compliant" ECMA-262 implementation, can't handle these cases.
I had a look at some major engines instead and they all seem to keep
track or check for circular references and return an empty string for
already seen objects.
- SpiderMonkey: "AutoCycleDetector detector(cx, obj)"
- V8: "CycleProtectedArrayJoin<JSArray>(...)"
- JavaScriptCore: "StringRecursionChecker checker(globalObject, thisObject)"
- ChakraCore: "scriptContext->CheckObject(thisArg)"
To keep things simple & consistent this uses the same pattern as
JSONObject, MarkupGenerator and js: simply putting each seen object in a
HashTable<Object*>.
Fixes#3929.
This renames Object::to_primitive() to Object::ordinary_to_primitive()
for two reasons:
- No confusion with Value::to_primitive()
- To match the spec's name
Also change existing uses of Object::to_primitive() to
Value::to_primitive() when the spec uses the latter (which will still
call Object::ordinary_to_primitive()). Object::to_string() has been
removed as it's not needed anymore (and nothing the spec uses).
This makes it possible to overwrite an object's toString and valueOf and
have them provide results for anything that uses to_primitive() - e.g.:
const o = { toString: undefined, valueOf: () => 42 };
Number(o) // 42, previously NaN
["foo", o].toString(); // "foo,42", previously "foo,[object Object]"
++o // 43, previously NaN
etc.
This should not just inherit Object.prototype.toString() (and override
Object::to_string()) but be its own function, i.e.
'RegExp.prototype.toString !== Object.prototype.toString'.
When value.to_string() throws an exception it returns a null string in
which case we must not construct a valid PropertyName.
Also ASSERT in PropertyName(String) and PropertyName(FlyString) to
prevent this from happening in the future.
Fixes#3941.
Two issues:
- throw_exception() with ErrorType::InstanceOfOperatorBadPrototype would
receive rhs_prototype.to_string_without_side_effects(), which would
ASSERT_NOT_REACHED() as to_string_without_side_effects() must not be
called on an empty value. It should (and now does) receive the RHS
value instead as the message is "'prototype' property of {} is not an
object".
- Value::instance_of() was missing an exception check after calling
has_instance_method, to_boolean() on an empty value result would crash
as well.
Fixes#3930.
This adds a new MetaProperty AST node which will be used for
'new.target' and 'import.meta' meta properties. The parser now
distinguishes between "in function context" and "in arrow function
context" (which is required for this).
When encountering TokenType::New we will attempt to parse it as meta
property and resort to regular new expression parsing if that fails,
much like the parsing of labelled statements.
By having the "is this a use strict directive?" logic in
parse_string_literal() we would apply it to *any* string literal, which
is incorrect and would lead to false positives - e.g.:
"use strict" + 1
`"use strict"`
"\123"; ({"use strict": ...})
Relevant part from the spec which is now implemented properly:
[...] and where each ExpressionStatement in the sequence consists
entirely of a StringLiteral token [...]
I also got rid of UseStrictDirectiveState which is not needed anymore.
Fixes#3903.
https://tc39.es/ecma262/#sec-functiondeclarations-in-ifstatement-statement-clauses
B.3.4 FunctionDeclarations in IfStatement Statement Clauses
The following augments the IfStatement production in 13.6:
IfStatement[Yield, Await, Return] :
if ( Expression[+In, ?Yield, ?Await] ) FunctionDeclaration[?Yield, ?Await, ~Default] else Statement[?Yield, ?Await, ?Return]
if ( Expression[+In, ?Yield, ?Await] ) Statement[?Yield, ?Await, ?Return] else FunctionDeclaration[?Yield, ?Await, ~Default]
if ( Expression[+In, ?Yield, ?Await] ) FunctionDeclaration[?Yield, ?Await, ~Default] else FunctionDeclaration[?Yield, ?Await, ~Default]
if ( Expression[+In, ?Yield, ?Await] ) FunctionDeclaration[?Yield, ?Await, ~Default]
This production only applies when parsing non-strict code. Code matching
this production is processed as if each matching occurrence of
FunctionDeclaration[?Yield, ?Await, ~Default] was the sole
StatementListItem of a BlockStatement occupying that position in the
source code. The semantics of such a synthetic BlockStatement includes
the web legacy compatibility semantics specified in B.3.3.
B.1.3 HTML-like Comments
The syntax and semantics of 11.4 is extended as follows except that this
extension is not allowed when parsing source code using the goal symbol
Module:
Syntax (only relevant part included)
SingleLineHTMLCloseComment ::
LineTerminatorSequence HTMLCloseComment
HTMLCloseComment ::
WhiteSpaceSequence[opt] SingleLineDelimitedCommentSequence[opt] --> SingleLineCommentChars[opt]
Fixes#3810.
ES 5(.1) described parsing of the function body string as:
https://www.ecma-international.org/ecma-262/5.1/#sec-15.3.2.1
7. If P is not parsable as a FormalParameterList[opt] then throw a SyntaxError exception.
8. If body is not parsable as FunctionBody then throw a SyntaxError exception.
We implemented it as building the source string of a complete function
and feeding that to the parser, with the same outcome. ES 2015+ does
exactly that, but with newlines at certain positions:
https://tc39.es/ecma262/#sec-createdynamicfunction
16. Let bodyString be the string-concatenation of 0x000A (LINE FEED), ? ToString(bodyArg), and 0x000A (LINE FEED).
17. Let prefix be the prefix associated with kind in Table 49.
18. Let sourceString be the string-concatenation of prefix, " anonymous(", P, 0x000A (LINE FEED), ") {", bodyString, and "}".
This patch updates the generated source string to match these
requirements. This will make certain edge cases work, e.g.
'new Function("-->")', where the user supplied input must be placed on
its own line to be valid syntax.
This is, and I can't stress this enough, a lot better than all the
manual bounds checking and indexing that was going on before.
Also fixes a small bug where "\u{}" wouldn't get rejected as invalid
unicode escape sequence.
We only use expect(...).toEval() / not.toEval() for checking syntax
errors, where we obviously can't put the code in a regular function. For
runtime errors we do exactly that, so toEval() should not fail - this
allows us to use undefined identifiers in syntax tests.
Newlines after line continuation were inserted into the string
literals. This patch makes the parser ignore the newlines after \ and
also makes it so that "use strict" containing a line continuation is
not a valid "use strict".
- A regular function can have duplicate parameters except in strict mode
or if its parameter list is not "simple" (has a default or rest
parameter)
- An arrow function can never have duplicate parameters
Compared to other engines I opted for more useful syntax error messages
than a generic "duplicate parameter name not allowed in this context":
"use strict"; function test(foo, foo) {}
^
Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in strict mode (line: 1, column: 34)
function test(foo, foo = 1) {}
^
Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in function with default parameter (line: 1, column: 20)
function test(foo, ...foo) {}
^
Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in function with rest parameter (line: 1, column: 23)
(foo, foo) => {}
^
Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in arrow function (line: 1, column: 7)
https://tc39.es/ecma262/#sec-directive-prologues-and-the-use-strict-directive
A Use Strict Directive is an ExpressionStatement in a Directive Prologue
whose StringLiteral is either of the exact code point sequences
"use strict" or 'use strict'. A Use Strict Directive may not contain an
EscapeSequence or LineContinuation.
https://tc39.es/ecma262/#sec-additional-syntax-string-literals
The syntax and semantics of 11.8.4 is extended as follows except that
this extension is not allowed for strict mode code:
Syntax
EscapeSequence::
CharacterEscapeSequence
LegacyOctalEscapeSequence
NonOctalDecimalEscapeSequence
HexEscapeSequence
UnicodeEscapeSequence
LegacyOctalEscapeSequence::
OctalDigit [lookahead ∉ OctalDigit]
ZeroToThree OctalDigit [lookahead ∉ OctalDigit]
FourToSeven OctalDigit
ZeroToThree OctalDigit OctalDigit
ZeroToThree :: one of
0 1 2 3
FourToSeven :: one of
4 5 6 7
NonOctalDecimalEscapeSequence :: one of
8 9
This definition of EscapeSequence is not used in strict mode or when
parsing TemplateCharacter.
Note
It is possible for string literals to precede a Use Strict Directive
that places the enclosing code in strict mode, and implementations must
take care to not use this extended definition of EscapeSequence with
such literals. For example, attempting to parse the following source
text must fail:
function invalid() { "\7"; "use strict"; }
This was a regression introduced by 9ffe45b - a TryStatement without
'catch' clause *is* allowed, if it has a 'finally' clause. It is now
checked properly that at least one of both is present.
This separates matching/parsing of statements and declarations and
fixes a few edge cases where the parser would incorrectly accept a
declaration where only a statement is allowed - for example:
if (foo) const a = 1;
for (var bar;;) function b() {}
while (baz) class c {}
From the spec: https://tc39.es/ecma262/#sec-literals-numeric-literals
The SourceCharacter immediately following a NumericLiteral must not be
an IdentifierStart or DecimalDigit.
For example: 3in is an error and not the two input elements 3 and in.
Otherwise we crash the interpreter when an exception is thrown during
evaluation of the while or do/while test expression - which is easily
caused by a ReferenceError - e.g.:
while (someUndefinedVariable) {
// ...
}
If there's a newline between the closing paren and arrow it's not a
valid arrow function, ASI should kick in instead (it'll then fail with
"Unexpected token Arrow")
This simplifies try_parse_arrow_function_expression() and fixes a few
cases that should not produce an arrow function AST but did:
(a,,) => {}
(a b) => {}
(a ...b) => {}
(...b a) => {}
The new parsing logic checks whether parens are expected and uses
parse_function_parameters() if so, rolling back if a new syntax error
occurs during that. Otherwise it's just an identifier in which case we
parse the single parameter ourselves.
When changing the attributes of an existing property of an object with
unique shape we must not change the PropertyMetadata offset.
Doing so without resizing the underlying storage vector caused an OOB
write crash.
Fixes#3735.
Previously, when a loop detected an unwind of type ScopeType::Function
(which means a return statement was executed inside of the loop), it
would just return undefined. This set the VM's last_value to undefined,
when it should have been the returned value. This patch makes all loop
statements return the appropriate value in the above case.
'continue' is no longer allowed outside of a loop, and an unlabeled
'break' is not longer allowed outside of a loop or switch statement.
Labeled 'break' statements are still allowed everywhere, even if the
label does not exist.
The check for invalid lhs and assignment to eval/arguments in strict
mode should happen for all kinds of assignment expressions, not just
AssignmentOp::Assignment.
Since blocks can't be strict by themselves, it makes no sense for them
to store whether or not they are strict. Strict-ness is now stored in
the Program and FunctionNode ASTNodes. Fixes issue #3641
In the case of an exception in a property getter function we would not
return early, and a subsequent attempt to call the replacer function
would crash the interpreter due to call_internal() asserting.
Fixes#3548.
This fixes two cases obj[expr] and obj[expr]() (MemberExpression and
CallExpression respectively) when expr throws an exception and results
in an empty value, causing a crash by passing the invalid PropertyName
created by computed_property_name() to Object::get() without checking it
first.
Fixes#3459.
This fixes two issues with running a TryStatement finalizer:
- Temporarily store and clear the exception, if any, so we can run the
finalizer block statement without it getting in our way, which could
have unexpected side effects otherwise (and will likely return early
somewhere).
- Stop unwinding so more than one child node of the finalizer
BlockStatement is executed if an exception has been thrown previously
(which would have called unwind(ScopeType::Try)). Re-throwing as
described above ensures we still unwind after the finalizer, if
necessary.
Also add some tests specifically for try/catch/finally blocks, we
didn't have any!
Interpreter::run() was so far being used both as the "public API entry
point" for running a JS::Program as well as internally to execute
JS::Statement|s of all kinds - this is now more distinctly separated.
A program as returned by the parser is still going through run(), which
is responsible for creating the initial global call frame, but all other
statements are executed via execute_statement() directly.
Fixes#3437, a regression introduced by adding ASSERT(!exception()) to
run() without considering the effects that would have on internal usage.
Test files created with:
$ for f in Libraries/LibJS/Tests/builtins/Date/Date.prototype.get*js; do
cp $f $(echo $f | sed -e 's/get/getUTC/') ;
done
$ rm Libraries/LibJS/Tests/builtins/Date/Date.prototype.getUTCTime.js
$ git add Libraries/LibJS/Tests/builtins/Date/Date.prototype.getUTC*.js
$ ls Libraries/LibJS/Tests/builtins/Date/Date.prototype.getUTC*.js | \
xargs sed -i -e 's/get/getUTC/g'
Year computation has to be based on seconds, not days, in case
t is < 0 but t / __seconds_per_day is 0.
Year computation also has to consider negative timestamps.
With this, days is always positive and <= the number of days in the
year, so base the tm_wday computation directly on the timestamp,
and do it first, before t is modified in the year computation.
In C, % can return a negative number if the left operand is negative,
compensate for that.
Tested via test-js. (Except for tm_wday, since we don't implement
Date.prototype.getUTCDate() yet.)
LibJS doesn't store stacks for exception objects, so this
only amends test-common.js's __expect() with an optional
`details` function that can produce a more detailed error
message, and it lets test-js.cpp read and print that
error message. I added the optional details parameter to
a few matchers, most notably toBe() where it now prints
expected and actual value.
It'd be nice to have line numbers of failures, but that
seems hard to do with the current design, and this is already
much better than the current state.
The constructor with a string argument isn't implemented yet, but
this implements the other variants.
The timestamp constructor doens't handle negative timestamps correctly.
Out-of-bound and invalid arguments aren't handled correctly.
The optional 2nd and 3rd arguments are not yet implemented.
This assumes that `this` is the Array constructor and doesn't yet
implement the more general behavior in the ES6 spec that allows
transferring this method to other constructors.
With this, typing `"\xff"` into Browser's console no longer
makes the app crash.
While here, also make the \u handler call append_codepoint()
instead of calling an overload where it's not immediately clear
which overload is getting called. This has no behavior change.