This changes the remaining uses of the following functions across LibJS:
- String::format() => String::formatted()
- dbg() => dbgln()
- printf() => out(), outln()
- fprintf() => warnln()
I also removed the relevant 'LogStream& operator<<' overloads as they're
not needed anymore.
This adds a new MetaProperty AST node which will be used for
'new.target' and 'import.meta' meta properties. The parser now
distinguishes between "in function context" and "in arrow function
context" (which is required for this).
When encountering TokenType::New we will attempt to parse it as meta
property and resort to regular new expression parsing if that fails,
much like the parsing of labelled statements.
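The try-then-fall-back idea can be sketched roughly as below. This is
not the actual LibJS code: MiniParser, its string-based token list and
both function names are made up for illustration, and saving/restoring
parser state is reduced to remembering a single index.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, heavily simplified parser: tokens are plain strings.
struct MiniParser {
    std::vector<std::string> tokens;
    std::size_t index { 0 };

    bool match(std::string const& text) const
    {
        return index < tokens.size() && tokens[index] == text;
    }

    // Returns "new.target" if the next three tokens form that meta property;
    // otherwise restores the previous position and returns nothing.
    std::optional<std::string> try_parse_new_target()
    {
        auto saved_index = index;
        if (match("new")) {
            ++index;
            if (match(".")) {
                ++index;
                if (match("target")) {
                    ++index;
                    return "new.target";
                }
            }
        }
        index = saved_index; // roll back so the caller can try something else
        return std::nullopt;
    }

    // Must only be called while the current token is 'new'.
    std::string parse_new()
    {
        assert(match("new"));
        if (auto meta_property = try_parse_new_target())
            return *meta_property;
        ++index; // consume 'new'; the rest would be a regular new expression
        return "new expression";
    }
};
```

The important bit is that a failed attempt leaves the token position
untouched, so regular new expression parsing starts from the same token.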
This is a bit nicer for two reasons:
- The absence of line number/column information isn't based on 'values
are zero' anymore but on whether the Optional has a value
- When reporting syntax errors with position information other than the
current token's position we had to store line and column ourselves,
like this:
auto foo_start_line = m_parser_state.m_current_token.line_number();
auto foo_start_column = m_parser_state.m_current_token.line_column();
...
syntax_error("...", foo_start_line, foo_start_column);
Which now becomes:
auto foo_start = position();
...
syntax_error("...", foo_start);
This makes it easier to report correct positions for syntax errors
that only emerge a few tokens later :^)
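A minimal sketch of the "optional position" idea, using std::optional
as a stand-in for AK's Optional. Position and ParserError here are
hypothetical simplifications, not the real LibJS types.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical stand-in for a source position.
struct Position {
    std::size_t line { 0 };
    std::size_t column { 0 };
};

// Hypothetical error type: an empty optional means "no position known",
// so line 0 / column 0 no longer has to double as a sentinel.
struct ParserError {
    std::string message;
    std::optional<Position> position;

    std::string to_string() const
    {
        if (!position.has_value())
            return message;
        return message + " (line: " + std::to_string(position->line)
            + ", column: " + std::to_string(position->column) + ")";
    }
};
```

An error constructed with std::nullopt prints just the message, one
constructed with Position { 1, 5 } appends "(line: 1, column: 5)".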
By having the "is this a use strict directive?" logic in
parse_string_literal() we would apply it to *any* string literal, which
is incorrect and would lead to false positives - e.g.:
"use strict" + 1
`"use strict"`
"\123"; ({"use strict": ...})
Relevant part from the spec which is now implemented properly:
[...] and where each ExpressionStatement in the sequence consists
entirely of a StringLiteral token [...]
I also got rid of UseStrictDirectiveState which is not needed anymore.
Fixes #3903.
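To illustrate the rule quoted from the spec, here is a rough sketch of
a check that only looks at statements consisting of a single string
literal token. The Token/Statement model and both function names are
assumptions made for this example; the real parser works on its own
AST and token types.

```cpp
#include <string>
#include <vector>

// Hypothetical token model: just enough to tell a string literal apart.
struct Token {
    enum class Type { StringLiteral, Other };
    Type type;
    std::string raw; // source text, including quotes for string literals
};

// One statement, given as its tokens without the terminating semicolon.
using Statement = std::vector<Token>;

// A directive is a statement made up of exactly one string literal token.
static bool is_directive(Statement const& statement)
{
    return statement.size() == 1 && statement[0].type == Token::Type::StringLiteral;
}

// Scans the directive prologue (the leading run of directives) for an
// unescaped "use strict" or 'use strict'.
bool has_use_strict_directive(std::vector<Statement> const& statements)
{
    for (auto const& statement : statements) {
        if (!is_directive(statement))
            break; // the prologue ends at the first non-directive statement
        auto const& raw = statement[0].raw;
        if (raw == "\"use strict\"" || raw == "'use strict'")
            return true;
    }
    return false;
}
```

With this shape, "use strict" + 1 is three tokens and therefore never
treated as a directive, while a "use strict" that follows another
string-literal-only statement still counts.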
https://tc39.es/ecma262/#sec-functiondeclarations-in-ifstatement-statement-clauses
B.3.4 FunctionDeclarations in IfStatement Statement Clauses
The following augments the IfStatement production in 13.6:
IfStatement[Yield, Await, Return] :
if ( Expression[+In, ?Yield, ?Await] ) FunctionDeclaration[?Yield, ?Await, ~Default] else Statement[?Yield, ?Await, ?Return]
if ( Expression[+In, ?Yield, ?Await] ) Statement[?Yield, ?Await, ?Return] else FunctionDeclaration[?Yield, ?Await, ~Default]
if ( Expression[+In, ?Yield, ?Await] ) FunctionDeclaration[?Yield, ?Await, ~Default] else FunctionDeclaration[?Yield, ?Await, ~Default]
if ( Expression[+In, ?Yield, ?Await] ) FunctionDeclaration[?Yield, ?Await, ~Default]
This production only applies when parsing non-strict code. Code matching
this production is processed as if each matching occurrence of
FunctionDeclaration[?Yield, ?Await, ~Default] was the sole
StatementListItem of a BlockStatement occupying that position in the
source code. The semantics of such a synthetic BlockStatement includes
the web legacy compatibility semantics specified in B.3.3.
- A regular function can have duplicate parameters except in strict mode
or if its parameter list is not "simple" (has a default or rest
parameter)
- An arrow function can never have duplicate parameters
Compared to other engines I opted for more useful syntax error messages
than a generic "duplicate parameter name not allowed in this context":
"use strict"; function test(foo, foo) {}
                                 ^
Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in strict mode (line: 1, column: 34)
function test(foo, foo = 1) {}
                   ^
Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in function with default parameter (line: 1, column: 20)
function test(foo, ...foo) {}
                      ^
Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in function with rest parameter (line: 1, column: 23)
(foo, foo) => {}
      ^
Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in arrow function (line: 1, column: 7)
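The rules above could be sketched like this. It is a simplified
illustration, not the parser's code: the Parameter struct and
check_duplicate_parameters() are hypothetical, the whole parameter list
is inspected up front instead of while parsing, and the order in which
the reasons are tried (arrow function, strict mode, default, rest) is
my assumption.

```cpp
#include <optional>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical parameter model for illustration only.
struct Parameter {
    std::string name;
    bool has_default { false };
    bool is_rest { false };
};

// Returns the error message for the first disallowed duplicate, if any.
std::optional<std::string> check_duplicate_parameters(
    std::vector<Parameter> const& parameters, bool is_strict, bool is_arrow_function)
{
    bool has_default = false;
    bool has_rest = false;
    for (auto const& parameter : parameters) {
        has_default = has_default || parameter.has_default;
        has_rest = has_rest || parameter.is_rest;
    }

    // Pick the reason (if any) why duplicates are forbidden here.
    std::string context;
    if (is_arrow_function)
        context = "arrow function";
    else if (is_strict)
        context = "strict mode";
    else if (has_default)
        context = "function with default parameter";
    else if (has_rest)
        context = "function with rest parameter";
    if (context.empty())
        return std::nullopt; // non-strict function with a simple parameter list

    std::unordered_set<std::string> seen;
    for (auto const& parameter : parameters) {
        if (!seen.insert(parameter.name).second)
            return "Duplicate parameter '" + parameter.name + "' not allowed in " + context;
    }
    return std::nullopt;
}
```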
https://tc39.es/ecma262/#sec-directive-prologues-and-the-use-strict-directive
A Use Strict Directive is an ExpressionStatement in a Directive Prologue
whose StringLiteral is either of the exact code point sequences
"use strict" or 'use strict'. A Use Strict Directive may not contain an
EscapeSequence or LineContinuation.
https://tc39.es/ecma262/#sec-additional-syntax-string-literals
The syntax and semantics of 11.8.4 is extended as follows except that
this extension is not allowed for strict mode code:
Syntax
EscapeSequence::
CharacterEscapeSequence
LegacyOctalEscapeSequence
NonOctalDecimalEscapeSequence
HexEscapeSequence
UnicodeEscapeSequence
LegacyOctalEscapeSequence::
OctalDigit [lookahead ∉ OctalDigit]
ZeroToThree OctalDigit [lookahead ∉ OctalDigit]
FourToSeven OctalDigit
ZeroToThree OctalDigit OctalDigit
ZeroToThree :: one of
0 1 2 3
FourToSeven :: one of
4 5 6 7
NonOctalDecimalEscapeSequence :: one of
8 9
This definition of EscapeSequence is not used in strict mode or when
parsing TemplateCharacter.
Note
It is possible for string literals to precede a Use Strict Directive
that places the enclosing code in strict mode, and implementations must
take care to not use this extended definition of EscapeSequence with
such literals. For example, attempting to parse the following source
text must fail:
function invalid() { "\7"; "use strict"; }
We're passing a token to this function, so m_current_token is actually
the next token - which leads to incorrect line/column numbers for string
literal syntax errors:
"\u"
    ^
Uncaught exception: [SyntaxError]: Malformed unicode escape sequence (line: 1, column: 5)
Rather than:
"\u"
^
Uncaught exception: [SyntaxError]: Malformed unicode escape sequence (line: 1, column: 1)
This was a regression introduced by 9ffe45b - a TryStatement without
'catch' clause *is* allowed, if it has a 'finally' clause. The parser
now properly checks that at least one of the two is present.
This separates matching/parsing of statements and declarations and
fixes a few edge cases where the parser would incorrectly accept a
declaration where only a statement is allowed - for example:
if (foo) const a = 1;
for (var bar;;) function b() {}
while (baz) class c {}
From the spec: https://tc39.es/ecma262/#sec-literals-numeric-literals
The SourceCharacter immediately following a NumericLiteral must not be
an IdentifierStart or DecimalDigit.
For example: 3in is an error and not the two input elements 3 and in.
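As a rough illustration of that rule, the check after lexing a numeric
literal might look like the following. Both function names are
hypothetical, and is_identifier_start() is a deliberately simplified
ASCII-only approximation (the spec also allows Unicode ID_Start code
points and unicode escapes).

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Simplified, ASCII-only approximation of IdentifierStart.
static bool is_identifier_start(char ch)
{
    return std::isalpha(static_cast<unsigned char>(ch)) || ch == '_' || ch == '$';
}

// Given the offset just past a numeric literal, reports whether the
// character following it (if any) is allowed there.
bool numeric_literal_is_terminated(std::string const& source, std::size_t end_of_literal)
{
    if (end_of_literal >= source.size())
        return true; // end of input is always fine
    char next = source[end_of_literal];
    return !is_identifier_start(next) && !std::isdigit(static_cast<unsigned char>(next));
}
```

For the source text "3in" and an offset of 1 this returns false, for
"3 in" it returns true.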
This allows us to provide better error messages as we can point the
syntax error location to the exact first invalid parameter instead of
always the end of the function within an object literal or class
definition.
Before this change:
const Foo = { set bar() {} }
                           ^
Uncaught exception: [SyntaxError]: Object setter property must have one argument (line: 1, column: 28)
class Foo { set bar() {} }
                         ^
Uncaught exception: [SyntaxError]: Class setter method must have one argument (line: 1, column: 26)
After this change:
const Foo = { set bar() {} }
                      ^
Uncaught exception: [SyntaxError]: Setter function must have one argument (line: 1, column: 23)
class Foo { set bar() {} }
                    ^
Uncaught exception: [SyntaxError]: Setter function must have one argument (line: 1, column: 21)
The only possible downside of this change is that class getters/setters
and functions in objects are not distinguished in the message anymore -
I don't think that's important though, and classes are (mostly) just
syntactic sugar anyway.
I'm about to add even more options and a bunch of unnamed true/false
arguments is really not helpful. Let's make this a single parse options
parameter using bit flags.
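Roughly, the pattern looks like this. The struct name, the individual
flag names and has_option() are assumptions picked for illustration;
the message above doesn't list the actual options.

```cpp
#include <cstdint>

// Hypothetical option set; each option occupies its own bit.
struct FunctionParseOptions {
    enum : std::uint8_t {
        CheckForFunctionAndName = 1 << 0,
        AllowSuperPropertyLookup = 1 << 1,
        AllowSuperConstructorCall = 1 << 2,
    };
};

// Tests whether a given flag is set in the combined options value.
bool has_option(std::uint8_t parse_options, std::uint8_t flag)
{
    return (parse_options & flag) != 0;
}
```

A call site then combines named flags with '|', e.g.
FunctionParseOptions::CheckForFunctionAndName |
FunctionParseOptions::AllowSuperPropertyLookup, instead of passing a
row of unnamed true/false arguments.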
If there's a newline between the closing paren and arrow it's not a
valid arrow function. ASI should kick in instead (it'll then fail with
"Unexpected token Arrow").
This simplifies try_parse_arrow_function_expression() and fixes a few
cases that should not produce an arrow function AST but did:
(a,,) => {}
(a b) => {}
(a ...b) => {}
(...b a) => {}
The new parsing logic checks whether parens are expected and uses
parse_function_parameters() if so, rolling back if a new syntax error
occurs during that. Otherwise it's just an identifier in which case we
parse the single parameter ourselves.
'continue' is no longer allowed outside of a loop, and an unlabeled
'break' is no longer allowed outside of a loop or switch statement.
Labeled 'break' statements are still allowed everywhere, even if the
label does not exist.
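The resulting rules can be summarised in a small sketch. The context
struct, the enum and the function name are hypothetical; a real parser
would update such flags while entering and leaving loops and switch
statements.

```cpp
// Hypothetical description of where the parser currently is.
struct JumpContext {
    bool in_loop { false };
    bool in_switch { false };
};

enum class JumpKind {
    Break,
    Continue,
};

// Returns true if the statement is accepted in the given context.
bool jump_statement_is_allowed(JumpKind kind, bool has_label, JumpContext const& context)
{
    if (kind == JumpKind::Continue)
        return context.in_loop;
    if (has_label)
        return true; // labeled 'break' is accepted everywhere, see above
    return context.in_loop || context.in_switch;
}
```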
The check for invalid lhs and assignment to eval/arguments in strict
mode should happen for all kinds of assignment expressions, not just
AssignmentOp::Assignment.
Since blocks can't be strict by themselves, it makes no sense for them
to store whether or not they are strict. Strict-ness is now stored in
the Program and FunctionNode ASTNodes. Fixes issue #3641.
The parser considers it a syntax error at the moment. Other engines
throw a ReferenceError at runtime for ++foo(), --foo(), foo()++ and
foo()--, so I assume the spec defines this.
literal methods; add EnvironmentRecord fields and methods to
LexicalEnvironment
Adding EnvironmentRecord's fields and methods lets us throw an exception
when |this| is not initialized, which occurs when the super constructor
in a derived class has not yet been called, or when |this| has already
been initialized (the super constructor was already called).