Commit graph

171 commits

Author SHA1 Message Date
Luke Wilde
bd4c29322c LibJS: Allow division after IdentifierNames in optional chain
The following syntax is valid:
```js
e?.example / 1.2
```

Previously, the `/` would be treated as a unterminated regex literal,
because it was calling the regular `consume` instead of
`consume_and_allow_division`.

This is what is done when parsing IdentifierNames in
parse_secondary_expression when a period is encountered.

Allows us to parse clients-main-[hash].js on https://ubereats.com/
2024-11-11 20:19:26 +01:00
Timothy Flynn
93712b24bf Everywhere: Hoist the Libraries folder to the top-level 2024-11-10 12:50:45 +01:00
Andreas Kling
13d7c09125 Libraries: Move to Userland/Libraries/ 2021-01-12 12:17:46 +01:00
Andreas Kling
db790dda62 LibJS: Remove hand-rolled type information in JS AST in favor of RTTI 2021-01-01 19:34:07 +01:00
AnotherTest
8ca0e8325a LibJS: Don't save rule start positions along with the parser state
This fixes #4617.
Also fixes the small problem where some save states would be leaked.
2020-12-29 17:39:42 +01:00
AnotherTest
d0363bca01 LibJS: `save_state()' before creating a RulePosition
Fixes #4617.
2020-12-29 10:51:33 +01:00
AnotherTest
b34b681811 LibJS: Track source positions all the way down to exceptions
This makes exceptions have a trace of source positions too, which could
probably be helpful in making fancier error tracebacks.
2020-12-29 00:58:43 +01:00
Stephan Unverwerth
be9c2feff0 LibJS: Fix parsing of numeric object keys
Numeric keys were interpreted as their source text, leading to
something like {0x10:true} to end up as {"0x10":true}
instead of {16:true}
2020-12-27 23:04:09 +01:00
Linus Groh
5eb1f752ab LibJS: Use new format functions everywhere
This changes the remaining uses of the following functions across LibJS:

- String::format() => String::formatted()
- dbg() => dbgln()
- printf() => out(), outln()
- fprintf() => warnln()

I also removed the relevant 'LogStream& operator<<' overloads as they're
not needed anymore.
2020-12-06 18:52:52 +01:00
Linus Groh
3ac7fb9f6c LibJS: Disallow 'with' statement in strict mode 2020-11-28 20:33:41 +01:00
Andreas Kling
d617120499 LibJS: Parse "with" statements :^) 2020-11-28 17:16:48 +01:00
Linus Groh
39a1c9d827 LibJS: Implement 'new.target'
This adds a new MetaProperty AST node which will be used for
'new.target' and 'import.meta' meta properties. The parser now
distinguishes between "in function context" and "in arrow function
context" (which is required for this).
When encountering TokenType::New we will attempt to parse it as meta
property and resort to regular new expression parsing if that fails,
much like the parsing of labelled statements.
2020-11-02 22:40:59 +01:00
Linus Groh
e07a39c816 LibJS: Replace 'size_t line, size_t column' with 'Optional<Position>'
This is a bit nicer for two reasons:

- The absence of line number/column information isn't based on 'values
  are zero' anymore but on Optional's value
- When reporting syntax errors with position information other than the
  current token's position we had to store line and column ourselves,
  like this:

      auto foo_start_line = m_parser_state.m_current_token.line_number();
      auto foo_start_column = m_parser_state.m_current_token.line_column();
      ...
      syntax_error("...", foo_start_line, foo_start_column);

  Which now becomes:

      auto foo_start= position();
      ...
      syntax_error("...", foo_start);

  This makes it easier to report correct positions for syntax errors
  that only emerge a few tokens later :^)
2020-11-02 22:40:59 +01:00
Linus Groh
9e80c67608 LibJS: Fix "use strict" directive false positives
By having the "is this a use strict directive?" logic in
parse_string_literal() we would apply it to *any* string literal, which
is incorrect and would lead to false positives - e.g.:

    "use strict" + 1
    `"use strict"`
    "\123"; ({"use strict": ...})

Relevant part from the spec which is now implemented properly:

[...] and where each ExpressionStatement in the sequence consists
entirely of a StringLiteral token [...]

I also got rid of UseStrictDirectiveState which is not needed anymore.

Fixes #3903.
2020-11-02 13:13:54 +01:00
Linus Groh
a598a2c19d LibJS: Function declarations in if statement clauses
https://tc39.es/ecma262/#sec-functiondeclarations-in-ifstatement-statement-clauses

B.3.4 FunctionDeclarations in IfStatement Statement Clauses

The following augments the IfStatement production in 13.6:

    IfStatement[Yield, Await, Return] :
        if ( Expression[+In, ?Yield, ?Await] ) FunctionDeclaration[?Yield, ?Await, ~Default] else Statement[?Yield, ?Await, ?Return]
        if ( Expression[+In, ?Yield, ?Await] ) Statement[?Yield, ?Await, ?Return] else FunctionDeclaration[?Yield, ?Await, ~Default]
        if ( Expression[+In, ?Yield, ?Await] ) FunctionDeclaration[?Yield, ?Await, ~Default] else FunctionDeclaration[?Yield, ?Await, ~Default]
        if ( Expression[+In, ?Yield, ?Await] ) FunctionDeclaration[?Yield, ?Await, ~Default]

This production only applies when parsing non-strict code. Code matching
this production is processed as if each matching occurrence of
FunctionDeclaration[?Yield, ?Await, ~Default] was the sole
StatementListItem of a BlockStatement occupying that position in the
source code. The semantics of such a synthetic BlockStatement includes
the web legacy compatibility semantics specified in B.3.3.
2020-10-31 15:25:12 +01:00
Linus Groh
563d3c8055 LibJS: Require initializer for 'const' variable declaration 2020-10-30 23:43:38 +01:00
Linus Groh
b4e51249e9 LibJS: Always insert semicolon after do-while statement if missing
https://tc39.es/ecma262/#sec-additions-and-changes-that-introduce-incompatibilities-with-prior-editions

11.9.1: In ECMAScript 2015, Automatic Semicolon Insertion adds a
semicolon at the end of a do-while statement if the semicolon is
missing. This change aligns the specification with the actual behaviour
of most existing implementations.
2020-10-28 21:11:32 +01:00
Linus Groh
7112031bfb LibJS: Use message from invalid token in syntax error 2020-10-26 21:38:34 +01:00
Linus Groh
dca9e4ec10 LibJS: Implement rules for duplicate function parameters
- A regular function can have duplicate parameters except in strict mode
  or if its parameter list is not "simple" (has a default or rest
  parameter)
- An arrow function can never have duplicate parameters

Compared to other engines I opted for more useful syntax error messages
than a generic "duplicate parameter name not allowed in this context":

    "use strict"; function test(foo, foo) {}
                                     ^
    Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in strict mode (line: 1, column: 34)

    function test(foo, foo = 1) {}
                       ^
    Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in function with default parameter (line: 1, column: 20)

    function test(foo, ...foo) {}
                          ^
    Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in function with rest parameter (line: 1, column: 23)

    (foo, foo) => {}
          ^
    Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in arrow function (line: 1, column: 7)
2020-10-25 12:56:02 +01:00
Linus Groh
2adcabb6b3 LibJS: Disallow escape sequence/line continuation in use strict directive
https://tc39.es/ecma262/#sec-directive-prologues-and-the-use-strict-directive

A Use Strict Directive is an ExpressionStatement in a Directive Prologue
whose StringLiteral is either of the exact code point sequences
"use strict" or 'use strict'. A Use Strict Directive may not contain an
EscapeSequence or LineContinuation.
2020-10-24 16:34:01 +02:00
Linus Groh
4fb96afafc LibJS: Support LegacyOctalEscapeSequence in string literals
https://tc39.es/ecma262/#sec-additional-syntax-string-literals

The syntax and semantics of 11.8.4 is extended as follows except that
this extension is not allowed for strict mode code:

Syntax

    EscapeSequence::
        CharacterEscapeSequence
        LegacyOctalEscapeSequence
        NonOctalDecimalEscapeSequence
        HexEscapeSequence
        UnicodeEscapeSequence

    LegacyOctalEscapeSequence::
        OctalDigit [lookahead ∉ OctalDigit]
        ZeroToThree OctalDigit [lookahead ∉ OctalDigit]
        FourToSeven OctalDigit
        ZeroToThree OctalDigit OctalDigit

    ZeroToThree :: one of
        0 1 2 3

    FourToSeven :: one of
        4 5 6 7

    NonOctalDecimalEscapeSequence :: one of
        8 9

This definition of EscapeSequence is not used in strict mode or when
parsing TemplateCharacter.

Note

It is possible for string literals to precede a Use Strict Directive
that places the enclosing code in strict mode, and implementations must
take care to not use this extended definition of EscapeSequence with
such literals. For example, attempting to parse the following source
text must fail:

function invalid() { "\7"; "use strict"; }
2020-10-24 16:34:01 +02:00
Linus Groh
9f036959e8 LibJS: Report correct line/column for string literal syntax errors
We're passing a token to this function, so m_current_token is actually
the next token - which leads to incorrect line/column numbers for string
literal syntax errors:

    "\u"
        ^
    Uncaught exception: [SyntaxError]: Malformed unicode escape sequence (line: 1, column: 5)

Rather than:

    "\u"
    ^
    Uncaught exception: [SyntaxError]: Malformed unicode escape sequence (line: 1, column: 1)
2020-10-24 16:34:01 +02:00
Linus Groh
d6f8c52245 LibJS: Allow try statement with only finally clause
This was a regression introduced by 9ffe45b - a TryStatement without
'catch' clause *is* allowed, if it has a 'finally' clause. It is now
checked properly that at least one of both is present.
2020-10-24 16:34:01 +02:00
Linus Groh
80bb62b9cc LibJS: Distinguish between statement and declaration
This separates matching/parsing of statements and declarations and
fixes a few edge cases where the parser would incorrectly accept a
declaration where only a statement is allowed - for example:

    if (foo) const a = 1;
    for (var bar;;) function b() {}
    while (baz) class c {}
2020-10-23 19:13:06 +02:00
Linus Groh
f8ae6fa713 LibJS: Disallow NumericLiteral immediately followed by Identifier
From the spec: https://tc39.es/ecma262/#sec-literals-numeric-literals

The SourceCharacter immediately following a NumericLiteral must not be
an IdentifierStart or DecimalDigit.

For example: 3in is an error and not the two input elements 3 and in.
2020-10-23 19:13:06 +02:00
Linus Groh
80bb22788f LibJS: Don't allow TryStatement without catch clause 2020-10-23 19:13:06 +02:00
Linus Groh
15642874f3 LibJS: Support all line terminators (LF, CR, LS, PS)
https://tc39.es/ecma262/#sec-line-terminators
2020-10-22 10:06:30 +02:00
Linus Groh
1e86379327 LibJS: Rest parameter in setter functions is a syntax error 2020-10-20 20:27:58 +02:00
Linus Groh
6331d45a6f LibJS: Move checks for invalid getter/setter params to parse_function_node
This allows us to provide better error messages as we can point the
syntax error location to the exact first invalid parameter instead of
always the end of the function within a object literal or class
definition.

Before this change:

    const Foo = { set bar() {} }
                               ^
    Uncaught exception: [SyntaxError]: Object setter property must have one argument (line: 1, column: 28)

    class Foo { set bar() {} }
                             ^
    Uncaught exception: [SyntaxError]: Class setter method must have one argument (line: 1, column: 26)

After this change:

    const Foo = { set bar() {} }
                          ^
    Uncaught exception: [SyntaxError]: Setter function must have one argument (line: 1, column: 23)

    class Foo { set bar() {} }
                        ^
    Uncaught exception: [SyntaxError]: Setter function must have one argument (line: 1, column: 21)

The only possible downside of this change is that class getters/setters
and functions in objects are not distinguished in the message anymore -
I don't think that's important though, and classes are (mostly) just
syntactic sugar anyway.
2020-10-20 20:27:58 +02:00
Linus Groh
db75be1119 LibJS: Refactor parse_function_node() bool parameters into bit flags
I'm about to add even more options and a bunch of unnamed true/false
arguments is really not helpful. Let's make this a single parse options
parameter using bit flags.
2020-10-20 20:27:58 +02:00
Linus Groh
46cc1f718e LibJS: Unprefixed octal numbers are a syntax error in strict mode 2020-10-19 20:08:22 +02:00
Linus Groh
e898c98873 LibJS: Don't parse arrow function with newline between ) and =>
If there's a newline between the closing paren and arrow it's not a
valid arrow function, ASI should kick in instead (it'll then fail with
"Unexpected token Arrow")
2020-10-19 11:31:55 +02:00
Linus Groh
965d952ff3 LibJS: Share parameter parsing between regular and arrow functions
This simplifies try_parse_arrow_function_expression() and fixes a few
cases that should not produce an arrow function AST but did:

    (a,,) => {}
    (a b) => {}
    (a ...b) => {}
    (...b a) => {}

The new parsing logic checks whether parens are expected and uses
parse_function_parameters() if so, rolling back if a new syntax error
occurs during that. Otherwise it's just an identifier in which case we
parse the single parameter ourselves.
2020-10-19 11:31:55 +02:00
Linus Groh
2dbea60fe2 LibJS: Multiple 'default' clauses in switch statement are a syntax error 2020-10-19 11:30:14 +02:00
Andreas Kling
1d96ecf148 Everywhere: Add missing <AK/TemporaryChange.h> includes
Don't rely on HashTable.h pulling this in.
2020-10-15 23:49:53 +02:00
Matthew Olsson
e8da5f99b1 LibJS: break or continue with nonexistent label is a syntax error 2020-10-08 23:27:16 +02:00
Matthew Olsson
e49ea1b520 LibJS: Disallow 'continue' & 'break' outside of their respective scopes
'continue' is no longer allowed outside of a loop, and an unlabeled
'break' is not longer allowed outside of a loop or switch statement.
Labeled 'break' statements are still allowed everywhere, even if the
label does not exist.
2020-10-08 10:20:49 +02:00
Matthew Olsson
9a82c22a85 LibJS: Disallow 'return' outside of a function 2020-10-08 10:03:21 +02:00
Linus Groh
aa71dae03c LibJS: Implement logical assignment operators (&&=, ||=, ??=)
TC39 proposal, stage 4 as of 2020-07.
https://tc39.es/proposal-logical-assignment/
2020-10-05 17:57:26 +02:00
Linus Groh
f4d0babd5d LibJS: Make assignment to CallExpression a syntax error in strict mode 2020-10-05 09:25:04 +02:00
Linus Groh
283ee678f7 LibJS: Validate all assignment expressions, not just "="
The check for invalid lhs and assignment to eval/arguments in strict
mode should happen for all kinds of assignment expressions, not just
AssignmentOp::Assignment.
2020-10-05 09:25:04 +02:00
Linus Groh
bc701658f8 LibJS: Use String::formatted() for parser error messages 2020-10-04 19:22:02 +02:00
Matthew Olsson
6eb6752c4c LibJS: Strict mode is now handled by Functions and Programs, not Blocks
Since blocks can't be strict by themselves, it makes no sense for them
to store whether or not they are strict. Strict-ness is now stored in
the Program and FunctionNode ASTNodes. Fixes issue #3641
2020-10-04 10:46:12 +02:00
Nico Weber
ef1b21004f Everywhere: Fix typos
Mostly in comments, but sprintf() now prints "August" instead of
"Auguest" so that's something.
2020-10-02 16:03:17 +02:00
Linus Groh
5fd87ccd16 LibJS: Add FIXMEs for parsing increment operators with function LHS/RHS
The parser considers it a syntax error at the moment, other engines
throw a ReferenceError during runtime for ++foo(), --foo(), foo()++ and
foo()--, so I assume the spec defines this.
2020-09-18 20:49:35 +02:00
Ben Wiederhake
db422fa499 LibJS: Avoid unnecessary lambda
Especially when it's evaluated immediately and unconditionally.
2020-08-30 10:31:04 +02:00
Muhammad Zahalqa
5a2ec86048 LibJS: Parser refactored to use constexpr precedence table
Replaced implementation dependent on HashMap with a constexpr PrecedenceTable
based on array lookup.
2020-08-21 16:14:14 +02:00
Nico Weber
ce95628b7f Unicode: Try s/codepoint/code_point/g again
This time, without trailing 's'. Ran:

    git grep -l 'codepoint' | xargs sed -ie 's/codepoint/code_point/g
2020-08-05 22:33:42 +02:00
Nico Weber
19ac1f6368 Revert "Unicode: s/codepoint/code_point/g"
This reverts commit ea9ac3155d.
It replaced "codepoint" with "code_points", not "code_point".
2020-08-05 22:33:42 +02:00
Andreas Kling
ea9ac3155d Unicode: s/codepoint/code_point/g
Unicode calls them "code points" so let's follow their style.
2020-08-03 19:06:41 +02:00