Commit graph

68 commits

Author SHA1 Message Date
Shannon Booth
f87041bf3a LibGC+Everywhere: Factor out a LibGC from LibJS
Resulting in a massive rename across almost everywhere! Alongside the
namespace change, we now have the following names:

 * JS::NonnullGCPtr -> GC::Ref
 * JS::GCPtr -> GC::Ptr
 * JS::HeapFunction -> GC::Function
 * JS::CellImpl -> GC::Cell
 * JS::Handle -> GC::Root
2024-11-15 14:49:20 +01:00
Timothy Flynn
93712b24bf Everywhere: Hoist the Libraries folder to the top-level 2024-11-10 12:50:45 +01:00
Andreas Kling
13d7c09125 Libraries: Move to Userland/Libraries/ 2021-01-12 12:17:46 +01:00
AnotherTest
8ca0e8325a LibJS: Don't save rule start positions along with the parser state
This fixes #4617.
Also fixes the small problem where some save states would be leaked.
2020-12-29 17:39:42 +01:00
AnotherTest
d0363bca01 LibJS: `save_state()' before creating a RulePosition
Fixes #4617.
2020-12-29 10:51:33 +01:00
AnotherTest
b34b681811 LibJS: Track source positions all the way down to exceptions
This makes exceptions have a trace of source positions too, which could
probably be helpful in making fancier error tracebacks.
2020-12-29 00:58:43 +01:00
Linus Groh
abd49c174a LibJS: Include source location hint in Parser::print_errors() 2020-12-06 18:52:52 +01:00
Andreas Kling
d617120499 LibJS: Parse "with" statements :^) 2020-11-28 17:16:48 +01:00
Linus Groh
39a1c9d827 LibJS: Implement 'new.target'
This adds a new MetaProperty AST node which will be used for
'new.target' and 'import.meta' meta properties. The parser now
distinguishes between "in function context" and "in arrow function
context" (which is required for this).
When encountering TokenType::New we will attempt to parse it as meta
property and resort to regular new expression parsing if that fails,
much like the parsing of labelled statements.
2020-11-02 22:40:59 +01:00
Linus Groh
e07a39c816 LibJS: Replace 'size_t line, size_t column' with 'Optional<Position>'
This is a bit nicer for two reasons:

- The absence of line number/column information isn't based on 'values
  are zero' anymore but on Optional's value
- When reporting syntax errors with position information other than the
  current token's position we had to store line and column ourselves,
  like this:

      auto foo_start_line = m_parser_state.m_current_token.line_number();
      auto foo_start_column = m_parser_state.m_current_token.line_column();
      ...
      syntax_error("...", foo_start_line, foo_start_column);

  Which now becomes:

      auto foo_start= position();
      ...
      syntax_error("...", foo_start);

  This makes it easier to report correct positions for syntax errors
  that only emerge a few tokens later :^)
2020-11-02 22:40:59 +01:00
Linus Groh
9e80c67608 LibJS: Fix "use strict" directive false positives
By having the "is this a use strict directive?" logic in
parse_string_literal() we would apply it to *any* string literal, which
is incorrect and would lead to false positives - e.g.:

    "use strict" + 1
    `"use strict"`
    "\123"; ({"use strict": ...})

Relevant part from the spec which is now implemented properly:

[...] and where each ExpressionStatement in the sequence consists
entirely of a StringLiteral token [...]

I also got rid of UseStrictDirectiveState which is not needed anymore.

Fixes #3903.
2020-11-02 13:13:54 +01:00
Linus Groh
563d3c8055 LibJS: Require initializer for 'const' variable declaration 2020-10-30 23:43:38 +01:00
Linus Groh
dca9e4ec10 LibJS: Implement rules for duplicate function parameters
- A regular function can have duplicate parameters except in strict mode
  or if its parameter list is not "simple" (has a default or rest
  parameter)
- An arrow function can never have duplicate parameters

Compared to other engines I opted for more useful syntax error messages
than a generic "duplicate parameter name not allowed in this context":

    "use strict"; function test(foo, foo) {}
                                     ^
    Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in strict mode (line: 1, column: 34)

    function test(foo, foo = 1) {}
                       ^
    Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in function with default parameter (line: 1, column: 20)

    function test(foo, ...foo) {}
                          ^
    Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in function with rest parameter (line: 1, column: 23)

    (foo, foo) => {}
          ^
    Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in arrow function (line: 1, column: 7)
2020-10-25 12:56:02 +01:00
Linus Groh
4fb96afafc LibJS: Support LegacyOctalEscapeSequence in string literals
https://tc39.es/ecma262/#sec-additional-syntax-string-literals

The syntax and semantics of 11.8.4 is extended as follows except that
this extension is not allowed for strict mode code:

Syntax

    EscapeSequence::
        CharacterEscapeSequence
        LegacyOctalEscapeSequence
        NonOctalDecimalEscapeSequence
        HexEscapeSequence
        UnicodeEscapeSequence

    LegacyOctalEscapeSequence::
        OctalDigit [lookahead ∉ OctalDigit]
        ZeroToThree OctalDigit [lookahead ∉ OctalDigit]
        FourToSeven OctalDigit
        ZeroToThree OctalDigit OctalDigit

    ZeroToThree :: one of
        0 1 2 3

    FourToSeven :: one of
        4 5 6 7

    NonOctalDecimalEscapeSequence :: one of
        8 9

This definition of EscapeSequence is not used in strict mode or when
parsing TemplateCharacter.

Note

It is possible for string literals to precede a Use Strict Directive
that places the enclosing code in strict mode, and implementations must
take care to not use this extended definition of EscapeSequence with
such literals. For example, attempting to parse the following source
text must fail:

function invalid() { "\7"; "use strict"; }
2020-10-24 16:34:01 +02:00
Linus Groh
80bb62b9cc LibJS: Distinguish between statement and declaration
This separates matching/parsing of statements and declarations and
fixes a few edge cases where the parser would incorrectly accept a
declaration where only a statement is allowed - for example:

    if (foo) const a = 1;
    for (var bar;;) function b() {}
    while (baz) class c {}
2020-10-23 19:13:06 +02:00
Linus Groh
15642874f3 LibJS: Support all line terminators (LF, CR, LS, PS)
https://tc39.es/ecma262/#sec-line-terminators
2020-10-22 10:06:30 +02:00
Linus Groh
6331d45a6f LibJS: Move checks for invalid getter/setter params to parse_function_node
This allows us to provide better error messages as we can point the
syntax error location to the exact first invalid parameter instead of
always the end of the function within a object literal or class
definition.

Before this change:

    const Foo = { set bar() {} }
                               ^
    Uncaught exception: [SyntaxError]: Object setter property must have one argument (line: 1, column: 28)

    class Foo { set bar() {} }
                             ^
    Uncaught exception: [SyntaxError]: Class setter method must have one argument (line: 1, column: 26)

After this change:

    const Foo = { set bar() {} }
                          ^
    Uncaught exception: [SyntaxError]: Setter function must have one argument (line: 1, column: 23)

    class Foo { set bar() {} }
                        ^
    Uncaught exception: [SyntaxError]: Setter function must have one argument (line: 1, column: 21)

The only possible downside of this change is that class getters/setters
and functions in objects are not distinguished in the message anymore -
I don't think that's important though, and classes are (mostly) just
syntactic sugar anyway.
2020-10-20 20:27:58 +02:00
Linus Groh
db75be1119 LibJS: Refactor parse_function_node() bool parameters into bit flags
I'm about to add even more options and a bunch of unnamed true/false
arguments is really not helpful. Let's make this a single parse options
parameter using bit flags.
2020-10-20 20:27:58 +02:00
Linus Groh
46cc1f718e LibJS: Unprefixed octal numbers are a syntax error in strict mode 2020-10-19 20:08:22 +02:00
Linus Groh
965d952ff3 LibJS: Share parameter parsing between regular and arrow functions
This simplifies try_parse_arrow_function_expression() and fixes a few
cases that should not produce an arrow function AST but did:

    (a,,) => {}
    (a b) => {}
    (a ...b) => {}
    (...b a) => {}

The new parsing logic checks whether parens are expected and uses
parse_function_parameters() if so, rolling back if a new syntax error
occurs during that. Otherwise it's just an identifier in which case we
parse the single parameter ourselves.
2020-10-19 11:31:55 +02:00
Matthew Olsson
e8da5f99b1 LibJS: break or continue with nonexistent label is a syntax error 2020-10-08 23:27:16 +02:00
Matthew Olsson
e49ea1b520 LibJS: Disallow 'continue' & 'break' outside of their respective scopes
'continue' is no longer allowed outside of a loop, and an unlabeled
'break' is not longer allowed outside of a loop or switch statement.
Labeled 'break' statements are still allowed everywhere, even if the
label does not exist.
2020-10-08 10:20:49 +02:00
Matthew Olsson
9a82c22a85 LibJS: Disallow 'return' outside of a function 2020-10-08 10:03:21 +02:00
Linus Groh
283ee678f7 LibJS: Validate all assignment expressions, not just "="
The check for invalid lhs and assignment to eval/arguments in strict
mode should happen for all kinds of assignment expressions, not just
AssignmentOp::Assignment.
2020-10-05 09:25:04 +02:00
Linus Groh
bc701658f8 LibJS: Use String::formatted() for parser error messages 2020-10-04 19:22:02 +02:00
Matthew Olsson
6eb6752c4c LibJS: Strict mode is now handled by Functions and Programs, not Blocks
Since blocks can't be strict by themselves, it makes no sense for them
to store whether or not they are strict. Strict-ness is now stored in
the Program and FunctionNode ASTNodes. Fixes issue #3641
2020-10-04 10:46:12 +02:00
Muhammad Zahalqa
5a2ec86048 LibJS: Parser refactored to use constexpr precedence table
Replaced implementation dependent on HashMap with a constexpr PrecedenceTable
based on array lookup.
2020-08-21 16:14:14 +02:00
Jack Karamanian
7533fd8b02 LibJS: Initial class implementation; allow super expressions in object
literal methods; add EnvrionmentRecord fields and methods to
LexicalEnvironment

Adding EnvrionmentRecord's fields and methods lets us throw an exception
when |this| is not initialized, which occurs when the super constructor
in a derived class has not yet been called, or when |this| has already
been initialized (the super constructor was already called).
2020-06-29 17:54:54 +02:00
Matthew Olsson
61ac1d3ffa LibJS: Lex and parse regex literals, add RegExp objects
This adds regex parsing/lexing, as well as a relatively empty
RegExpObject. The purpose of this patch is to allow the engine to not
get hung up on parsing regexes. This will aid in finding new syntax
errors (say, from google or twitter) without having to replace all of
their regexes first!
2020-06-07 19:06:55 +02:00
Marcin Gasperowicz
2579d0bf55 LibJS: Hoist function declarations
This patch adds function declaration hoisting. The mechanism
is similar to var hoisting. Hoisted function declarations are to be put
before the hoisted var declarations, hence they have to be treated
separately.
2020-06-06 10:53:06 +02:00
Matthew Olsson
ab576e610c LibJS: Rewrite Parser.parse_object_expression()
This rewrite drastically increases the accuracy of object literals.
Additionally, an "assertIsSyntaxError" function has been added to
test-common.js to assist in testing syntax errors.
2020-06-01 13:11:21 +02:00
Matthew Olsson
10bf4ba3dc LibJS: Parse labelled statements
All statements now have an optional label string that can be null.
2020-05-29 16:20:32 +02:00
Matthew Olsson
786722149b LibJS: Add strict mode
Adds the ability for a scope (either a function or the entire program)
to be in strict mode. Scopes default to non-strict mode.

There are two ways to determine the strict-ness of the JS engine:

1. In the parser, this can be accessed with the parser_state variable
   m_is_strict_mode boolean. If true, the Parser is currently parsing in
   strict mode. This is done so that the Parser can generate syntax
   errors at parse time, which is required in some cases.

2. With Interpreter.is_strict_mode(). This allows strict mode checking
   at runtime as opposed to compile time.

Additionally, in order to test this, a global isStrictMode() function
has been added to the JS ReplObject under the test-mode flag.
2020-05-28 17:18:42 +02:00
Matthew Olsson
cc54974431 LibJS: Fix out-of-range error in Parser::Error::source_location_hint 2020-05-28 17:02:16 +02:00
Linus Groh
2d47b30256 LibJS: Add Error::source_location_hint()
This util function on the Error struct will take the source and then
returns a string like this based on line and column information it has:

foo bar
    ^

Which can be shown in the repl for syntax errors :^)
2020-05-26 14:36:30 +02:00
Linus Groh
07af2e6b2c LibJS: Implement basic for..in and for..of loops 2020-05-25 18:45:36 +02:00
Matthew Olsson
e415dd4e9c LibJS: Handle hex and unicode escape sequences in string literals
Introduces the following syntax:

'\x55'
'\u26a0'
'\u{1f41e}'
2020-05-18 17:58:17 +02:00
Linus Groh
33defef267 LibJS: Let parser keep track of errors
Rather than printing them to stderr directly the parser now keeps a
Vector<Error>, which allows the "owner" of the parser to consume them
individually after parsing.

The Error struct has a message, line number, column number and a
to_string() helper function to format this information into a meaningful
error message.

The Function() constructor will now include an error message when
throwing a SyntaxError.
2020-05-15 09:53:52 +02:00
Linus Groh
00b61a212f LibJS: Remove syntax errors from lexer
Giving the lexer the ability to generate errors adds unnecessary
complexity - also it only calls its syntax_error() function in one place
anyway ("unterminated string literal"). But since the lexer *also* emits
tokens like Eof or UnterminatedStringLiteral, it should be up to the
consumer of these tokens to decide what to do.

Also remove the option to not print errors to stderr as that's not
relevant anymore.
2020-05-15 09:53:52 +02:00
Matthew Olsson
b5f1df57ed LibJS: Add raw strings to tagged template literals
When calling a function with a tagged template, the first array that is
passed in now contains a "raw" property with the raw, escaped strings.
2020-05-07 23:05:55 +02:00
mattco98
adb4accab3 LibJS: Add template literals
Adds fully functioning template literals. Because template literals
contain expressions, most of the work has to be done in the Lexer rather
than the Parser. And because of the complexity of template literals
(expressions, nesting, escapes, etc), the Lexer needs to have some
template-related state.

When entering a new template literal, a TemplateLiteralStart token is
emitted. When inside a literal, all text will be parsed up until a '${'
or '`' (or EOF, but that's a syntax error) is seen, and then a
TemplateLiteralExprStart token is emitted. At this point, the Lexer
proceeds as normal, however it keeps track of the number of opening
and closing curly braces it has seen in order to determine the close
of the expression. Once it finds a matching curly brace for the '${',
a TemplateLiteralExprEnd token is emitted and the state is updated
accordingly.

When the Lexer is inside of a template literal, but not an expression,
and sees a '`', this must be the closing grave: a TemplateLiteralEnd
token is emitted.

The state required to correctly parse template strings consists of a
vector (for nesting) of two pieces of information: whether or not we
are in a template expression (as opposed to a template string); and
the count of the number of unmatched open curly braces we have seen
(only applicable if the Lexer is currently in a template expression).

TODO: Add support for template literal newlines in the JS REPL (this will
cause a syntax error currently):

    > `foo
    > bar`
    'foo
    bar'
2020-05-04 16:46:31 +02:00
Matthew Olsson
5e66f1900b LibJS: Add function default arguments
Adds the ability for function arguments to have default values. This
works for standard functions as well as arrow functions. Default values
are not printed in a <function>.toString() call, as nodes cannot print
their source string representation.
2020-05-03 00:44:57 +02:00
Linus Groh
43c1fa9965 LibJS: Implement (no-op) debugger statement 2020-05-01 22:07:13 +02:00
Matthew Olsson
28ef654d13 LibJS: Add object literal method shorthand 2020-05-01 12:28:40 +02:00
Linus Groh
624eaa32af LibJS: Add Parser::syntax_error() helper
Instead of having fprintf()s all over the place we can now use
syntax_error("message") or syntax_error("message", line, column).

This takes care of a consistent format, appending a newline and getting
the line number and column of the current token if the last two params
are omitted.
2020-04-30 08:41:31 +02:00
Linus Groh
038051d205 LibJS: Parse while statements 2020-04-22 11:48:14 +02:00
Stephan Unverwerth
bf5b251684 LibJS: Allow reserved words as keys in object expressions. 2020-04-18 22:23:20 +02:00
Stephan Unverwerth
07f838dc4e LibJS: Implement automatic semicolon insertion 2020-04-17 15:22:31 +02:00
Andreas Kling
ac7459cb40 LibJS: Hoist variable declarations to the nearest relevant scope
"var" declarations are hoisted to the nearest function scope, while
"let" and "const" are hoisted to the nearest block scope.

This is done by the parser, which keeps two scope stacks, one stack
for the current var scope and one for the current let/const scope.

When the interpreter enters a scope, we walk all of the declarations
and insert them into the variable environment.

We don't support the temporal dead zone for let/const yet.
2020-04-13 17:22:23 +02:00
Stephan Unverwerth
984c290ec0 LibJS: Do not execute scripts with parse errors
This adds missing checks in several LibJS consumers.
2020-04-13 10:42:25 +02:00