Commit graph

35 commits

Author SHA1 Message Date
Timothy Flynn
93712b24bf Everywhere: Hoist the Libraries folder to the top-level 2024-11-10 12:50:45 +01:00
Andreas Kling
13d7c09125 Libraries: Move to Userland/Libraries/ 2021-01-12 12:17:46 +01:00
Linus Groh
03c1d43f6e LibJS: Add message string to Token
This allows us to communicate details about invalid tokens to the parser
without having to invent a bunch of specific invalid tokens like
TokenType::InvalidNumericLiteral.
2020-10-26 21:38:34 +01:00
Linus Groh
4fb96afafc LibJS: Support LegacyOctalEscapeSequence in string literals
https://tc39.es/ecma262/#sec-additional-syntax-string-literals

The syntax and semantics of 11.8.4 is extended as follows except that
this extension is not allowed for strict mode code:

Syntax

    EscapeSequence::
        CharacterEscapeSequence
        LegacyOctalEscapeSequence
        NonOctalDecimalEscapeSequence
        HexEscapeSequence
        UnicodeEscapeSequence

    LegacyOctalEscapeSequence::
        OctalDigit [lookahead ∉ OctalDigit]
        ZeroToThree OctalDigit [lookahead ∉ OctalDigit]
        FourToSeven OctalDigit
        ZeroToThree OctalDigit OctalDigit

    ZeroToThree :: one of
        0 1 2 3

    FourToSeven :: one of
        4 5 6 7

    NonOctalDecimalEscapeSequence :: one of
        8 9

This definition of EscapeSequence is not used in strict mode or when
parsing TemplateCharacter.

Note

It is possible for string literals to precede a Use Strict Directive
that places the enclosing code in strict mode, and implementations must
take care to not use this extended definition of EscapeSequence with
such literals. For example, attempting to parse the following source
text must fail:

function invalid() { "\7"; "use strict"; }
2020-10-24 16:34:01 +02:00
Linus Groh
15642874f3 LibJS: Support all line terminators (LF, CR, LS, PS)
https://tc39.es/ecma262/#sec-line-terminators
2020-10-22 10:06:30 +02:00
Linus Groh
aa71dae03c LibJS: Implement logical assignment operators (&&=, ||=, ??=)
TC39 proposal, stage 4 as of 2020-07.
https://tc39.es/proposal-logical-assignment/
2020-10-05 17:57:26 +02:00
Linus Groh
e80217a746 LibJS: Unify syntax highlighting
So far we have three different syntax highlighters for LibJS:

- js's Line::Editor stylization
- JS::MarkupGenerator
- GUI::JSSyntaxHighlighter

This not only caused repetition of most token types in each highlighter
but also a lot of inconsistency regarding the styling of certain tokens:

- JSSyntaxHighlighter was considering TokenType::Period to be an
  operator whereas MarkupGenerator categorized it as punctuation.
- MarkupGenerator was considering TokenType::{Break,Case,Continue,
  Default,Switch,With} control keywords whereas JSSyntaxHighlighter just
  disregarded them
- MarkupGenerator considered some future reserved keywords invalid and
  others not. JSSyntaxHighlighter and js disregarded most

Adding a new token type meant adding it to ENUMERATE_JS_TOKENS as well
as each individual highlighter's switch/case construct.

I added a TokenCategory enum, and each TokenType is now associated to a
certain category, which the syntax highlighters then can use for styling
rather than operating on the token type directly. This also makes
changing a token's category everywhere easier, should we need to do that
(e.g. I decided to make TokenType::{Period,QuestionMarkPeriod}
TokenCategory::Operator for now, but we might want to change them to
Punctuation.
2020-10-04 23:41:31 +02:00
Muhammad Zahalqa
5a2ec86048 LibJS: Parser refactored to use constexpr precedence table
Replaced implementation dependent on HashMap with a constexpr PrecedenceTable
based on array lookup.
2020-08-21 16:14:14 +02:00
Linus Groh
0ff9d7e189 LibJS: Add BigInt 2020-06-07 19:29:40 +02:00
Matthew Olsson
61ac1d3ffa LibJS: Lex and parse regex literals, add RegExp objects
This adds regex parsing/lexing, as well as a relatively empty
RegExpObject. The purpose of this patch is to allow the engine to not
get hung up on parsing regexes. This will aid in finding new syntax
errors (say, from google or twitter) without having to replace all of
their regexes first!
2020-06-07 19:06:55 +02:00
Matthew Olsson
e415dd4e9c LibJS: Handle hex and unicode escape sequences in string literals
Introduces the following syntax:

'\x55'
'\u26a0'
'\u{1f41e}'
2020-05-18 17:58:17 +02:00
Linus Groh
1383cd23bc LibJS: Add missing keywords/tokens
Some of these are required for syntax we have not implemented yet, some
are future reserved words in strict mode.
2020-05-12 18:47:38 +02:00
Linus Groh
a2e1f1a872 LibJS: Implement exponentiation assignment operator (**=) 2020-05-05 11:12:27 +02:00
Linus Groh
3e754a15d4 LibJS: Implement bitwise assignment operators (&=, |=, ^=) 2020-05-05 11:12:27 +02:00
mattco98
adb4accab3 LibJS: Add template literals
Adds fully functioning template literals. Because template literals
contain expressions, most of the work has to be done in the Lexer rather
than the Parser. And because of the complexity of template literals
(expressions, nesting, escapes, etc), the Lexer needs to have some
template-related state.

When entering a new template literal, a TemplateLiteralStart token is
emitted. When inside a literal, all text will be parsed up until a '${'
or '`' (or EOF, but that's a syntax error) is seen, and then a
TemplateLiteralExprStart token is emitted. At this point, the Lexer
proceeds as normal, however it keeps track of the number of opening
and closing curly braces it has seen in order to determine the close
of the expression. Once it finds a matching curly brace for the '${',
a TemplateLiteralExprEnd token is emitted and the state is updated
accordingly.

When the Lexer is inside of a template literal, but not an expression,
and sees a '`', this must be the closing grave: a TemplateLiteralEnd
token is emitted.

The state required to correctly parse template strings consists of a
vector (for nesting) of two pieces of information: whether or not we
are in a template expression (as opposed to a template string); and
the count of the number of unmatched open curly braces we have seen
(only applicable if the Lexer is currently in a template expression).

TODO: Add support for template literal newlines in the JS REPL (this will
cause a syntax error currently):

    > `foo
    > bar`
    'foo
    bar'
2020-05-04 16:46:31 +02:00
Linus Groh
43c1fa9965 LibJS: Implement (no-op) debugger statement 2020-05-01 22:07:13 +02:00
mattco98
80fecc615a LibJS: Add spreading in array literals
Implement the syntax and behavor necessary to support array literals
such as [...[1, 2, 3]]. A type error is thrown if the target of the
spread operator does not evaluate to an array (though it should
eventually just check for an iterable).

Note that the spread token's name is TripleDot, since the '...' token is
used for two features: spread and rest. Calling it anything involving
'spread' or 'rest' would be a bit confusing.
2020-04-27 11:32:18 +02:00
Linus Groh
95b51e857d LibJS: Add TokenType::TemplateLiteral
This is required for template literals - we're not quite there yet, but at
least the parser can now tell us when this token is encountered -
currently this yields "Unexpected token Invalid". Not really helpful.

The character is a "backtick", but as we already have
TokenType::{StringLiteral,RegexLiteral} this seemed like a fitting name.

This also enables syntax highlighting for template literals in the js
REPL and LibGUI's JSSyntaxHighlighter.
2020-04-24 11:18:57 +02:00
Stephan Unverwerth
bf5b251684 LibJS: Allow reserved words as keys in object expressions. 2020-04-18 22:23:20 +02:00
Stephan Unverwerth
f8f65053bd LibJS: Parse "this" as ThisExpression 2020-04-13 00:45:25 +02:00
Brian Gianforcaro
dd112421b4 LibJS: Plumb line and column information through Lexer / Parser
While debugging test failures, it's pretty frustrating to have to go do
printf debugging to figure out what test is failing right now. While
watching your JS Raytracer stream it seemed like this was pretty
furstrating as well. So I wanted to start working on improving the
diagnostics here.

In the future I hope we can eventually be able to plumb the info down
to the Error classes so any thrown exceptions will contain enough
metadata to know where they came from.
2020-04-05 12:43:39 +02:00
Andreas Kling
9ebd066ac8 LibJS: Add support for "continue" inside "for" statements :^) 2020-04-05 00:22:42 +02:00
Linus Groh
2636cac6e4 LibJS: Remove UndefinedLiteral, add undefined to global object
There is no such thing as a "undefined literal" in JS - undefined is
just a property on the global object with a value of undefined.
This is pretty similar to NaN.

var undefined = "foo"; is a perfectly fine AssignmentExpression :^)
2020-04-03 00:10:52 +02:00
Jack Karamanian
098f1cd0ca LibJS: Add support for arrow functions 2020-03-30 15:41:36 +02:00
Andreas Kling
a1c718e387 LibJS: Use some macro magic to avoid duplicating all the token types 2020-03-30 13:11:07 +02:00
Andreas Kling
1923051c5b LibJS: Lexer and parser support for "switch" statements 2020-03-29 15:03:58 +02:00
Andreas Kling
faddf3a1db LibJS: Implement "throw"
You can now throw an expression to the nearest catcher! :^)

To support throwing arbitrary values, I added an Exception class that
sits as a wrapper around whatever is thrown. In the future it will be
a logical place to store a call stack.
2020-03-24 22:21:58 +01:00
0xtechnobabble
bc002f807a LibJS: Parse object expressions 2020-03-21 10:08:58 +01:00
0xtechnobabble
cfd710eb31 LibJS: Implement null and undefined literals 2020-03-16 13:42:13 +01:00
Stephan Unverwerth
c0e6234219 LibJS: Lex single quote strings, escaped chars and unterminated strings 2020-03-14 12:13:53 +01:00
Stephan Unverwerth
15d5b2d29e LibJS: Add operator precedence parsing
Obey precedence and associativity rules when parsing expressions
with chained operators.
2020-03-14 00:11:24 +01:00
Conrad Pankoff
097e1af4e8 LibJS: Implement for statement 2020-03-12 13:42:23 +01:00
Conrad Pankoff
e88f2f15ee LibJS: Parse === and !== binary operators 2020-03-12 13:42:23 +01:00
Conrad Pankoff
0fe87c5fec LibJS: Implement <= and >= binary operators 2020-03-12 13:42:23 +01:00
Stephan Unverwerth
f3a9eba987 LibJS: Add Javascript lexer and parser
This adds a basic Javascript lexer and parser. It can parse the
currently existing demo programs. More work needs to be done to
turn it into a complete parser than can parse arbitrary JS Code.

The lexer outputs tokens with preceeding whitespace and comments
in the trivia member. This should allow us to generate the exact
source code by concatenating the generated tokens.

The parser is written in a way that it always returns a complete
syntax tree. Error conditions are represented as nodes in the
tree. This simplifies the code and allows it to be used as an
early stage parser, e.g for parsing JS documents in an IDE while
editing the source code.:
2020-03-12 09:25:49 +01:00