212 commits

Author SHA1 Message Date
Aliaksandr Kalenik
2f85faef0f LibJS: Fix scope detection for ids in default function params
This change fixes an issue where identifiers used in default function
parameters were being "registered" in the function's parent scope
instead of its own scope. This bug resulted in incorrectly detected
local variables. (Variables used in the default function parameter
expression should be considered 'captured by nested function'.)

To resolve this issue, the function scope is now created before parsing
function parameters. Since function parameters can no longer be passed
in the constructor, a setter function has been introduced to set them
later, when they are ready.
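
As a hedged illustration of the scoping issue (plain JavaScript, not the parser change itself; `outer` and `inner` are made-up names): `x` below is referenced only from `inner`'s default parameter expression. That is a use from inside a nested function, so `outer` cannot treat `x` as a purely local variable.

```js
// Illustrative only: `outer` and `inner` are hypothetical names.
function outer() {
  let x = 1;
  // `x` is read inside inner's parameter list, i.e. from a nested function,
  // so from outer's point of view it is captured.
  function inner(a = x) {
    return a;
  }
  x = 2;
  // The default is evaluated at call time, so it sees the updated value.
  return inner();
}

const result = outer(); // 2
```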
2023-07-08 14:03:12 +02:00
Aliaksandr Kalenik
b1af91d8c4 LibJS: Use local variables to store function parameters in some cases
Using local variables to store function parameters makes Kraken tests
run 7-10% faster.

For now, this optimization is only applied if:
- The parameter does not use destructuring assignment
- None of the function's parameters has a default value
- The function body does not access the "arguments" variable
2023-07-07 19:35:08 +02:00
Aliaksandr Kalenik
2e81cc4cf7 LibJS: Use Identifier to represent FunctionParameter name
Using an Identifier instead of a string makes it possible to store
supplemental information about whether the name can be represented as a
local variable.
2023-07-07 19:35:08 +02:00
Aliaksandr Kalenik
01910bca39 LibJS: Null check current scope pusher before register_identifier call
This fixes a crash when the current scope pusher is null during
identifier parsing.
2023-07-05 21:21:09 +02:00
Aliaksandr Kalenik
380abddf3c LibJS: Update parser to detect if identifier refer a "local" variable
This modification enables the parser to determine whether an identifier
used within a function refers to a local variable or not. In this
context, a local identifier is one that is not captured by any nested
function declaration, which means it is only modified from inside the
function.

Whether an identifier is local is stored in the Identifier AST node,
together with the index of the local variable within its function. The
total number of local variables used by a function is stored in the
function node.
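
A small sketch of the distinction, assuming nothing beyond ordinary JavaScript semantics (the function names are hypothetical and not taken from the LibJS sources):

```js
// `total` and `i` are only touched by sumTo itself, so they are "local"
// in the sense used by this commit.
function sumTo(limit) {
  let total = 0;
  for (let i = 1; i <= limit; i++) {
    total += i;
  }
  return total;
}

// `count` is modified by the nested arrow function, so it is captured
// and cannot be treated as a plain local.
function countCalls() {
  let count = 0;
  const bump = () => {
    count += 1;
  };
  bump();
  bump();
  return count;
}

const localResult = sumTo(3);        // 6
const capturedResult = countCalls(); // 2
```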
2023-07-05 21:03:01 +02:00
Aliaksandr Kalenik
c734f2b5e6 LibJS: Use Identifier to represent name of ClassExpression
By using Identifier class to represent the name of a class expression,
it becomes possible to consistently store information within the
identifier object, indicating whether the name refers to a local
variable or not.
2023-07-05 21:03:01 +02:00
Aliaksandr Kalenik
75ae368896 LibJS: Propagate "contains await" flag to parent scope in ScopePusher
The flag indicating the presence of an await expression should be
passed up to the parent scope until the nearest function scope is
reached. This resolves several problems related to identifying
top-level awaits, which are currently not recognized correctly
when used within a nested scope.
2023-07-05 06:05:22 +02:00
Malik Ammar Faisal
5c913d9cc4 LibJS: Correctly handle parentheses and new Object
Parses `new Object()?.foo`, `(new Object)?.foo`
and shows syntax error on `new Object?.foo`
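
The three cases can be checked in any engine that follows the specification grammar. The sketch below uses a helper named `parses`, which is invented for this example and relies only on the standard `Function` constructor:

```js
// Returns true if `source` parses as a function body, false on a SyntaxError.
// `parses` is a hypothetical helper, not part of LibJS.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

const withArguments = parses("new Object()?.foo"); // accepted
const parenthesized = parses("(new Object)?.foo"); // accepted
const bareNew = parses("new Object?.foo");         // rejected
```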
2023-06-17 20:01:38 +02:00
Simon Wanner
a2efecac03 LibJS: Parse slashes after reserved identifiers correctly
Previously we were unable to parse code like `yield/2` because `/2`
was parsed as a regex. At the same time `for (a in / b/)` was parsed
as a division.

This is solved by defaulting to division in the lexer, but calling
`force_slash_as_regex()` from the parser whenever an IdentifierName
is parsed as a ReservedWord.
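
Both inputs from the message can be exercised as follows. This is a sketch that assumes a standard engine; the `Function` constructor is used because it always produces sloppy-mode code, where `yield` is an ordinary identifier outside generators:

```js
// Here the slash after `yield` must be a division.
const half = new Function("var yield = 4; return yield/2;")();

// Here the slash after `in` starts the regex literal `/ b/`.
// for-in over a RegExp visits no enumerable properties, so the body never runs.
const iterations = new Function(
  "var count = 0; for (var a in / b/) { count++; } return count;"
)();
```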
2023-06-10 07:20:33 +02:00
Linus Groh
3709d11212 LibJS: Parse secondary expressions with the original forbidden token set
Instead of passing the continuously merged initial forbidden token set
(with the new additional forbidden tokens from each parsed secondary
expression) to the next call of parse_secondary_expression(), keep a
copy of the original set and use it as the base for parsing the next
secondary expression.

This bug prevented us from properly parsing the following expression:

```js
0 ?? 0 ? 0 : 0 || 0
```

...due to LogicalExpression with LogicalOp::NullishCoalescing returning
both DoubleAmpersand and DoublePipe in its forbidden token set.

The following correct AST is now generated:

Program
  (Children)
    ExpressionStatement
      ConditionalExpression
        (Test)
          LogicalExpression
            NumericLiteral 0
            ??
            NumericLiteral 0
        (Consequent)
          NumericLiteral 0
        (Alternate)
          LogicalExpression
            NumericLiteral 0
            ||
            NumericLiteral 0

An alternate solution I explored was only merging the original forbidden
token set with the one of the last parsed secondary expression which is
then passed to match_secondary_expression(); however that led to an
incorrect AST (note the alternate expression):

Program
  (Children)
    ExpressionStatement
      LogicalExpression
        ConditionalExpression
          (Test)
            LogicalExpression
              NumericLiteral 0
              ??
              NumericLiteral 0
          (Consequent)
            NumericLiteral 0
          (Alternate)
            NumericLiteral 0
        ||
        NumericLiteral 0

Truth be told, I don't know enough about the inner workings of the
parser to fully explain the difference. AFAICT this patch has no
unintended side effects in its current form though.

Fixes #18087.
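
To see why the two ASTs are not equivalent, the same expression shape can be evaluated with operands chosen so that the groupings disagree. This is an illustration in plain JavaScript with made-up variable names, not a test from the repository:

```js
// Same shape as `0 ?? 0 ? 0 : 0 || 0`, with distinguishable operands.
const value = 1 ?? 0 ? 0 : 0 || 2;

// Grouping from the correct AST: (1 ?? 0) ? 0 : (0 || 2)
const correctGrouping = (1 ?? 0) ? 0 : (0 || 2);

// Grouping from the incorrect AST: ((1 ?? 0) ? 0 : 0) || 2
const incorrectGrouping = ((1 ?? 0) ? 0 : 0) || 2;
```

Here `value` and `correctGrouping` are both 0, while `incorrectGrouping` is 2.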
2023-04-02 06:45:37 +02:00
Andreas Kling
8a48246ed1 Everywhere: Stop using NonnullRefPtrVector
This class had slightly confusing semantics and the added weirdness
doesn't seem worth it just so we can say "." instead of "->" when
iterating over a vector of NNRPs.

This patch replaces NonnullRefPtrVector<T> with Vector<NNRP<T>>.
2023-03-06 23:46:35 +01:00
Luke Wilde
f4be95af69 LibJS: Don't discard ThrowCompletionOr<void> from declaration iteration
2023-02-27 23:57:08 +00:00
Andreas Kling
bd5d8e9d35 LibJS: Make RefPtr and NonnullRefPtr usage const-correct
This mainly affected the AST, which is now const throughout.
2023-02-21 00:54:04 +01:00
Evan Smal
3226ce3d83 LibJS: Remove some usage of DeprecatedString usage from Lexer
This changes the filename member from DeprecatedString to String. Parser
has also been updated to meet the updated Lexer interface.
2023-01-26 20:25:25 +00:00
Evan Smal
cfa6b4d815 LibJS: Remove DeprecatedString usage from Token
2023-01-26 20:25:25 +00:00
Evan Smal
93674e4383 LibJS: Remove DeprecatedString usage from SourceCode
This change also requires updates to some users of the SourceCode
interface since it no longer uses DeprecatedString.
2023-01-26 20:25:25 +00:00
davidot
bff038411a LibJS: Add using declaration support in for and for of loops
Using declarations have somewhat special behavior in for loops, so
this is handled separately.
2023-01-23 09:56:50 +00:00
davidot
541637e15a LibJS: Add using declaration support, RAII like operation in js
In this patch, only top-level using statements are supported, not the
more complicated for loop ones. Also, as noted in the latest TC39
meeting, the async parts of the spec are not stage 3 and are thus not
included.
2023-01-23 09:56:50 +00:00
Timothy Flynn
f3db548a3d AK+Everywhere: Rename FlyString to DeprecatedFlyString
DeprecatedFlyString relies heavily on DeprecatedString's StringImpl, so
let's rename it so that we A) match the name of DeprecatedString and
B) can write a new FlyString class that is tied to String.
2023-01-09 23:00:24 +00:00
davidot
2bbea62176 LibJS: Don't update names of resulting functions in object expression
The only case where the name should be set is when the function comes
from a direct anonymous function expression.
2022-12-14 15:27:08 +00:00
Andreas Kling
9721da2e6a LibJS: Call shrink_to_fit() on various Vectors created during parse
Vectors that stick around in the AST were wasting a fair bit of memory
due to the growth padding we keep by default. This patch goes after some
of these vectors with the shrink_to_fit() stick to reduce waste.

Since the AST can stay around for a long time, it is worth making an
effort to shrink it down when we have a chance.
2022-12-08 23:36:17 +00:00
Andreas Kling
b894acd6b2 LibJS: Make one compact allocation for CallExpression and its Arguments
Instead of CallExpression storing its arguments in a Vector<Argument>,
we now custom-allocate the memory slot for CallExpression (and its
subclass NewExpression) so that it fits both CallExpression and its list
of Arguments in one allocation.

This reduces memory usage on twitter.com/awesomekling by 8.8 MiB :^)
2022-12-08 23:36:17 +00:00
Linus Groh
57dc179b1f Everywhere: Rename to_{string => deprecated_string}() where applicable
This will make it easier to support both string types at the same time
while we convert code, and to track down remaining uses.

One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
2022-12-06 08:54:33 +01:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
davidot
d218a68296 LibJS: Allow CallExpressions as lhs of assignments in most cases
Although this is not quite what the spec says, the web reality is that
a CallExpression as the lhs target of an assignment should not give a
SyntaxError, only a ReferenceError once executed.
2022-11-30 08:05:37 +01:00
davidot
2c26ee89ac LibJS: Remove m_first_invalid_property_range from ObjectExpression
This was state only used by the parser to output an error with
appropriate location. This shrinks the size of ObjectExpression from
120 bytes down to just 56. This saves roughly 2.5 MiB when loading
twitter.
2022-11-27 12:31:37 +01:00
davidot
3acbd96851 LibJS: Remove is_use_strict_directive for all StringLiterals
This value was only used in the parser so no need to have this in every
string literal in the ast.
2022-11-27 12:31:37 +01:00
Andreas Kling
d16fab5815 LibJS: Avoid unnecessary SourceRange construction in parse_program()
This takes `test-js` runtime from 4.3 sec to 4.1 sec on my machine.
2022-11-24 16:06:20 +00:00
Andreas Kling
835d7aac96 LibJS: Make FunctionNode::Parameter be a standalone FunctionParameter
This will allow us to forward declare it and avoid including AST.h in a
number of places.
2022-11-23 16:05:59 +00:00
Andreas Kling
e0916dbb35 LibJS: Move {Import,Export}Entry out of {Import,Export}Statement
By making these be standalone instead of nested structs, we can forward
declare them. This will allow us to stop including AST.h in some places.
2022-11-23 16:05:59 +00:00
Andreas Kling
76f438eb3e LibJS: Remove unused "lexical argument index" metadata from Identifier
This shrinks Identifier by 16 bytes. :^)
2022-11-22 21:13:35 +01:00
Andreas Kling
b0b022507b LibJS: Reduce AST memory usage by shrink-wrapping source range info
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:

    filename:       StringView (16 bytes)
    start:          Position (24 bytes)
    end:            Position (24 bytes)

The Position structs have { line, column, offset }, all members size_t.

To reduce memory consumption, AST nodes now only store the following:

    source_code:    NonnullRefPtr<SourceCode> (8 bytes)
    start_offset:   u32 (4 bytes)
    end_offset:     u32 (4 bytes)

SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.

The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.

With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
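
The offset-to-position conversion described above can be sketched as follows. This is not the C++ code behind SourceCode::range_from_offsets(); `lineColumnFromOffset` is a hypothetical helper, and the 1-based line and column numbering is an assumption of this sketch:

```js
// Walks the source text up to `offset`, counting newlines on the way.
// Cost is linear in `offset`, which is acceptable if lookups are rare.
function lineColumnFromOffset(source, offset) {
  let line = 1;
  let column = 1;
  const end = Math.min(offset, source.length);
  for (let i = 0; i < end; i++) {
    if (source[i] === "\n") {
      line += 1;
      column = 1;
    } else {
      column += 1;
    }
  }
  return { line, column };
}

const source = "let a = 1;\nlet b = a + 1;\n";
// Position of the first "b", which is the fifth character on the second line.
const position = lineColumnFromOffset(source, source.indexOf("b"));
```

Storing only two 32-bit offsets per node and paying this linear walk on demand is the trade-off the commit describes: less memory for every node, more work for the rare diagnostic.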
2022-11-22 21:13:35 +01:00
davidot
73fcbbb0ee LibJS: Give the undeclared private identifier error more precedence
Previously, this could give the `must be followed by in` error before
the undeclared private identifier error. Fixing the `in` error would not
have resolved the other error, so this order makes the errors more
actionable.
2022-11-17 16:05:20 +00:00
davidot
16ac43c9d4 LibJS: Make sure private identifier is valid in optional chain
If we don't check that a private identifier is valid, this can break the
assumption that we have a private environment when evaluating the
private identifier. Also, an unknown private identifier should be a
SyntaxError.
2022-11-17 16:05:20 +00:00
davidot
5ca6e8dca8 LibJS: No longer hoist if parent scope has a function with the same name
2022-11-17 16:05:20 +00:00
davidot
67865306d3 LibJS: Fix that functions in module did not look for var declarations
2022-11-15 12:00:36 +00:00
davidot
9f661d20f7 LibJS: Allow anonymous functions as default exports
This requires a special case with names as the default function is
supposed to have a unique name ("*default*" in our case) but when
checked should have name "default".
2022-09-02 02:07:37 +01:00
davidot
faf1430ce4 LibJS: Allow exporting any imported bindings
2022-09-02 02:07:37 +01:00
davidot
3b1c3e574f LibJS: Handle empty named export
This is an export which looks like `export {} from "module"`, and
although it doesn't have any real export entries it should still add
"module" to the required modules to load.
2022-09-02 02:07:37 +01:00
davidot
f75c51b097 LibJS: Allow full ModuleExportName in namespace
This means we should accept a string after 'export * as '.
2022-09-02 02:07:37 +01:00
davidot
fce2b33758 LibJS: Allow BigInts as destructuring property names
These are simply treated as their numerical value which means that above
2^32 - 1 they are strings.
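
A minimal example of the syntax this enables, assuming an engine that accepts BigInt literals as property names (the variable names are made up for illustration):

```js
// A BigInt literal used as a property name is treated as its numeric value,
// so `1n` addresses the same property as `1`, i.e. the key "1".
const { 1n: one } = { 1: "one" };

// The same holds for object literals: the resulting key is the string "2".
const key = Object.keys({ 2n: "two" })[0];
```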
2022-08-24 23:27:17 +01:00
davidot
ae349ec6a8 LibJS: Use a synthetic constructor if class with parent doesn't have one
We already did this, but it called the @@iterator method of
%Array.prototype%, which is visible to the user, for example by
overriding that method. This should not be visible, so we now use a
special version of SuperCall.
2022-08-20 23:53:55 +01:00
davidot
e5adc51e27 LibJS: Allow invalid string in tagged template literals
Since tagged template literals can inspect the raw string it is not a
syntax error to have invalid escapes. However the cooked value should be
`undefined`.
We accomplish this by tracking whether parse_string_literal
fails and then using a NullLiteral (since UndefinedLiteral is not a
thing) and finally converting null in tagged template execution to
undefined.
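
The observable behavior can be sketched like this (plain JavaScript; `inspect` is a hypothetical tag function, and `\unicode` is used as an example of an invalid escape):

```js
// A tag function receives the cooked strings plus the raw strings.
function inspect(strings) {
  return { cooked: strings[0], raw: strings.raw[0] };
}

// `\u` is not followed by valid hex digits, so the escape is invalid.
// In a tagged template this is not a SyntaxError: the cooked value is
// undefined and the raw text is still available.
const parts = inspect`\unicode`;
```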
2022-08-17 23:56:24 +01:00
Ali Mohammad Pur
f4b26b0cea LibJS: Hook up the 'v' (unicodeSets) RegExp flag
2022-07-20 21:25:59 +01:00
sin-ack
3f3f45580a Everywhere: Add sv suffix to strings relying on StringView(char const*)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).

No functional changes.
2022-07-12 23:11:35 +02:00
Daniel Bertalan
ebac8abc04 LibJS: Explicitly instantiate Parser::parse_function_node
Due to macOS visibility rules, this function did not end up being
exported from liblagom-js.dylib, causing LagomWeb to fail to link.
2022-07-04 21:46:02 +02:00
Linus Groh
5a26a547db LibJS: Update a couple of outdated spec comments
These are editorial changes in the ECMA-262 spec.

See:
- https://github.com/tc39/ecma262/commit/e080a7f
- https://github.com/tc39/ecma262/commit/c5a9094
- https://github.com/tc39/ecma262/commit/5091520
- https://github.com/tc39/ecma262/commit/1c6564b
- https://github.com/tc39/ecma262/commit/e06c80c
2022-05-01 22:47:38 +02:00
Luke Wilde
34f902fb52 LibJS: Add missing steps and spec comments to PerformEval
While adding spec comments to PerformEval, I noticed we were missing
multiple steps.

Namely, these were:
- Checking if the host will allow us to compile the string
  (allowing LibWeb to perform CSP for eval)
- The parser's initial state depending on the environment around us
  on direct eval:
   - Allowing new.target via eval in functions
   - Allowing super calls and super properties via eval in classes
   - Disallowing the use of the arguments object in class field
     initializers at eval's parse time
- Setting ScriptOrModule of eval's execution context

The spec allows us to apply the additional parsing steps in any order.
The method I have gone with is passing in a struct to the parser's
constructor, which overrides the parser's initial state to (dis)allow
the things stated above from the get-go.
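
One of the cases listed above, new.target via direct eval inside a function, can be illustrated as follows. This is a sketch of the language-level behavior only, with a made-up constructor name `Probe`:

```js
function Probe() {
  // Direct eval inherits the surrounding function context,
  // so new.target is allowed in the evaluated source.
  this.target = eval("new.target");
}
const viaNew = new Probe().target === Probe;

// Indirect eval is parsed as global code, where new.target is a SyntaxError.
let indirectRejected = false;
try {
  (0, eval)("new.target");
} catch (error) {
  indirectRejected = error instanceof SyntaxError;
}
```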
2022-04-11 21:23:36 +01:00
Idan Horowitz
086969277e Everywhere: Run clang-format
2022-04-01 21:24:45 +01:00
Idan Horowitz
7ebb421ee9 LibJS: Implement the object literal __proto__ property key special case
2022-03-06 01:38:25 +02:00