This change speeds up the process of checking variable declaration
within the scope by replacing the iteration through all declared
variables (without the ability to exit early from the loop) with hash
map lookups.
This change cuts google maps loading by 17s on my computer.
Previously, the usage of local variables was limited for all function
declarations. This change relaxes the restriction and only prohibits
locals for hoistable annexB declarations.
Initially, the usage of local variables for parameters had to be
disabled when default values were used because there was a bug that
didn't allow to correctly find the scope to which identifiers used
within the default parameter expression belonged, and hence correctly
identify if a variable can be local. However, this bug was fixed in
2f85faef0f, so now this restriction
is no longer needed.
This change fixes an issue where identifiers used in default function
parameters were being "registered" in the function's parent scope
instead of its own scope. This bug resulted in incorrectly detected
local variables. (Variables used in the default function parameter
expression should be considered 'captured by nested function'.)
To resolve this issue, the function scope is now created before parsing
function parameters. Since function parameters can no longer be passed
in the constructor, a setter function has been introduced to set them
later, when they are ready.
Using local variables to store function parameters makes Kraken tests
run 7-10% faster.
For now this optimization is limited to only be applied if:
- Parameter does not use destructuring assignment
- None of the function params has default value
- There is no access to "arguments" variable inside function body
This modification enables the parser to determine whether an identifier
used within a function refers to a local variable or not. In this
context, a local identifier means that it is not captured by any nested
function declaration which means it modified only from inside a
function.
The information about whether identifier is local is stored inside
Identifier AST node and also contains information about the index of
local variable inside a function and information about total number
of local variables used by a function is stored in function nodes.
By using Identifier class to represent the name of a class expression,
it becomes possible to consistently store information within the
identifier object, indicating whether the name refers to a local
variable or not.
The flag indicating the presence of an await expression should be
passed up to the parent scope until the nearest function scope is
reached. This resolves several problems related to identifying
top-level awaits, which are currently not recognized correctly
when used within a nested scope.
Previously we were unable to parse code like `yield/2` because `/2`
was parsed as a regex. At the same time `for (a in / b/)` was parsed
as a division.
This is solved by defaulting to division in the lexer, but calling
`force_slash_as_regex()` from the parser whenever an IdentifierName
is parsed as a ReservedWord.
Instead of passing the continuously merged initial forbidden token set
(with the new additional forbidden tokens from each parsed secondary
expression) to the next call of parse_secondary_expression(), keep a
copy of the original set and use it as the base for parsing the next
secondary expression.
This bug prevented us from properly parsing the following expression:
```js
0 ?? 0 ? 0 : 0 || 0
```
...due to LogicalExpression with LogicalOp::NullishCoalescing returning
both DoubleAmpersand and DoublePipe in its forbidden token set.
The following correct AST is now generated:
Program
(Children)
ExpressionStatement
ConditionalExpression
(Test)
LogicalExpression
NumericLiteral 0
??
NumericLiteral 0
(Consequent)
NumericLiteral 0
(Alternate)
LogicalExpression
NumericLiteral 0
||
NumericLiteral 0
An alternate solution I explored was only merging the original forbidden
token set with the one of the last parsed secondary expression which is
then passed to match_secondary_expression(); however that led to an
incorrect AST (note the alternate expression):
Program
(Children)
ExpressionStatement
LogicalExpression
ConditionalExpression
(Test)
LogicalExpression
NumericLiteral 0
??
NumericLiteral 0
(Consequent)
NumericLiteral 0
(Alternate)
NumericLiteral 0
||
NumericLiteral 0
Truth be told, I don't know enough about the inner workings of the
parser to fully explain the difference. AFAICT this patch has no
unintended side effects in its current form though.
Fixes#18087.
This class had slightly confusing semantics and the added weirdness
doesn't seem worth it just so we can say "." instead of "->" when
iterating over a vector of NNRPs.
This patch replaces NonnullRefPtrVector<T> with Vector<NNRP<T>>.
In this patch only top level and not the more complicated for loop using
statements are supported. Also, as noted in the latest meeting of tc39
async parts of the spec are not stage 3 thus not included.
DeprecatedFlyString relies heavily on DeprecatedString's StringImpl, so
let's rename it to A) match the name of DeprecatedString, B) write a new
FlyString class that is tied to String.
Vectors that stick around in the AST were wasting a fair bit of memory
due to the growth padding we keep by default. This patch goes after some
of these vectors with the shrink_to_fit() stick to reduce waste.
Since the AST can stay around for a long time, it is worth making an
effort to shrink it down when we have a chance.
Instead of CallExpression storing its arguments in a Vector<Argument>,
we now custom-allocate the memory slot for CallExpression (and its
subclass NewExpression) so that it fits both CallExpression and its list
of Arguments in one allocation.
This reduces memory usage on twitter.com/awesomekling by 8.8 MiB :^)
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.
One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
Although not quite like the spec says the web reality is that a lhs
target of CallExpression should not give a SyntaxError but only a
ReferenceError once executed.
This was state only used by the parser to output an error with
appropriate location. This shrinks the size of ObjectExpression from
120 bytes down to just 56. This saves roughly 2.5 MiB when loading
twitter.
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
Before this could give the `must be followed by in` error before the
undeclared private identifier error. Fixing the `in` error would not
have resolved the other error so this order makes the errors more
actionable.
If we don't check that a private identifier is valid this can break the
assumption that we have a private environment when evaluation the
private identifier. Also an unknown private identifier this should
be a SyntaxError.
This requires a special case with names as the default function is
supposed to have a unique name ("*default*" in our case) but when
checked should have name "default".
This is an export which looks like `export {} from "module"`, and
although it doesn't have any real export entries it should still add
"module" to the required modules to load.
We already did this but it called the @@iterator method of
%Array.prototype% visible to the user for example by overriding that
method. This should not be visible so we use a special version of
SuperCall now.
Since tagged template literals can inspect the raw string it is not a
syntax error to have invalid escapes. However the cooked value should be
`undefined`.
We accomplish this by tracking whether parse_string_literal
fails and then using a NullLiteral (since UndefinedLiteral is not a
thing) and finally converting null in tagged template execution to
undefined.