In this patch only top level and not the more complicated for loop using
statements are supported. Also, as noted in the latest meeting of tc39
async parts of the spec are not stage 3 thus not included.
DeprecatedFlyString relies heavily on DeprecatedString's StringImpl, so
let's rename it to A) match the name of DeprecatedString, B) write a new
FlyString class that is tied to String.
Vectors that stick around in the AST were wasting a fair bit of memory
due to the growth padding we keep by default. This patch goes after some
of these vectors with the shrink_to_fit() stick to reduce waste.
Since the AST can stay around for a long time, it is worth making an
effort to shrink it down when we have a chance.
Instead of CallExpression storing its arguments in a Vector<Argument>,
we now custom-allocate the memory slot for CallExpression (and its
subclass NewExpression) so that it fits both CallExpression and its list
of Arguments in one allocation.
This reduces memory usage on twitter.com/awesomekling by 8.8 MiB :^)
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.
One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
Although not quite like the spec says the web reality is that a lhs
target of CallExpression should not give a SyntaxError but only a
ReferenceError once executed.
This was state only used by the parser to output an error with
appropriate location. This shrinks the size of ObjectExpression from
120 bytes down to just 56. This saves roughly 2.5 MiB when loading
twitter.
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
Before this could give the `must be followed by in` error before the
undeclared private identifier error. Fixing the `in` error would not
have resolved the other error so this order makes the errors more
actionable.
If we don't check that a private identifier is valid this can break the
assumption that we have a private environment when evaluation the
private identifier. Also an unknown private identifier this should
be a SyntaxError.
This requires a special case with names as the default function is
supposed to have a unique name ("*default*" in our case) but when
checked should have name "default".
This is an export which looks like `export {} from "module"`, and
although it doesn't have any real export entries it should still add
"module" to the required modules to load.
We already did this but it called the @@iterator method of
%Array.prototype% visible to the user for example by overriding that
method. This should not be visible so we use a special version of
SuperCall now.
Since tagged template literals can inspect the raw string it is not a
syntax error to have invalid escapes. However the cooked value should be
`undefined`.
We accomplish this by tracking whether parse_string_literal
fails and then using a NullLiteral (since UndefinedLiteral is not a
thing) and finally converting null in tagged template execution to
undefined.
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).
No functional changes.
While adding spec comments to PerformEval, I noticed we were missing
multiple steps.
Namely, these were:
- Checking if the host will allow us to compile the string
(allowing LibWeb to perform CSP for eval)
- The parser's initial state depending on the environment around us
on direct eval:
- Allowing new.target via eval in functions
- Allowing super calls and super properties via eval in classes
- Disallowing the use of the arguments object in class field
initializers at eval's parse time
- Setting ScriptOrModule of eval's execution context
The spec allows us to apply the additional parsing steps in any order.
The method I have gone with is passing in a struct to the parser's
constructor, which overrides the parser's initial state to (dis)allow
the things stated above from the get-go.
The same expression is not allowed to contain both the
logical && and || operators, and the coalescing ?? operator.
This patch changes how "forbidden" tokens are handled, using a
finite set instead of an Vector. This supports much more efficient
merging of the forbidden tokens when propagating forward, and
allowing the return of forbidden tokens to parent contexts.
This patch makes check_identifier_name_for_assignment_validity()
take a FlyString instead of a StringView. We then exploit this by
passing FlyString in more places via flystring_value().
This gives a ~1% speedup when parsing the largest Discord JS file.
When parsing identifiers, we ultimately want to sink the token string
into a FlyString anyway, and since Token may have a FlyString already
inside it, this allows us to bypass the costly FlyString(StringView).
This gives a ~3% speedup when parsing the largest Discord JS file.
Everyone who calls this already has a FlyString, so we were doing *way*
more work by pessimizing it to a StringView.
This gives a ~2% speedup when parsing the largest Discord JS file.
This removes a number of vm.exception() checks which are now caught
directly by TRY. Make use of these checks in
{Global, Eval}DeclarationInstantiation and while we're here add spec
comments.
Because we can have arbitrary in- and export names with strings we can
have '*' and '' which means using '*' as an indicating namespace imports
failed / behaved incorrectly for string imports '*'.
We now use more specific types to indicate these special states instead
of these 'magic' string values.
Do note that 'default' is not actually a magic string value but one
specified by the spec. And you can in fact export the default value by
doing: `export { 1 as default }`.
The big changes are:
- Allow strings as Module{Export, Import}Name
- Properly track declarations in default export statements
However, the spec is a little strange in that it allows function and
class declarations without a name in default export statements.
This is quite hard to fully implement without rewriting more of the
parser so for now this behavior is emulated by faking things with
function and class expressions. See the comments in
parse_export_statement for details on the hacks and where it goes wrong.