This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).
This commit is auto-generated:
$ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
Meta Ports Ladybird Tests Kernel)
$ perl -pie 's/\bDeprecatedString\b/ByteString/g;
s/deprecated_string/byte_string/g' $xs
$ clang-format --style=file -i \
$(git diff --name-only | grep \.cpp\|\.h)
$ gn format $(git ls-files '*.gn' '*.gni')
This required setting things up so that all function objects can plop
a PrimitiveString there instead of an AK string.
This is a step towards making ExecutionContext easier to allocate.
This patch adds two macros to declare per-type allocators:
- JS_DECLARE_ALLOCATOR(TypeName)
- JS_DEFINE_ALLOCATOR(TypeName)
When used, they add a type-specific CellAllocator that the Heap will
delegate allocation requests to.
The result of this is that GC objects of the same type always end up
within the same HeapBlock, drastically reducing the ability to perform
type confusion attacks.
It also improves HeapBlock utilization, since each block now has cells
sized exactly to the type used within that block. (Previously we only
had a handful of block sizes available, and most GC allocations ended
up with a large amount of slack in their tails.)
There is a small performance hit from this, but I'm sure we can make
up for it elsewhere.
Note that the old size-based allocators still exist, and we fall back
to them for any type that doesn't have its own CellAllocator.
We were trying to stringify the stack trace without the last element,
leading to a loop bound of (size_t)(0 - 1) and accessing m_traceback[0]
out-of-bounds.
Instead, just return an empty string in that case.
Fixes#21747
The previous implementation was calling `backtrace()` for every
function call, which is quite slow.
Instead, this implementation provides VM::stack_trace() which unwinds
the native stack, maps it through NativeExecutable::get_source_range
and combines it with source ranges from interpreted call frames.
Previously these handlers duplicated code and used formats that
were different from the one Error.prototype.stack uses.
Now they use the same Error::stack_string function, which accepts
a new parameter for compacting stack traces with repeating frames.
This works by adding source start/end offset to every bytecode
instruction. In the future we can make this more efficient by keeping
a map of bytecode ranges to source ranges in the Executable instead,
but let's just get traces working first.
Co-Authored-By: Andrew Kaster <akaster@serenityos.org>
Stop worrying about tiny OOMs. Work towards #20449.
While going through these, I also changed the function signature in many
places where returning ThrowCompletionOr<T> is no longer necessary.
This loosens the connection to the AST interpreter and will allow us to
generate SourceRanges for the Bytecode interpreter in the future as well
Moves UnrealizedSourceRanges from TracebackFrame to the JS namespace for
this
Made a slight logic error in 95d69fc which meant the dummy range would
be returned even if the source_range_storage contained an actual source
range. This corrects that by resolving the null unrealized range to a
dummy range, and storing that. It then can be treated as a normal source
range.
Previously, source_range() could crash attempting to read from a null
unrealized->source_code pointer. It looks like the previous behaviour
here was to return a dummy source range, so this commit restores that.
With this loading https://github.com/SerenityOS/serenity works again.
Instead of eagerly populating the stack trace with a textual
representation of every call frame, just store the raw source code range
(code, start offset, end offset). From that, we can generate the full
rich backtrace when requested, and save ourselves the trouble otherwise.
This makes test-wasm take ~7 seconds on my machine instead of ~60. :^)
Some of these are allocated upon initialization of the intrinsics, and
some lazily, but in neither case the getters actually return a nullptr.
This saves us a whole bunch of pointer dereferences (as NonnullGCPtr has
an `operator T&()`), and also has the interesting side effect of forcing
us to explicitly use the FunctionObject& overload of call(), as passing
a NonnullGCPtr is ambigous - it could implicitly be turned into a Value
_or_ a FunctionObject& (so we have to dereference manually).
This includes an Error::create overload to create an Error from a UTF-8
StringView. If creating a String from that view fails, the factory will
return an OOM InternalError instead. VM::throw_completion can also make
use of this overload via its perfect forwarding.
Having an alias function that only wraps another one is silly, and
keeping the more obvious name should flush out more uses of deprecated
strings.
No behavior change.
This constructor was easily confused with a copy constructor, and it was
possible to accidentally copy-construct Objects in at least one way that
we dicovered (via generic ThrowCompletionOr construction).
This patch adds a mandatory ConstructWithPrototypeTag parameter to the
constructor to disambiguate it.
Note that js_rope_string() has been folded into this, the old name was
misleading - it would not always create a rope string, only if both
sides are not empty strings. Use a three-argument create() overload
instead.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
Intrinsics, i.e. mostly constructor and prototype objects, but also
things like empty and new object shape now live on a new heap-allocated
JS::Intrinsics object, thus completing the long journey of taking all
the magic away from the global object.
This represents the Realm's [[Intrinsics]] slot in the spec and matches
its existing [[GlobalObject]] / [[GlobalEnv]] slots in terms of
architecture.
In the majority of cases it should now be possibly to fully allocate a
regular object without the global object existing, and in fact that's
what we do now - the realm is allocated before the global object, and
the intrinsics between both :^)
This is a continuation of the previous three commits.
Now that create() receives the allocating realm, we can simply forward
that to allocate(), which accounts for the majority of these changes.
Additionally, we can get rid of the realm_from_global_object() in one
place, with one more remaining in VM::throw_completion().
This is a continuation of the previous two commits.
As allocating a JS cell already primarily involves a realm instead of a
global object, and we'll need to pass one to the allocate() function
itself eventually (it's bridged via the global object right now), the
create() functions need to receive a realm as well.
The plan is for this to be the highest-level function that actually
receives a realm and passes it around, AOs on an even higher level will
use the "current realm" concept via VM::current_realm() as that's what
the spec assumes; passing around realms (or global objects, for that
matter) on higher AO levels is pointless and unlike for allocating
individual objects, which may happen outside of regular JS execution, we
don't need control over the specific realm that is being used there.
This removes all usages of the non-standard define_property helper
method and replaces all it's usages with the specification required
alternative or with define_direct_property where appropriate.
The fact that they *are* subclasses is an implementation detail and
should not be highlighted. The spec calls these NativeErrors, so let's
use that.
Also added a comment explaining *why* they inherit from Error - I was
about to change that :^)
20.5.1.1 Error ( message )
When the Error function is called with argument message, the
following steps are taken:
[...]
3b. Let msgDesc be the PropertyDescriptor {
[[Value]]: msg,
[[Writable]]: true,
[[Enumerable]]: false,
[[Configurable]]: true
}.
3c. Perform ! DefinePropertyOrThrow(O, "message", msgDesc).
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
The previous handling of the name and message properties specifically
was breaking websites that created their own error types and relied on
the error prototype working correctly - not assuming an JS::Error this
object, that is.
The way it works now, and it is supposed to work, is:
- Error.prototype.name and Error.prototype.message just have initial
string values and are no longer getters/setters
- When constructing an error with a message, we create a regular
property on the newly created object, so a lookup of the message
property will either get it from the object directly or go though the
prototype chain
- Internal m_name/m_message properties are no longer needed and removed
This makes printing errors slightly more complicated, as we can no
longer rely on the (safe) internal properties, and cannot trust a
property lookup either - get_without_side_effects() is used to solve
this, it's not perfect but something we can revisit later.
I did some refactoring along the way, there was some really old stuff in
there - accessing vm.call_frame().arguments[0] is not something we (have
to) do anymore :^)
Fixes#6245.