for_each_cell_among_possible_pointers() was taking HashTable by value
instead of by const reference for no reason.
The copying was soaking up ~4% of CPU time while loading https://x.com/
This change introduces HeapFunction, which is intended to be used as a
replacement for SafeFunction. The new type behaves like a regular
GC-allocated object, which means it needs to be visited from
visit_edges, and unlike SafeFunction, it does not create new roots for
captured parameters.
Co-Authored-By: Andreas Kling <kling@serenityos.org>
This change introduces a very basic GC graph dumper. The `dump_graph()`
function outputs JSON data that contains information about all nodes in
the graph, including their class types and edges.
Root nodes will have a property indicating their root type or source
location if the root is captured by a SafeFunction. It would be useful
to add source location for other types of roots in the future.
Output JSON dump have following format:
```json
"4908721208": {
"class_name": "Accessor",
"edges": [
"4909298232",
"4909297976"
]
},
"4907520440": {
"root": "SafeFunction Optional Optional.h:137",
"class_name": "Realm",
"edges": [
"4908269624",
"4924821560",
"4908409240",
"4908483960",
"4924527672"
]
},
"4908251320": {
"class_name": "CSSStyleRule",
"edges": [
"4908302648",
"4925101656",
"4908251192"
]
},
```
This patch triggers the collector when allocated memory doubles instead
of every 100k allocations. Which can almost half (reduce by ~48%) the
time spent on collection when loading google-maps.
This dynamic approach is inspired by some other GCs like Golang's and
Lua's and improves performance in memory heavy applications because
marking must visit old objects which will dominate the marking phase if
the GC is invoked too often.
This commit also improves the Octane Splay benchmark and almost
doubles it :^)
The Heap::uproot_cell() API was used to implement markAsGarbage() which
was used in 3 tests to forcibly destroy a value, even if it had
references on the stack or elsewhere.
This patch rewrites the 3 tests that used this mechanism to be
structured in a way that allows garbage collection to collect the values
as intended without hacks. And now that the uprooting mechanism is no
longer needed, it's uprooted as well.
This fixes 3 test-js tests in bytecode mode. :^)
Cell::heap() and Cell::vm() needed to access member functions from
HeapBlock, and wanted to be inline, so they were moved to VM.h.
That approach will no longer work with VM.h not being included in every
file (starting from the next commit), so this commit fixes that circular
import issue by introducing secondary base classes to host the
references to Heap and VM, respectively.
Since the relationship between VM and Bytecode::Interpreter is now
clear, we can have VM ask the Interpreter for roots in the GC marking
pass. This avoids having to register and unregister handles and
MarkedVectors over and over.
Since GeneratorObject can also own a RegisterWindow, we share the code
in a RegisterWindow::visit_edges() helper.
~4% speed-up on Kraken/stanford-crypto-ccm.js :^)
This is a similar strategy to what v8 does. Use the ASAN API function
__asan_addr_is_in_fake_stack to check any fake stack frames associated
with each stack address we scan. This fully allows running test-js -g
with the option detect_stack_use_after_return turned on.
That's what this class really is; in fact that's what the first line of
the comment says it is.
This commit does not rename the main files, since those will contain
other time-related classes in a little bit.
We were accidentally skipping over most of the CPU registers by
incrementing the register index by sizeof(FlatPtr) instead of 1.
This fixes a long-standing issue where live objects could still get
garbage-collected if they were only pointed to by an unlucky register.
This will temporarily bloat the size of PrimitiveString as LibJS is
transitioned to use String throughout, but will make doing so piecemeal
much easier.
We don't need to be checking the current time unconditionally when we
only observe the results if we're going to dump the GC stats.
This saves two trips to clock_gettime at the cost of an extra branch.
Doing things in the destructor of a GC-allocated object isn't always
safe, in case it involves accessing other GC-allocated objects.
If they were already swept by GC, we'd be poking into freed memory.
This patch adds a separate finalization pass where GC calls finalize()
on every unmarked cell that's about to be deleted.
It's safe to access other GC objects in finalize(), even if they're
also unmarked.
SafeFunction automatically registers its closure memory area in a place
where the JS garbage collector can find it.
This means that you can capture JS::Value and arbitrary pointers into
the GC heap in closures, as long as you're using a SafeFunction, and the
GC will not zap those values!
There's probably some performance impact from this, and there's a lot of
things that could be nicer/smarter about it, but let's build something
that ensures safety first, and we can worry about performance later. :^)
JS::Value stores 48 bit pointers to separately allocated objects in its
payload. On x86-64, canonical addresses have their top 16 bits set to
the same value as bit 47, effectively meaning that the value has to be
sign-extended to get the pointer. AArch64, however, expects the topmost
bits to be all zeros.
This commit gates sign extension behind `#if ARCH(X86_64)`, and adds an
`#error` for unsupported architectures, so that we do not forget to
think about pointer handling when porting to a new architecture.
Fixes#15290FixesSerenityOS/ladybird#56
This is a continuation of the previous five commits.
A first big step into the direction of no longer having to pass a realm
(or currently, a global object) trough layers upon layers of AOs!
Unlike the create() APIs we can safely assume that this is only ever
called when a running execution context and therefore current realm
exists. If not, you can always manually allocate the Error and put it in
a Completion :^)
In the spec, throw exceptions implicitly use the current realm's
intrinsics as well: https://tc39.es/ecma262/#sec-throw-an-exception
This is a continuation of the previous two commits.
As allocating a JS cell already primarily involves a realm instead of a
global object, and we'll need to pass one to the allocate() function
itself eventually (it's bridged via the global object right now), the
create() functions need to receive a realm as well.
The plan is for this to be the highest-level function that actually
receives a realm and passes it around, AOs on an even higher level will
use the "current realm" concept via VM::current_realm() as that's what
the spec assumes; passing around realms (or global objects, for that
matter) on higher AO levels is pointless and unlike for allocating
individual objects, which may happen outside of regular JS execution, we
don't need control over the specific realm that is being used there.
This is a continuation of the previous commit.
Calling initialize() is the first thing that's done after allocating a
cell on the JS heap - and in the common case of allocating an object,
that's where properties are assigned and intrinsics occasionally
accessed.
Since those are supposed to live on the realm eventually, this is
another step into that direction.
Using the fact that there are 2^52-2 NaN representations we can
"NaN-box" all the Values possible. This means that Value no longer has
an explicit "Type" but that information is now stored in the bits of a
double. This is done by "tagging" the top two bytes of the double.
For a full explanation see the large comment with asserts at the top of
Value.
We can also use the exact representation of the tags to make checking
properties like nullish, or is_cell quicker. But the largest gains are
in the fact that the size of a Value is now halved.
The SunSpider and other benchmarks have been ran to confirm that there
are no regressions in performance compared to the previous
implementation. The tests never performed worse and in some cases
performed better. But the biggest differences can be seen in memory
usage when large arrays are allocated. A simple test which allocates a
1000 arrays of size 100000 has roughly half the memory usage.
There is also space in the representations for future expansions such as
tuples and records.
To ensure that Values on the stack and registers are not lost during
garbage collection we also have to add a check to the Heap to check for
any of the cell tags and extracting the canonical form of the pointer
if it matches.
Note: MarkedVector is still relatively new and has zero users right now,
so these changes don't affect any code other than the class itself.
Reasons for this are the rather limited API:
- Despite the name and unlike MarkedValueList, MarkedVector isn't
actually a Vector, it *wraps* a Vector. This means that plenty of
convenient APIs are unavailable and have to be exported on the class
separately and forwarded to the internal Vector, or need to go through
the exposed Span - both not great options.
- Exposing append(Cell*) and prepend(Cell*) on the base class means that
it was possible to append any Cell type, not just T! All the strong
typing guarantees are basically gone, and MarkedVector doesn't do much
more than casting Cells to the appropriate type through the exposed
Span.
All of this combined means that MarkedVector - in its current form -
doesn't provide much value over MarkedValueList, and that we have to
maintain two separate, yet almost identical classes.
Let's fix this!
The updated MarkedVector steals various concepts from the existing
MarkedValueList, especially the ability to copy. On the other hand, it
remains generic enough to handle both Cell* and Value for T, making
MarkedValueList effectively redundant :^)
Additionally, by inheriting from Vector we get all the current and
future APIs without having to select and expose them separately.
MarkedVectorBase remains and takes care of communicating creation and
destruction of the class to the heap. Visiting the contained values is
handled via a pure virtual method gather_roots(), which is being called
by the Heap's function of the same name; much like the VM has one.
From there, values are added to the roots HashTable if they are cells
for T = Value, and unconditionally for any other T.
As a small additional improvement the template now also takes an
inline_capacity parameter, defaulting to 32, and forwards it to the
Vector template; allowing for possible future optimizations of current
uses of MarkedValueList, which hard-codes it to 32.
This feature had bitrotted somewhat and would trigger errors because
PrimitiveStrings were "destroyed" but because of this mode they were not
removed from the string cache. Even fixing that case running test-js
with the options still failed in more places.
Two of our most frequently allocated objects are Shape (88 bytes)
and DeclarativeEnvironment (80 bytes). Putting these into 128-byte
cells was quite wasteful, so let's add a more suitable allocator
for them.
This abstracts a vector of Cell* with a strongly typed span() accessor
that gives you Span<T*> instead of Span<Cell*>.
It is intended to replace MarkedValueList in situations where you only
need to store pointers to Cell (or an even more specific type of Cell).
The API can definitely be improved, it's just the bare basics for now.
This option is already enabled when building Lagom, so let's enable it
for the main build too. We will no longer be surprised by Lagom Clang
CI builds failing while everything compiles locally.
Furthermore, the stronger `-Wsuggest-override` warning is enabled in
this commit, which enforces the use of the `override` keyword in all
classes, not just those which already have some methods marked as
`override`. This works with both GCC and Clang.
WeakContainers need to look at the Cell::State bits to know if their
weak pointees got swept by garbage collection. So we must do this before
potentially freeing one or more HeapBlocks by notifying the allocator
that a block became empty.
Instead of iterating *all* swept cells when pruning weak containers,
only iterate the cells actually *in* the container.
Also, instead of compiling a list of all swept cells, we can simply
check the Cell::state() flag to know if something should be pruned.
VM now has a string cache which tracks all live PrimitiveStrings and
reuses an existing one if possible. This drastically reduces the number
of GC-allocated strings in many real-word situations.
This patch adds a `-z` option to js and test-js. When run in this mode,
garbage cells are never actually destroyed. We instead keep them around
in a special zombie state.
This allows us to validate that zombies don't get marked in future GC
scans (since there were not supposed to be any more references!) :^)
Cells get notified when they become a zombie (via did_become_zombie())
and this is used by WeakContainer cells to deregister themselves from
the heap.
Make this API take a Span<Cell*> instead of a Vector<Cell*>&.
This is behavior neutral, but stops the API looking like it wants to
do mutable things to the Vector.
This should fix the flaky tests of test-js.
It also fixes the tests when running with the -g flag since the values
will not be garbage collected too soon.
Making userspace provide a global string ID was silly, and made the API
extremely difficult to use correctly in a global profiling context.
Instead, simply make the kernel do the string ID allocation for us.
This also allows us to convert the string storage to a Vector in the
kernel (and an array in the JSON profile data.)