This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).
This commit is auto-generated:
$ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
Meta Ports Ladybird Tests Kernel)
$ perl -pie 's/\bDeprecatedString\b/ByteString/g;
s/deprecated_string/byte_string/g' $xs
$ clang-format --style=file -i \
$(git diff --name-only | grep \.cpp\|\.h)
$ gn format $(git ls-files '*.gn' '*.gni')
This is similar to "unique" shapes, which were removed in commit
3d92c26445.
The key difference is that dictionary shapes don't have a serial number,
but instead have a "cacheable" flag.
Shapes become dictionaries after 64 transitions have occurred, at which
point no further transitions occur.
As long as properties are only added to a dictionary shape, it remains
cacheable. (Since if we've cached the shape pointer in an IC somewhere,
we know the IC is still valid.)
Deleting a property from a dictionary shape causes it to become an
uncacheable dictionary.
Note that deleting a property from a non-dictionary shape still performs
a delete transition.
This fixes an issue on Discord where Object.freeze() would eventually
OOM us, since they add more than 16000 properties to a single object
before freezing it.
It also yields a 15% speedup on Octane/pdfjs.js :^)
We previously had a concept of unique shapes, which meant that they
couldn't be shared between multiple objects.
Object shapes became unique in three situations:
- They were the shape of the global object.
- They had more than 100 properties added to them.
- They had one or more properties deleted from them.
Unfortunately, unique shapes presented an annoying problem for inline
caches, and we added a "unique shape serial number" for being able to
tell that a unique shape had been mutated.
This patch gets rid of the concept of unique shapes, simplifying all
the caching code, since inline caches can now simply perform a shape
check and then we're good.
To make this possible, we now have the concept of delete transitions,
which occur when a property is deleted from a shape.
Note that this patch by itself introduces a performance regression in
some situtations, since we now create a lot more shapes, and marking
their property keys can be very heavy. This will be addressed in a
subsequent patch.
Instead of going through the steps of creating an empty new object,
and adding two properties ("value" and "done") to it, we can pre-bake
a shape object and cache the property offsets.
This makes creating iterator result objects in the runtime much faster.
47% speedup on this microbenchmark:
function go(a) {
for (const p of a) {
}
}
const a = [];
a.length = 1_000_000;
go(a);
This patch adds two macros to declare per-type allocators:
- JS_DECLARE_ALLOCATOR(TypeName)
- JS_DEFINE_ALLOCATOR(TypeName)
When used, they add a type-specific CellAllocator that the Heap will
delegate allocation requests to.
The result of this is that GC objects of the same type always end up
within the same HeapBlock, drastically reducing the ability to perform
type confusion attacks.
It also improves HeapBlock utilization, since each block now has cells
sized exactly to the type used within that block. (Previously we only
had a handful of block sizes available, and most GC allocations ended
up with a large amount of slack in their tails.)
There is a small performance hit from this, but I'm sure we can make
up for it elsewhere.
Note that the old size-based allocators still exist, and we fall back
to them for any type that doesn't have its own CellAllocator.
This patch makes it possible for JS::Object::internal_set() to populate
a CacheablePropertyMetadata, and uses this to implement a basic
monomorphic cache for the most common form of property write access.
There is not need to use SafeFunction because
define_native_function or define_native_accessor will pass callback
forward to NativeFunction that uses HeapFunction to visit it.
This allows us to get rid of property_table_ordered() which was a
heavy-handed way of iterating properties in insertion order by first
copying them to a sorted Vector.
Clients can now simply iterate property_table() directly.
3% speed-up on Kraken/ai-astar.js :^)
Stop worrying about tiny OOMs. Work towards #20449.
While going through these, I also changed the function signature in many
places where returning ThrowCompletionOr<T> is no longer necessary.
This function now takes an optional out parameter for callers who would
like to what kind of property we ended up getting.
This will be used to implement inline caching for property lookups.
Also, to prepare for adding more forms of caching, the out parameter
is a struct CacheablePropertyMetadata rather than just an offset. :^)
Most JS::Objects don't have lazily-allocated intrinsic properties,
so let's avoid doing hash lookups by putting a flag on JS::Object that
tells us whether it's present in s_intrinsics.
Takes CPU time spent in those hash lookups from 1-2.5% to nothing on
various JS heavy pages.
This fixes an issue where private element values were not always
protected from GC. I found two instances where this was happening:
- ECMAScriptFunctionObject did not mark m_private_methods
- ClassDefinitionEvaluation had two Vector<PrivateElement> that were
opaque to the garbage collector, and so if GC occurred while
constructing a class instance, some or all of its private elements
could get incorrectly collected.
Some of these are allocated upon initialization of the intrinsics, and
some lazily, but in neither case the getters actually return a nullptr.
This saves us a whole bunch of pointer dereferences (as NonnullGCPtr has
an `operator T&()`), and also has the interesting side effect of forcing
us to explicitly use the FunctionObject& overload of call(), as passing
a NonnullGCPtr is ambigous - it could implicitly be turned into a Value
_or_ a FunctionObject& (so we have to dereference manually).
This doesn't return a completion in the spec as it doesn't need to
propagate any errors. It's also unused right now, which is probably why
no one noticed.
The iterator used to find an intrinsic accessor is used after calling
`HashMap.remove()` on it, which works for our current implementation but
will fall apart when you consider that modifications to the hash map
might invalidate all existing iterators that came from it, as many
implementations do.
Since we're aiming to replace our `HashTable` implementation with
something new, let's fix this first :^)
Note that as of this commit, there aren't any such throwers, and the
call site in Heap::allocate will drop exceptions on the floor. This
commit only serves to change the declaration of the overrides, make sure
they return an empty value, and to propagate OOM errors frm their base
initialize invocations.
DeprecatedFlyString relies heavily on DeprecatedString's StringImpl, so
let's rename it to A) match the name of DeprecatedString, B) write a new
FlyString class that is tied to String.
This makes construction of Utf16String fallible in OOM conditions. The
immediate impact is that PrimitiveString must then be fallible as well,
as it may either transcode UTF-8 to UTF-16, or create a UTF-16 string
from ropes.
There are a couple of places where it is very non-trivial to propagate
the error further. A FIXME has been added to those locations.
This constructor was easily confused with a copy constructor, and it was
possible to accidentally copy-construct Objects in at least one way that
we dicovered (via generic ThrowCompletionOr construction).
This patch adds a mandatory ConstructWithPrototypeTag parameter to the
constructor to disambiguate it.
Note that js_rope_string() has been folded into this, the old name was
misleading - it would not always create a rope string, only if both
sides are not empty strings. Use a three-argument create() overload
instead.
This makes more sense as an Object method rather than living within the
VM class for no good reason. Most of the other 7.3.xx AOs already work
the same way.
Also add spec comments while we're here.
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.
One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
For performance, it is desirable to defer evaluation of intrinsics that
are stored on the GlobalObject for every created Realm. To support this,
Object now maintains a global storage map to store lambdas that will
return the associated intrinsic when evaluated. Once accessed, the
instrinsic is moved from this global map to normal Object storage.
To prevent this flow from becoming observable, when a deferred intrinsic
is stored, we still place an empty object in the normal Object storage.
This is so we still create the metadata for the object, and in doing so,
can preserve insertion order of the Object storage. Otherwise, this will
be observable by way of Object.getOwnPropertyDescriptors.
We were taking AK::Function and then passing them along to
NativeFunction, which takes a SafeFunction. This works, since
SafeFunction will transparently wrap AK::Function in a CallableWrapper
when assigned, but it was causing us to accumulate thousands of
pointless wrappers around direct function pointers.
By using SafeFunction at every step of the setup call chain, we no
longer create any CallableWrappers for the majority of native functions
in LibJS. Also, the number of heap-registered SafeFunctions in a new
realm goes down from ~5000 to 5. :^)
Intrinsics, i.e. mostly constructor and prototype objects, but also
things like empty and new object shape now live on a new heap-allocated
JS::Intrinsics object, thus completing the long journey of taking all
the magic away from the global object.
This represents the Realm's [[Intrinsics]] slot in the spec and matches
its existing [[GlobalObject]] / [[GlobalEnv]] slots in terms of
architecture.
In the majority of cases it should now be possibly to fully allocate a
regular object without the global object existing, and in fact that's
what we do now - the realm is allocated before the global object, and
the intrinsics between both :^)
- Prefer VM::current_realm() over GlobalObject::associated_realm()
- Prefer VM::heap() over GlobalObject::heap()
- Prefer Cell::vm() over Cell::global_object()
- Prefer Wrapper::vm() over Wrapper::global_object()
- Inline Realm::global_object() calls used to access intrinsics as they
will later perform a direct lookup without going through the global
object
This is needed so that the allocated NativeFunction receives the correct
realm, usually forwarded from the Object's initialize() function, rather
than using the current realm.
This is a continuation of the previous five commits.
A first big step into the direction of no longer having to pass a realm
(or currently, a global object) trough layers upon layers of AOs!
Unlike the create() APIs we can safely assume that this is only ever
called when a running execution context and therefore current realm
exists. If not, you can always manually allocate the Error and put it in
a Completion :^)
In the spec, throw exceptions implicitly use the current realm's
intrinsics as well: https://tc39.es/ecma262/#sec-throw-an-exception
This is a continuation of the previous three commits.
Now that create() receives the allocating realm, we can simply forward
that to allocate(), which accounts for the majority of these changes.
Additionally, we can get rid of the realm_from_global_object() in one
place, with one more remaining in VM::throw_completion().