This is a continuation of the previous five commits.
A first big step into the direction of no longer having to pass a realm
(or currently, a global object) trough layers upon layers of AOs!
Unlike the create() APIs we can safely assume that this is only ever
called when a running execution context and therefore current realm
exists. If not, you can always manually allocate the Error and put it in
a Completion :^)
In the spec, throw exceptions implicitly use the current realm's
intrinsics as well: https://tc39.es/ecma262/#sec-throw-an-exception
This is a continuation of the previous three commits.
Now that create() receives the allocating realm, we can simply forward
that to allocate(), which accounts for the majority of these changes.
Additionally, we can get rid of the realm_from_global_object() in one
place, with one more remaining in VM::throw_completion().
This is a continuation of the previous two commits.
As allocating a JS cell already primarily involves a realm instead of a
global object, and we'll need to pass one to the allocate() function
itself eventually (it's bridged via the global object right now), the
create() functions need to receive a realm as well.
The plan is for this to be the highest-level function that actually
receives a realm and passes it around, AOs on an even higher level will
use the "current realm" concept via VM::current_realm() as that's what
the spec assumes; passing around realms (or global objects, for that
matter) on higher AO levels is pointless and unlike for allocating
individual objects, which may happen outside of regular JS execution, we
don't need control over the specific realm that is being used there.
This is a continuation of the previous commit.
Calling initialize() is the first thing that's done after allocating a
cell on the JS heap - and in the common case of allocating an object,
that's where properties are assigned and intrinsics occasionally
accessed.
Since those are supposed to live on the realm eventually, this is
another step into that direction.
No functional changes - we can still very easily get to the global
object via `Realm::global_object()`. This is in preparation of moving
the intrinsics to the realm and no longer having to pass a global
object when allocating any object.
In a few (now, and many more in subsequent commits) places we get a
realm using `GlobalObject::associated_realm()`, this is intended to be
temporary. For example, create() functions will later receive the same
treatment and are passed a realm instead of a global object.
The recently added generate-emoji-txt.sh script uses a Bash 4
substitution feature, causing CI to fail as macOS ships an ancient Bash
3.x. We'll want to use a more recent version anyway, so let's do that
instead of updading the script to older syntax.
The current emoji_txt.cmake does not handle download errors (which were
a common source of issues in the build problems channel) or Unicode
versioning. These are both handled by unicode_data.cmake. Move the
download to unicode_data.cmake so that we can more easily handle next
month's Unicode 15 release.
Instead of manually updating emoji.txt whenever new emoji are added,
we use Unicode's emoji-test.txt to generate emoji.txt on each build,
including only the emojis that Serenity supports at that time.
By using emoji-test.txt, we can also include all forms of each emoji
(fully-qualified, minimally-qualified, and unqualified) which can be
helpful when double-checking how certain forms are handled.
Verifies that emoji filenames:
- Contain only uppercase letters, numbers, +, and _
- Use _ and a separator between codepoints, not +
- Do not include the U+FE0F emoji presentation specifier
Similar to commit becec35, our code point display name data was a large
list of StringViews. RLE can be used here as well to remove about 32 MB
from the initialized data section to the read-only section.
Some of the refactoring to store strings as indices into an RLE array
also lets us clean up some of the code point name generators.
Currently, the unique string lists are stored in the initialized data
sections of their shared libraries. In order to move the data to the
read-only section, generate the strings using RLE arrays.
We generate two arrays: the first is the RLE data itself, the second is
a list of indices into the RLE array for each string. We then generate a
decoding method to convert an RLE string to a StringView.
We are downloading these directly into the build directory now, and
generating the source code from there, so we no longer need the
manually created directory.
While we are at it, remove two variables that seem to be no longer in
use, and at least one of which is confusing regarding a missing prefix.
This cache was disabled in 3127454 because it wasn't needed and there
was a race between the builders for this cache. Then commit 0c95d99
started fuzzing the generated Unicode / TZDB data. Since then, we've
been pulling this data from the live servers instead of Azure's cache.
We do a similar trick for the compiler cache. This allows each builder
to separately push their local data cache (if it changed) while pulling
a shared cache, without the race outlined in commit 3127454. This is
needed for a subsequent commit which will enable this cache for Fuzzer
builds.
This isn't called out in TR-35, but before ICU even looks at CLDR data,
it adds a hard-coded set of default patterns to each locale's calendar.
It has done this since 2006 when its DateTimeFormat feature was first
created. Several test262 tests depend on this, which under ECMA-402,
falls into "implementation defined" behavior. For compatibility, we
can do the same in LibUnicode.
In the generated unique string list, index 0 is the empty string, and is
used to indicate a value doesn't exist in the CLDR. Check for this
before returning an empty calendar symbol.
For example, an upcoming commit will add the fixed day period "noon",
which not all locales support.
Commit ec7d535 only partially handled the case of flexible day periods
rolling over midnight, in that it only worked for hours after midnight.
For example, the en locale defines a day period range of [21:00, 06:00).
The previous method of adding 24 hours to the given hour would change
e.g. 23:00 to 47:00, which isn't valid.
This abstraction layer is mainly for ATA ports (AHCI ports, IDE ports).
The goal is to create a convenient and flexible framework so it's
possible to expand to support other types of controller (e.g. Intel PIIX
and ICH IDE controllers) and to abstract operations that are possible on
each component.
Currently only the ATA IDE code is affected by this, making it much
cleaner and readable - the ATA bus mastering code is moved to the
ATAPort code so more implementations in the near future can take
advantage of such functionality easily.
In addition to that, the hierarchy of the ATA IDE code resembles more of
the SATA AHCI code now, which means the IDEChannel class is solely
responsible for getting interrupts, passing them for further processing
in the ATAPort code to take care of the rest of the handling logic.
This is a start to properly letting us cross-compile Lagom where both
the Tools and the BUILD_LAGOM=ON build are using Lagom CMakeLists.
The initial cut allows an Android build to succeed, more or less.
But there are issues with namespace clashes when using FetchContent with
this approach.
When patterns, grouping digits, symbols, etc. for a requested numbering
system are not found, use the locale's default numbering system. This
will allow using the correct digits e.g. for the locale "en-u-nu-arab"
even though the "en" locale only contains patterns for the "latn"
numbering system.