These work differently from how we validate StyleValues. There, we parse
a StyleValue from the CSS, and then see if it is allowed in the
property. That causes problems when the syntax is ambiguous - for
example, `0` can be a number or a Length.
Here instead, we ask what kinds of value are allowed for a
media-feature, and then only attempt to parse those kinds of value.
This makes the ambiguity problem go away. :^)
Each media-feature in the spec only accepts one type of value, and/or
some identifiers. This makes the switch statements for the type a bit
excessive, but the spec does not *require* that only one type is
allowed, so this is more future-proof.
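To illustrate the idea, here is a minimal standalone sketch (plain C++,
hypothetical names; not the actual LibWeb code):

```cpp
enum class MediaFeatureID { AspectRatio, Width };
enum class MediaFeatureValueKind { Length, Number, Ratio };

// Toy stand-in for the generated "which kinds of value does this
// feature accept" lookup.
constexpr bool media_feature_accepts(MediaFeatureID id, MediaFeatureValueKind kind)
{
    switch (id) {
    case MediaFeatureID::AspectRatio:
        return kind == MediaFeatureValueKind::Ratio;
    case MediaFeatureID::Width:
        return kind == MediaFeatureValueKind::Length;
    }
    return false;
}

// The parser then only attempts the accepted kinds, so an ambiguous
// token like `0` is parsed as a Length for `width` and never as a
// plain number.
```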
This works largely the same as the PropertyID and ValueID generators,
but using LibMain, Core::Stream, and TRY().
Rather than have a MediaFeatureID::Invalid, I decided to return an
Optional. We'll see if that turns out better or not. :^)
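Roughly the shape of the new generator and its lookup; this is a hedged
sketch, since the exact Stream API calls and file names here are
assumptions:

```cpp
// Generator entry point: LibMain + TRY() instead of manual exit codes.
ErrorOr<int> serenity_main(Main::Arguments)
{
    auto file = TRY(Core::Stream::File::open("MediaFeatures.json"sv, Core::Stream::OpenMode::Read));
    // ... parse the JSON and emit the lookup sketched below ...
    return 0;
}

// Emitted lookup: an empty Optional instead of MediaFeatureID::Invalid.
Optional<MediaFeatureID> media_feature_id_from_string(StringView string)
{
    if (string.equals_ignoring_case("width"sv))
        return MediaFeatureID::Width;
    return {};
}
```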
This patch adds NodeIterator (created via Document.createNodeIterator())
which allows you to iterate through all the nodes in a subtree while
filtering with a provided NodeFilter callback along the way.
This first cut implements the full API, but does not yet handle nodes
being removed from the document while referenced by the iterator. That
will be done in a subsequent patch.
This initial version lays down the basic foundation of IDL overload
resolution, but much of it will have to be replaced with the actual IDL
overload resolution algorithms once we start implementing more complex
IDL overloading scenarios.
This allows us to fuzz the generated unicode and timezone database
helpers, and to fuzz things like LibJS using Fuzzilli to get proper
coverage of our unicode handling code.
Update the Azure CI to use the new two-stage build as well, and clean up
some unused CMake options there.
WebSockets got moved from the HTML standard to their own, the new
WebSockets Standard (https://websockets.spec.whatwg.org).
Move the IDL file and implementation into a new WebSockets directory and
C++ namespace accordingly.
The single 4000-line WrapperGenerator.cpp file was proving to be a pain
to hack on, and was filled with spaghetti. Split it into a bunch of
files to lessen the impact of the spaghetti.
Also refactor the whole parser to use a class instead of a giant
function with a million lambdas.
I've attempted to handle the errors gracefully where it was clear and
simple how to do so, but a lot of this was just adding
`release_value_but_fixme_should_propagate_errors()` in places.
I can't imagine how this happened, but it seems we've managed to
conflate the "event listener" and "EventListener" concepts from the DOM
specification in some parts of the code.
We previously had two things:
- DOM::EventListener
- DOM::EventTarget::EventListenerRegistration
DOM::EventListener was roughly the "EventListener" IDL type,
and DOM::EventTarget::EventListenerRegistration was roughly the "event
listener" concept. However, they were used interchangeably (and
incorrectly!) in many places.
After this patch, we now have:
- DOM::IDLEventListener
- DOM::DOMEventListener
DOM::IDLEventListener is the "EventListener" IDL type,
and DOM::DOMEventListener is the "event listener" concept.
This patch also updates the addEventListener() and removeEventListener()
functions to follow the spec more closely, along with the "inner invoke"
function in our EventDispatcher.
We no longer include all the things, so each generated IDL file only
depends on the things it actually needs now.
A possible downside is that all IDL files have to explicitly import
their dependencies.
Note that non-IDL dependencies still remain and are injected into all
generated files; this can be resolved later if desired by allowing IDL
files to import headers.
BCP 47 will be the single source of truth for known calendar and number
system keywords, and their aliases (e.g. "gregory" is an alias for
"gregorian"). Move the generation of available keywords to where we
parse the BCP 47 data, so that hard-coded aliases may be removed from
other generators.
We have a fair amount of hard-coded keywords / aliases that can now be
replaced with real data from BCP 47. As a result, this also changes the
awkward way we were previously generating keys. Before, we were more or
less generating keywords as a CSV list of keys, e.g. for the "nu" key,
we'd generate "latn,arab,grek" (ordered by locale preference). Then at
runtime, we'd split on the comma. We now just generate spans of keywords
directly.
This package was originally meant to be included in CLDR version 40, but
was missed in their release scripts. This has been resolved:
https://unicode-org.atlassian.net/browse/CLDR-15158
Unfortunately, the CLDR was re-released with the same version number. So
to bust the build's CLDR cache, change the "version" used to detect that
we need to redownload the CLDR.
This is no longer needed as BrowsingContextContainer::content_document()
now does the right thing, and HTMLIFrameElement.contentDocument is the
only user of this attribute. Let's not invent our own mechanisms for
things that are important to get right, like same origin comparisons.
The spec version of canonical_numeric_index_string is absurdly complex,
and ends up converting from a string to a number, and then back again
which is both slow and also requires a few allocations and a string
compare.
Instead, this patch moves away from using Values to represent a
canonical index. In most cases, all we need to know is whether a
PropertyKey is an integer between 0 and 2^32-2, which we already
compute when we construct a PropertyKey so the existing is_number()
check is sufficient.
The more expensive case is handling strings containing numbers that
don't roundtrip through string conversion. In most cases these turn
into regular string properties, but for TypedArray access these
property names are not treated as normal named properties.
TypedArrays treat these numeric properties as magic indexes that are
ignored on read and are not stored (but are evaluated) on assignment.
For that reason there's now a mode flag on canonical_numeric_index_string
so that only TypedArrays take the cost of the ToString round trip test.
In order to improve the performance of this path this patch includes
some early returns to avoid conversion in cases where we can quickly
know whether a property can round trip.
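A hedged sketch (plain C++, not the actual LibJS code) of the cheap
integer check that covers most cases:

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Returns the index if `name` is a canonical integer in [0, 2^32 - 2],
// the range PropertyKey already computes at construction time.
std::optional<uint32_t> parse_array_index(std::string_view name)
{
    // Leading zeros never round-trip: "01" -> 1 -> "1".
    if (name.empty() || (name[0] == '0' && name.size() > 1))
        return std::nullopt;
    uint64_t value = 0;
    for (char c : name) {
        if (c < '0' || c > '9')
            return std::nullopt;
        value = value * 10 + static_cast<uint64_t>(c - '0');
        if (value > 0xFFFFFFFEull) // must not exceed 2^32 - 2
            return std::nullopt;
    }
    return static_cast<uint32_t>(value);
}
```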
This adds a generator utility to read an entire file and parse it as a
JSON value. This is heavily used by the CLDR generators. The idea here
is to put the file reading details in the utility so that when we have a
good story for generically reading an entire stream in LibCore, we can
update the generators to use that by only touching this helper.
This also moves the open_file helper to the utility file. It's currently
a lambda redefined in each TZDB/Unicode generator. It used to display
the missing command line flag and other info local to each generator.
After switching to LibMain, it just returns a generic error message, and
is duplicated several times.
This reverts commit 3a184f7841.
This broke a number of test262 tests under "TypedArrayConstructors".
The issue is that the CanonicalNumericIndexString AO should not fail
for inputs like "1.1", despite them not being integral indices.
The spec version of canonical_numeric_index_string is absurdly complex,
and ends up converting from a string to a number, and then back again
which is both slow and also requires a few allocations and a string
compare.
Instead, let's use the logic we already have, as that is much more
efficient.
This improves performance of all non-numeric property names.
This initial implementation stubs out the WorkerGlobalScope,
WorkerLocation and WorkerNavigator classes. It doesn't take into account
all the things that actually need to be passed into the constructors for these
objects, nor the extra abstract operations that need to be performed on
them by the rest of the Browser infrastructure. However, it does create
bindings that compile and link :^)
This adds a CPack configuration to generate a release package for js(1).
Our current CMake requirement is 3.16, which doesn't have a great story
for automatically installing a binary target's library dependencies. If
we eventually require CMake 3.21 or above, we can remove the helper
.cmake file added here in favor of RUNTIME_DEPENDENCIES.
This isn't perfect (especially the global object situation in
activate_event_handler), but I believe it's in a much more complete
state now :^)
This fixes the issue of crashing in prepare_for_ordinary_call with the
`i < m_size` crash, as it now uses the IDL callback functions which
requires the Environment Settings Object. The environment settings
object for the callback is fetched at the time the callback is created,
for example, WrapperGenerator gets the incumbent settings object for
the callback at the time of wrapping. This allows us to remove passing
in ScriptExecutionContext into EventTarget's constructor.
With this, we can now drop ScriptExecutionContext.
Since VM::exception() no longer exists this is now useless. All of these
calls to clear_exception were just to clear the VM state after some
(potentially) failed evaluation and did not use the exception itself.
Prefixes are very much a C thing which we don't need in C++. This commit
moves all GML-related classes in LibGUI into the GUI::GML namespace, a
change somewhat overdue.
The CMakeLists.txt for Lagom contains a few libraries and executables
with X86-specific code. By excluding those libraries, Lagom builds
for macOS on Arm as well. The places are marked FIXME to be removed
once the libraries build for Arm.
Added the call to generate_available_values(), then realized it is the
exact same as the existing, manually written implementation. So let's
use the new utility.
Unlike other BCP47 keywords that we are parsing, these only appear in
the BCP47 XML file itself within the CLDR. The values are very simple
though, so just hard code them until the Unicode org re-releases the
CLDR with BCP47: https://unicode-org.atlassian.net/browse/CLDR-15158
Previously, given a malformed IPC call declaration, where a parameter
does not have a name, the IPCCompiler would spin endlessly while
consuming more and more memory.
This is because it parses the parameter type incorrectly
(it consumes superfluous characters into the parameter type).
An example of such a malformed declaration is:
tokens_info_result(Vector<GUI::AutocompleteProvider::TokenInfo>) =|
As a temporary fix, this adds VERIFY calls that would fail if we're at
EOF when parsing parameter names.
A real solution would be to parse C++ parameter types correctly.
LibCpp's Parser could be used for this task.
Relative-time format patterns are of one of two forms:
* Tensed - refer to the past or the future, e.g. "N years ago" or
"in N years".
* Numbered - refer to a specific numeric value, e.g. "in 1 year"
becomes "next year" and "in 0 years" becomes "this year".
In ECMA-402, tensed and numbered refer to the numeric formatting options
of "always" and "auto", respectively.
This sets up the generator plumbing to create the relative-time data
files. This data could probably be included in the date-time generator,
but that generator is large enough that I'd rather put this tangentially
related data in its own file.
Previously, we were breaking up digits into groups without regard for
the locale's minimumGroupingDigits value in the CLDR. This value is 1 in
most locales, but is 2 in locales such as pl-PL. What this means is that
in those locales, the group separator should only be inserted if the
thousands group has at least 2 digits. So 1000 is formatted as "1,000"
in en-US, but "1000" in pl-PL. And 10000 is "10,000" in en-US and
"10 000" in pl-PL.
These tests are not meant as a replacement for test-js with the -b option
but are meant to test simple cases until that works.
Before this it was very easy to accidentally break bytecode since no
tests were run in bytecode mode. This hopefully makes it easier to spot
such regressions :^).
This just splits up the method to find the active DST rule for a specified
time and time zone. This is to allow re-using the now split-off function
in upcoming commits.
For example, today, America/New_York has the format string "E%sT" and
uses US DST rules. Those rules indicate the %s should be replaced by a
"D" in daylight time and "S" in standard time.
Before this commit, all consume_until overloads aside from the
Predicate one would consume (and ignore) the stop char/string, while
the Predicate overload would not. To keep behaviour consistent, the
other overloads no longer consume the stop char/string either.
This downloads the UEFI's published PNP ID database and generates a
lookup table for use in LibEDID. The lookup table isn't optimized at
all, but this can be easily done at a later point if needed.
This also refactors interpreter creation to follow
InitializeHostDefinedRealm, but I couldn't fit it in the title :^)
This allows us to follow the spec much more closely rather than being
completely ad-hoc with just the parse node instead of having all the
surrounding data such as the realm of the parse node.
The interpreter creation refactor creates the global execution context
once and doesn't take it off the stack. This allows LibWeb to take the
global execution context and manually handle it, following the HTML
spec. The HTML spec calls this the "realm execution context" of the
environment settings object.
It also allows us to specify the globalThis type, as it can be
different from the global object type. For example, on the web, Window
global objects use a WindowProxy global this value to enforce the same
origin policy on operations like [[GetOwnProperty]].
Finally, it allows us to directly call Program::execute in perform_eval
and perform_shadow_realm_eval as this moves
global_declaration_instantiation into Interpreter::run
(ScriptEvaluation) as per the spec.
Note that this doesn't evaluate Source Text Modules yet or refactor
the bytecode interpreter, that's work for future us :^)
This patch was originally built by Luke for the environment settings
object change but was also needed for modules. So I (davidot) have
modified it with the new completion changes and setup for that.
Co-authored-by: davidot <davidot@serenityos.org>
Length and Percentage are different types, and sometimes only one or the
other is allowed in a given CSS property. This is a first step towards
separating them.
Each ZONE entry contains a RULES segment with one of the following:
* A DST rule name, which links the ZONE to a RULE entry holding the
DST rules to apply.
* A static offset to be applied to the STDOFF offset. This implicitly
means that the time zone is in DST during that time frame.
* A "-" string, meaning no offset is applied to the STDOFF offset, and
the time zone is in standard time during that time frame.
This change unfortunately cannot be atomically made without a single
commit changing everything.
Most of the important changes are in LibIPC/Connection.cpp,
LibIPC/ServerConnection.cpp and LibCore/LocalServer.cpp.
The notable changes are:
- IPCCompiler now generates the decode and decode_message functions such
that they take a Core::Stream::LocalSocket instead of the socket fd.
- IPC::Decoder now uses the receive_fd method of LocalSocket instead of
doing system calls directly on the fd.
- IPC::ConnectionBase and related classes now use the Stream API
functions.
- IPC::ServerConnection no longer constructs the socket itself; instead,
a convenience macro, IPC_CLIENT_CONNECTION, is used in place of
C_OBJECT and will generate a static try_create factory function for
the ServerConnection subclass. The subclass is now responsible for
passing the socket constructed in this function to its
ServerConnection base; the socket is passed as the first argument to
the constructor (as a NonnullOwnPtr<Core::Stream::LocalSocket>) before
any other arguments.
- The functionality regarding taking over sockets from SystemServer has
been moved to LibIPC/SystemServerTakeover.cpp. The Core::LocalSocket
implementation of this functionality hasn't been deleted due to my
intention of removing this class in the near future and to reduce
noise on this (already quite noisy) PR.
Currently, the UnicodeLocale generator collects a list of known locales
from the CLDR before processing language display names. For each locale,
the identifier is broken into language, script, and region subtags, and
we create a list of seen languages. When processing display names, we
skip languages we hadn't seen in that first step.
This is insufficient for language display names like "en-GB", which do
not have a locale entry in the CLDR, and thus are skipped. So instead,
create the list of known languages by actually reading through the list
of languages which have a display name.
These patterns indicate how to display locale strings when that locale
contains multiple subtags. For example, "en-US" would be displayed as
"English (United States)".
Note there's a bit of an unfortunate duplication in the calendar enum
generated by UnicodeLocale and the existing enum generated by
UnicodeDateTimeFormat. The former contains every calendar known to the
CLDR, whereas the latter contains the calendars we've actually parsed
for DateTimeFormat (currently only Gregorian). The new enum generated
here can be removed once DateTimeFormat knows about all calendars.
Our generator is currently preferring the DST variant of the time zone
display names over the non-DST variant. LibTimeZone currently does not
have DST support, and operates in a mode that basically assumes DST does
not exist. Swap the display names for now just to be consistent until we
have DST support.
Note we will need to generate both of these variants and select the
appropriate one at runtime once we have DST support.
Now that number systems are generated as an enum, we can generate the
number system data in the order of that enum. This lets us perform
lookups of that data by index instead of a loop of string comparisons.
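A hedged sketch of what that buys us (toy data, hypothetical names):

```cpp
// Number system data generated in the same order as the enum, so lookup
// is a direct index instead of comparing strings in a loop.
enum class NumberSystem { Arab, Latn };

struct NumberSystemData {
    char const* decimal_separator;
    // ... symbols, formats, etc. ...
};

static constexpr NumberSystemData s_number_systems[] = {
    { "\u066B" }, // Arab (U+066B ARABIC DECIMAL SEPARATOR)
    { "." },      // Latn
};

constexpr NumberSystemData const& number_system_data(NumberSystem system)
{
    return s_number_systems[static_cast<int>(system)];
}
```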
We had a hard-coded table of number system digits copied from ECMA-402.
Turns out these digits are in the CLDR, so let's parse the digits from
there instead of hard-coding them.
This adds an API to use LibTimeZone to convert a time zone such as
"America/New_York" to a GMT offset string like "GMT-5" (short form) or
"GMT-05:00" (long form).
This is a rather naive implementation, but serves as a first pass at
determining the GMT offset for a time zone at a particular point in
time. This implementation ignores DST (because we are not parsing any
RULE entries yet), and ignores any offset patterns of the form "Mon>4"
or "lastSun".
For example, generate "Etc/GMT+12" as "Etc_GMT_Ahead_12" (instead of as
"Etc_GMT_P12"). A little clearer what the name means without having to
know off-hand what "P" was representing.
The generate_mapping helper generates a series of structs like:
Array<SomeType, 1> s_mapping_key_0 {};
Array<SomeType, 2> s_mapping_key_1 {};
Array<SomeType, 3> s_mapping_key_2 {};
Array<Span<SomeType const>> s_mapping { {
s_mapping_key_0.span(),
s_mapping_key_1.span(),
s_mapping_key_2.span(),
} };
Where the names of the struct were generated by the format_mapping_name
lambda inside the helper. Rather than this lambda making assumptions on
how each generator wants to name its structs, add a parameter for the
caller to provide a naming formatter.
This is because the TimeZoneData generator will want pretty specific
identifier formatting rules.
When compiled using clang, an ambiguity error is detected between
`class AK::Time` aliased to `::Time` and the `struct ::Time` provided
in `GenerateTimeZoneData.cpp`. Solve this by moving most of the code
into an anonymous namespace.
Instead of making it a void function, checking for an exception, and
then receiving the relevant result via VM::last_value(), we can
consolidate all of this by using completions.
This allows us to remove more uses of VM::exception(), and all uses of
VM::last_value().
LibUnicode no longer needs to generate a list of time zone names that it
parsed from metaZones.json. We can defer to the TZDB for a golden list
of time zones.
The IANA Time Zone Database contains data needed, at least, for various
JavaScript objects. This adds plumbing for a parser and code generator
for this data. The generated data will be made available by LibTimeZone,
much like how UCD and CLDR data is available through LibUnicode.
The generator parses metaZones.json to form a mapping of meta zones to
time zones (AKA "golden zone" in TR-35). This parser errantly assumed
this was a 1-to-1 mapping.
In Unicode::get_time_zone_name(), we don't need to require that the time
zone is UTC for long- and short-style name lookups. This is required for
other styles, because they will depend on TZDB data - so move the VERIFY
to that scope.
When searching for the locale-specific flexible day period for a given
hour, we were neglecting to handle cases where the period crosses 00:00.
For example, the en locale defines a day period range of [21:00, 06:00).
When given the hour of 05:00, we were checking if (21 <= 5 && 5 < 6),
thus not recognizing that the hour falls in that period.
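The fix amounts to handling the wraparound; a minimal sketch of such a
check (not the exact LibUnicode code):

```cpp
// A day period like en's [21:00, 06:00) crosses 00:00, so the naive
// (begin <= hour && hour < end) test must be split into two cases.
constexpr bool hour_is_within_day_period(unsigned hour, unsigned begin, unsigned end)
{
    if (begin <= end)
        return begin <= hour && hour < end;
    // The period crosses midnight: match [begin, 24) or [0, end).
    return hour >= begin || hour < end;
}

// hour_is_within_day_period(5, 21, 6) is now true.
```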
This is a temporary mechanism while LibUnicode is in an in-between state
where some symbols are weakly linked and others are dynamically loaded.
The latter require an asm() label to be loaded.
Currently, we load the generated Unicode symbols with dlopen at runtime.
This is unnecessary as of 565a880ce5.
Applications that want Unicode data now link directly against the shared
library holding that data. So the same functionality can be achieved
with weak symbols.
This requires an implementation of the "text preparation algorithm" as
specified here:
html.spec.whatwg.org/multipage/canvas.html#text-preparation-algorithm
However, we're missing a lot of things such as the
CanvasTextDrawingStyles interface, so most of the algorithm was not
implemented. Additionally, we also are not able to use a LineBox like
the algorithm suggests, because our layouting infra is not up to the
task yet. At the moment, the prepare_text function does nothing other
than figure out the width of the given text and return glyphs with
offsets.
ECMA-402 now supports short-offset, long-offset, short-generic, and
long-generic time zone name formatting. For example, in the en-US locale
the America/Eastern time zone would be formatted as:
short-offset: GMT-5
long-offset: GMT-05:00
short-generic: ET
long-generic: Eastern Time
We currently only support the UTC time zone, however. Therefore, this
very minimal implementation does not consider GMT offset or generic
display names. Instead, the CLDR defines specific strings for UTC.
OpenBSD gzip does not have the -k flag to keep the original after
extraction. Work around this by copying the original gzip to the dest
and then extracting. A bit of a hack, but it only needs to be done for
the first build or for rebuilds.
OpenBSD provides crypt in libc, not libcrypt. Adjust if/else to check
for either and proceed accordingly.
Remove outdated OpenBSD checks when building the toolchain
This introduces a new library, LibSoftGPU, that incorporates all
rendering related features that formerly resided within LibGL itself.
Going forward we will make both libraries completely independent from
each other allowing LibGL to load different, possibly accelerated,
rendering backends.
PVS Studio static analysis noticed we didn't initialize these in a
bunch of cases. This change fixes that so we will always initialize
these using universal initialization.
The generated data for libunicodedata.so is quite large, and loading it
is a price paid by nearly every application by way of depending on
LibRegex. In order to defer this cost until an application actually uses
one of the surrounding APIs, dynamically load the generated symbols.
To be able to load the symbols dynamically, the generated methods must
have demangled names. Typically, this is accomplished with `extern "C"`
blocks. The clang toolchain complains about this here because the types
returned from the generators are strictly C++ types. So to demangle the
names, we use the asm() compiler directive to manually define a symbol
name; the caveat is that we *must* be sure the symbols are unique. As an
extra precaution, we prefix each symbol name with "unicode_". For more
details, see: https://gcc.gnu.org/onlinedocs/gcc/Asm-Labels.html
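For illustration, a minimal standalone example of the technique (the
names here are hypothetical, not the generated ones):

```cpp
#include <string_view>

// The asm() label pins an unmangled symbol name onto a function with a
// C++ return type, so dlsym("unicode_hello") can find it.
std::string_view unicode_hello() asm("unicode_hello");

std::string_view unicode_hello()
{
    return "hello";
}
```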
The symbol loader used in this implementation provides the additional
benefit of removing many [[maybe_unused]] attributes from the LibUnicode
methods. Internally, if ENABLE_UNICODE_DATABASE_DOWNLOAD is OFF, the
loader is able to stub out the function pointers it returns.
Note that as of this commit, LibUnicode is still directly linked against
LibUnicodeData. This commit is just a first step towards removing that.
The variable `s_time_zone_list_index_type` seems to be unused (detected
when compiling with clang), and it seems logical to bind it even if it
is not used for now.
So far the working directory was set in some cases using
`set_tests_properties(...)`, but this requires knowing which name is
picked by `lagom_test(...)` when calling `add_test(...)`.
In case of adding multiple test cases using a globbing pattern this
would require to duplicate code to construct the test name from the file
name.
Just some boilerplate code to get started :^)
This adds both the SubtleCrypto constructor on the window object and
the crypto.subtle instance attribute.
Similar to commit 2a7f36b392, this change moves the generated
CalendarSymbol enumeration to the public LibUnicode/NumberFormat.h
header with a pre-defined set of symbols that we need. This is to
prepare for uniquely generating the CalendarSymbols structure.
Each of the 374 generated calendars includes 4 sets of symbols, each of
which has 3 lists of symbols (narrow, short, long). Of these 4488
lists, only 819 are unique.
This option is already enabled when building Lagom, so let's enable it
for the main build too. We will no longer be surprised by Lagom Clang
CI builds failing while everything compiles locally.
Furthermore, the stronger `-Wsuggest-override` warning is enabled in
this commit, which enforces the use of the `override` keyword in all
classes, not just those which already have some methods marked as
`override`. This works with both GCC and Clang.
There are 443 number system objects generated, each of which held an
array of number system symbols. Of those 443 arrays, only 39 are unique.
To uniquely store these, this change moves the generated NumericSymbol
enumeration to the public LibUnicode/NumberFormat.h header with a pre-
defined set of symbols that we need. This is to ensure the generated,
unique arrays are created in a known order with known symbols. While it
is unfortunate to no longer discover these symbols at generation time,
it does allow us to ignore unwanted symbols and perform fewer string-to-
enumeration conversions at lookup time.
The evolution of UniqueStorage has been as follows:
1. It was created as UniqueStringStorage to ensure only one copy of each
unique string is generated. Interested parties stored an index into
a unique string list, rather than the string itself.
Commits: f9e605397c and 04e6b43f05
2. It became apparent that non-string structures could also be de-
duplicated to reduce the size of libunicode.so. UniqueStringStorage
was generalized to UniqueStorage for this purpose.
Commit: d8e6beb14f
It's now also apparent that there's heavy duplication of lists of
structures. For example, the NumberFormat generator stores 4 lists of
NumberFormat objects. In total, we currently generate nearly 2,000 lists
of these objects, of which 275 are unique.
This change updates UniqueStorage to support storing lists. The only
change is how the storage is generated - we generate each stored list
individually, then an array storing spans of those lists.
In the CLDR, there aren't "night" values, there are "night1" & "night2"
values. This is for locales which use a different name for nighttime
depending on the hour. For example, the ja locale uses "夜" between the
hours of 19:00 and 23:00, and "夜中" between the hours of 23:00 and
04:00. Our CLDR parser is currently ignoring "night2", so this rename
is to prepare for that.
We could probably come up with better names, but in the end, the API in
LibUnicode will be such that outside callers won't even see Night1, etc.
Pattern skeletons are more or less the "key" of format patterns. Every
format pattern is assigned a skeleton. Interval patterns (which are not
yet parsed) are also assigned a skeleton - this is used to match them to
an "owning" format pattern. So we will use the skeleton generated here
to match format patterns at runtime with their available interval
patterns.
An alternative approach would be to append interval patterns directly to
their owning format pattern, but this has some drawbacks:
1. Skeletons aren't totally unique. A skeleton may appear in both
the "dateFormats" and "availableFormats" objects, in which case
the same interval formats would be generated more than once.
2. Otherwise unique format patterns may only differ by the interval
patterns assigned to them. This would cause the UniqueStorage for
the format patterns to increase in size, impacting both compile
times and libunicode.so size.
The parsing in parse_calendar_symbols() might be a bit more verbose than
it really needs to be, but it is to ensure the symbols are generated in
a known order that we can control with enumerations.
TR-35's Matching Skeleton algorithm dictates how user requests including
fractional second digits should be handled when the CLDR format pattern
does not include that field. When the format pattern contains {second},
but does not contain {fractionalSecondDigits}, generate a second pattern
which appends "{decimal}{fractionalSecondDigits}" to the {second} field.
TR-35 does define lengths for {ampm}, but they are unused by ECMA-402.
To the contrary, defining the day_period length for this segment will
prevent BasicFormatMatcher from ever selecting a pattern that contains
this segment. Instead, ECMA-402 will only use the short length for
{ampm} segments.
TR-35 describes how to combine date, time, and available formats with
date-time format patterns to generate more available format patterns:
https://unicode.org/reports/tr35/tr35-dates.html#Missing_Skeleton_Fields
Use these steps to generate ~400 new patterns for each calendar. These
are required for ECMA-402's BasicFormatMatcher to produce reasonable
results.
Similar to NumberFormat, replace the segments of date-time patterns with
partitions that can be split at runtime. Also generate the pattern style
fields for e.g. era, day, hour, etc.
Add unique storage for parsed CalendarPattern structures to ensure only
one copy of each structure is generated.
This doesn't have any impact on libunicode.so with the current generated
data. Rather, this prevents the amount of generated data from needlessly
growing astronomically once date-time patterns are fully parsed. There
will be 173,459 patterns parsed, of which only 22,495 (about 12%) are
unique. This change will save a few MB, and will also help compilation
times.
Currently, there's only a handful of entries in these arrays, so it is
not a huge deal to generate them inline with the struct that holds them.
But they will each soon contain a few hundred entries. Generate them out
of line for easier viewing in the generated code.
Add unique storage for parsed NumberFormat structures to ensure only one
copy of each structure is generated. Reduces libunicode.so on x86 from
13.2 MB to 11.4 MB.
UniqueStringStorage is used to ensure only one copy of a string will be
generated, and interested parties store just an index into the generated
storage. Generalize this class to allow any* type to be stored uniquely.
* To actually be storable, the type must have both an AK::Format and an
AK::Traits overload available.
The synchronous call returns a NonnullOwnPtr that we don't use, so we
have to cast to prevent a compiler warning once smart pointers become
[[nodiscard]].
This is not a calendar supported by ECMA-402, so let's not waste space
with its data.
Further, don't generate "gregorian" as a valid Unicode locale extension
keyword. It's an invalid type identifier, thus cannot be used in locales
such as "en-u-ca-gregorian".
For example, consider the following adjacent entries in UnicodeData.txt:
3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;
4DBF;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;
Our current implementation would assign the display name "CJK Ideograph
Extension A" to code points U+3400 & U+4DBF, but not to the code points
in between. Not only should those code points be assigned a name, but
the Unicode spec also has formatting rules on what the names should be
(the names for these ranged code points are not as they appear in
UnicodeData.txt).
The spec also defines names for code point ranges that actually are
listed individually in UnicodeData.txt. For example:
2F800;CJK COMPATIBILITY IDEOGRAPH-2F800;Lo;0;L;4E3D;;;;N;;;;;
2F801;CJK COMPATIBILITY IDEOGRAPH-2F801;Lo;0;L;4E38;;;;N;;;;;
2F802;CJK COMPATIBILITY IDEOGRAPH-2F802;Lo;0;L;4E41;;;;N;;;;;
Code points are only coalesced into a range if all fields after the name
are equivalent. Our parser will insert the range and its name formatting
pattern when it comes across the first code point in that range, then
ignore other code points in that range. This reduces the number of names
we generated by nearly 2,000.
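A hedged sketch (plain C++, toy data) of how a ranged name lookup can
work; the real generator's structures differ:

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

struct CodePointNameRange {
    uint32_t first;
    uint32_t last;
    char const* pattern; // Unicode's formatting rule for the range
};

static CodePointNameRange const s_name_ranges[] = {
    { 0x3400, 0x4DBF, "CJK UNIFIED IDEOGRAPH-%04X" },
};

std::string ranged_code_point_name(uint32_t code_point)
{
    for (auto const& range : s_name_ranges) {
        if (code_point >= range.first && code_point <= range.last) {
            char buffer[64];
            std::snprintf(buffer, sizeof(buffer), range.pattern,
                static_cast<unsigned>(code_point));
            return buffer;
        }
    }
    return {}; // fall back to the individually-listed names
}
```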
Unlike most data in the CLDR, hour cycles are not stored on a per-locale
basis. Instead, they are keyed by a string that is usually a region, but
sometimes is a locale. Therefore, given a locale, to determine the hour
cycles for that locale, we:
1. Check if the locale itself is assigned hour cycles.
2. If the locale has a region, check if that region is assigned hour
cycles.
3. Otherwise, maximize that locale, and if the maximized locale has
a region, check if that region is assigned hour cycles.
4. If the above all fail, fallback to the "001" region.
Further, each locale's default hour cycle is the first assigned hour
cycle.
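A hedged sketch of that fallback chain (plain C++ with toy data; the
real lookup and locale maximization live in LibUnicode):

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

using HourCycles = std::vector<std::string>;

// Toy stand-in for the generated region-to-hour-cycles table.
static std::map<std::string, HourCycles> const s_hour_cycles {
    { "US", { "h12", "h23" } },
    { "001", { "h23", "h12" } },
};

// Toy region extraction: "en-US" -> "US".
static std::optional<std::string> region_of(std::string const& locale)
{
    if (auto pos = locale.rfind('-'); pos != std::string::npos)
        return locale.substr(pos + 1);
    return std::nullopt;
}

HourCycles hour_cycles_for_locale(std::string const& locale)
{
    // 1. The locale itself may be assigned hour cycles.
    if (auto it = s_hour_cycles.find(locale); it != s_hour_cycles.end())
        return it->second;
    // 2. Otherwise, try the locale's region subtag.
    if (auto region = region_of(locale); region.has_value()) {
        if (auto it = s_hour_cycles.find(*region); it != s_hour_cycles.end())
            return it->second;
    }
    // 3. (Elided) maximize the locale and retry with its region.
    // 4. Fall back to the "001" world region.
    return s_hour_cycles.at("001");
}

// The locale's default hour cycle is then hour_cycles_for_locale(...)[0].
```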
This hasn't mattered yet by chance, because the source for all enums
contains names of the same case. But the enum generated for hour cycle
regions will have mixed case. Sort them case-insensitively in order to
traverse these names in the same order in both generate_enum and
generate_mapping.
Similar to number formatting, the data for date-time formatting will be
located in its own generated file. This extracts the cldr-dates package
from the CLDR and sets up the generator plumbing to create the date-time
data files.
Currently, we generate separate data files for locale and number format
related tables/methods, but provide public accessors for all of the data
in one Locale.h file. Rather than continuing this trend for date-time,
relative time, etc. formatting, it's a bit easier to reason about if the
public accessors are also in separate files.
At the moment we just check if we *can* render a simple triangle, we do
not yet actually test if the image is indeed the triangle we wanted.
This test also outputs the rendered image to a file called
"picture.bmp" when GL_DEBUG is enabled, for manual verification.
Co-authored-by: sunverwerth <s.unverwerth@serenityos.org>
Previously, libc-like out-of-line error information was used in the
loader and its plugins. Now, all functions that may fail to do their job
return some sort of Result. The universally-used error type is the new
LoaderError, which can contain information about the general error
category (such as file format, I/O, unimplemented features), an error
description, and location information, such as file index or sample
index.
Additionally, the loader plugins try to do as little work as possible in
their constructors. Right after being constructed, a user should call
initialize() and check the errors returned from there. (This is done
transparently by Loader itself.) If a constructor caused an error, the
call to initialize should check and return it immediately.
This opportunity was used to rework a lot of the internal error
propagation in both loader classes, especially FlacLoader. Therefore, a
couple of other refactorings may have sneaked in as well.
The adoption of LibAudio users is minimal. Piano's adoption is not
important, as the code will receive major refactoring in the near future
anyways. SoundPlayer's adoption is also less important, as changes to
refactor it are in the works as well. aplay's adoption is the best and
may serve as an example for other users. It also includes new buffering
behavior.
Buffer also gets some attention, making it OOM-safe and thereby also
propagating its errors to the user.
This wasn't particularly difficult, and there's not much use for the
nicer interface yet either. While unveil() is of limited use in js(1)
as it should be able to open arbitrary files, I feel like we should be
able to add a pledge() call.
As noted by ECMA-402, if a supported locale contains all of a language,
script, and region subtag, then the implementation must also support the
locale without the script subtag. The most complicated example of this
is the zh-TW locale.
The list of locales in the CLDR database does not include zh-TW or its
maximized zh-Hant-TW variant. Instead, it includes the zh-Hant locale.
However, zh-Hant-TW is listed in the default-content locale list in the
cldr-core package. This defines an alias from zh-Hant-TW to zh-Hant. We
must then also support the zh-Hant-TW alias without the script subtag:
zh-TW. This transitively maps zh-TW to zh-Hant, which is a case quite
heavily tested by test262.
Previously, we were just copying the locale data into default-content
locales (for example, copying the "en" data into "en-US"). Instead, we
can just define the default-content locales as aliases to their main
locales.
This will be used for locale aliases as well. Also rename the "property"
field in this struct to "name", as it no longer is only used for
property aliases.
Also add slightly richer parse errors now that we can include a string
literal with returned errors.
This will allow us to use TRY() when working with JSON data.
This wasn't the case for compact patterns, but unit patterns can contain
multiple (up to 2, really) identifiers that must each be recognized by
LibJS.
Each generated NumberFormat object now stores an array of identifiers
parsed. The format pattern itself is encoded with the index into this
array for that identifier, e.g. the compact format string "0K" will
become "{number}{compactIdentifier:0}".
This field is currently used to store the StringView into the compact
name/symbol in the format string. Units will need to store a similar
field, so rename the field to be more generic, and extract the parser
for it.
The compact scale of each formatting rule was precomputed in commit:
be69eae651
Using the formula: compact scale = magnitude - pattern scale
This computation was off-by-one.
For example, consider the format key "10000-count-one", which maps to
"00 thousand" in en-US. What we are really after is the exponent that
best represents the string "thousand" for values greater than 10000
and less than 100000 (the next format key). We were previously doing:
log10(10000) - "00 thousand".count("0") = 2
Which clearly isn't what we want. Instead, if we do:
log10(10000) + 1 - "00 thousand".count("0") = 3
We get the correct exponent for each format key for each locale.
This commit also renames the generated variable from "compact_scale" to
"exponent" to match the terminology used in ECMA-402.
For example, in en-US, the decimal, long compact pattern for numbers
between 10,000 and 100,000 is "00 thousand". In that pattern, "thousand"
is the compact identifier, and the generated format pattern is now
"{number} {compactIdentifier}". This also generates that identifier as
its own field in the NumberFormat structure.
Most locales have a single grouping size (the number of integer digits
to be written before inserting a grouping separator). However some have
a primary and secondary size. We parse the primary size as the size used
for the least significant integer digits, and the secondary size for the
most significant.
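A hedged sketch of applying primary/secondary sizes (plain C++); for
instance, CLDR's Indian-style pattern "#,##,##0" has a primary size of
3 and a secondary size of 2:

```cpp
#include <string>

std::string group_integer_digits(std::string digits, size_t primary,
    size_t secondary, char separator)
{
    if (digits.size() <= primary)
        return digits;
    // The primary size applies to the least significant group...
    size_t i = digits.size() - primary;
    digits.insert(i, 1, separator);
    // ...and the secondary size to every group above it.
    while (i > secondary) {
        i -= secondary;
        digits.insert(i, 1, separator);
    }
    return digits;
}

// group_integer_digits("12345678", 3, 3, ',') -> "12,345,678"
// group_integer_digits("12345678", 3, 2, ',') -> "1,23,45,678"
```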
In order to implement Intl.NumberFormat.prototype.formatToParts, do not
replace {currency} keys in the format pattern before ECMA-402 tells us
to. Otherwise, the array return by formatToParts will not contain the
expected currency key.
Early replacement was done to avoid resolving the currency display more
than once, as it involves a couple of round trips to search through
LibUnicode data. So this adds a non-standard method to NumberFormat to
do this resolution and cache the result.
Another side effect of this change is that LibUnicode must replace unit
format patterns of the form "{0} {1}" during code generation. These were
previously skipped during code generation because LibJS would just
replace the keys with the currency display at runtime. But now that the
currency display injection is delayed, any {0} or {1} keys in the format
pattern will cause PartitionNumberPattern to abort.
Currencies are a bit strange; the layout of currency data in the CLDR is
not particularly compatible with what ECMA-402 expects. For example, the
currency format in the "en" and "ar" locales for the Latin script are:
en: "¤#,##0.00"
ar: "¤\u00A0#,##0.00"
Note how the "ar" locale has a non-breaking space after the currency
symbol (¤), but "en" does not. This does not mean that this space will
appear in the "ar"-formatted string, nor does it mean that a space won't
appear in the "en"-formatted string. This is a runtime decision based on
the currency display chosen by the user ("$" vs. "USD" vs. "US dollar")
and other rules in the Unicode TR-35 spec.
ECMA-402 shies away from the nuances here with "implementation-defined"
steps. LibUnicode will store the data parsed from the CLDR however it is
presented; making decisions about spacing, etc. will occur at runtime
based on user input.
For example, there isn't a unique set of data for the en-US locale;
rather, it defaults to the data for the en locale. See this commit for
much more detail: 357c97dfa8
These are used when formatting a number as currency with a display
option of "name" (e.g. for USD, the name is "US Dollars" in en-US).
These patterns appear in the CLDR in a different manner than other
number formats that are pluralized. They are of the form "{0} {1}",
therefore do not undergo subpattern replacements.
Currently, LibUnicode is only parsing and generating the "long" style of
currency display names. However, the CLDR contains "short" and "narrow"
forms as well that need to be handled. Parse these, and update LibJS to
actually respect the "style" option provided by the user for displaying
currencies with Intl.DisplayNames.
Note: There are some discrepancies between the engines on how style is
handled. In particular, running:
new Intl.DisplayNames('en', {type:'currency', style:'narrow'}).of('usd')
Gives:
SpiderMonkey: "USD"
V8: "US Dollar"
LibJS: "$"
And running:
new Intl.DisplayNames('en', {type:'currency', style:'short'}).of('usd')
Gives:
SpiderMonkey: "$"
V8: "US Dollar"
LibJS: "$"
My best guess is V8 isn't handling style, and just returning the long
form (which is what LibJS did before this commit). And SpiderMonkey can
handle some styles, but if they don't have a value for the requested
style, they fall back to the canonicalized code passed into of().
The data used for number formatting is going to grow quite a bit when
the cldr-units package is parsed. To prevent the generated UnicodeLocale
file from growing outrageously large, the number formatting data can go
into its own file. To prepare for this, move code that will be common
between the generators for UnicodeLocale and UnicodeNumberFormat to the
utility header.
This will be needed for the ComputeExponentForMagnitude AO for compact
formatting, namely step 5b:
Let exponent be an implementation- and locale-dependent (ILD) integer
by which to scale a number of the given magnitude in compact notation
for the current locale.
A number formatting pattern in the CLDR contains one or two entries,
delimited by a semi-colon. Previously, LibUnicode was just storing the
entire pattern as one string. This changes the generator to split the
pattern on that delimiter and generate the 3 unique patterns expected by
ECMA-402.
The rules for generating the 3 patterns are as follows:
* If the pattern contains 1 entry, it is the zero pattern. The positive
pattern is the zero pattern prepended with {plusSign}. The negative
pattern is the zero pattern prepended with {minusSign}.
* If the pattern contains 2 entries, the first is the zero pattern, and
the second is the negative pattern. The positive pattern is the zero
pattern prepended with {plusSign}.
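A hedged sketch (plain C++) of those rules:

```cpp
#include <string>

struct NumberPatterns {
    std::string zero;
    std::string positive;
    std::string negative;
};

NumberPatterns split_number_pattern(std::string const& pattern)
{
    NumberPatterns patterns;
    if (auto delimiter = pattern.find(';'); delimiter != std::string::npos) {
        // Two entries: zero pattern, then negative pattern.
        patterns.zero = pattern.substr(0, delimiter);
        patterns.negative = pattern.substr(delimiter + 1);
    } else {
        // One entry: it is the zero pattern.
        patterns.zero = pattern;
        patterns.negative = "{minusSign}" + patterns.zero;
    }
    patterns.positive = "{plusSign}" + patterns.zero;
    return patterns;
}
```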
The number system data in the CLDR contains information on how to format
numbers in a locale-dependent manner. Start parsing this data, beginning
with numeric symbol strings. For example the symbol NaN maps to "NaN" in
the en-US locale, and "非數值" in the zh-Hant locale.
Some locales in the CLDR have alternate default numbering systems listed
under "defaultNumberingSystem-alt-*", e.g.:
"defaultNumberingSystem": "arab",
"defaultNumberingSystem-alt-latn": "latn",
"otherNumberingSystems": {
"native": "arab"
},
We were previously only parsing "defaultNumberingSystem" and
"otherNumberingSystems". This odd format appears to be an artifact of
converting from XML.
This isn't particularly important because this generates code that is
quite hidden from outside callers. But when viewing the generated code,
it's a bit nicer to read e.g. enum identifiers such as "MinusSign"
rather than "Minussign".
First off, this verifies that an initial value is always provided in
Properties.json for each property.
Second, it verifies that parsing that initial value succeeds.
This means that a call to `property_initial_value()` will always return
a valid StyleValue. :^)
This file contains the list of locales which default to their parent
locale's values. In the core CLDR dataset, these locales have their own
files, but they are empty (except for identity data). For example:
https://github.com/unicode-org/cldr/blob/main/common/main/en_US.xml
In the JSON export, these files are excluded, so we currently are not
recognizing these locales just by iterating the locale files.
This is a prerequisite for upgrading to CLDR version 40. One of these
default-content locales is the popular "en-US" locale, which defaults to
"en" values. We were previously inferring the existence of this locale
from the "en-US-POSIX" locale (many implementations, including ours,
strip variants such as POSIX). However, v40 removes the "en-US-POSIX"
locale entirely, meaning that without this change, we wouldn't know that
"en-US" exists (we would default to "en").
For more detail on this and other v40 changes, see:
https://cldr.unicode.org/index/downloads/cldr-40#h.nssoo2lq3cba
This changes Web::Bindings::throw_dom_exception_if_needed() to return a
JS::ThrowCompletionOr instead of an Optional. This allows callers to
wrap the invocation with a TRY() macro instead of making a follow-up
call to should_return_empty(). Further, this removes all invocations to
vm.exception() in the generated bindings.
This also required converting URLSearchParams::for_each and the callback
function it invokes to ThrowCompletionOr. With this, the ReturnType enum
used by WrapperGenerator is removed as all callers would be using
ReturnType::Completion.
Both at the same time because many of them call construct() in call()
and I'm not keen on adding a bunch of temporary plumbing to turn
exceptions into throw completions.
Also changes the return value of construct() to Object* instead of Value
as it always needs to return an object; allowing an arbitrary Value is a
massive foot gun.
The old versions were renamed to JS_DECLARE_OLD_NATIVE_FUNCTION and
JS_DEFINE_OLD_NATIVE_FUNCTION, and will eventually be removed once all
native functions have been converted to the new format.
Adds support for methods whose last parameter is a variadic DOMString.
We construct a Vector<String> of the remaining arguments to pass to
the C++ implementation.
Note our Attribute class is what the spec refers to as just "Attr". The
main differences between the existing implementation and the spec are
just that the spec defines more fields.
Attributes can contain namespace URIs and prefixes. However, note that
these are not parsed in HTML documents unless the document content-type
is XML. So for now, these are initialized to null. Web pages are able to
set the namespace via JavaScript (setAttributeNS), so these fields may
be filled in when the corresponding APIs are implemented.
The main change to be aware of is that an attribute is a node. This has
implications on how attributes are stored in the Element class. Nodes
are non-copyable and non-movable because these constructors are deleted
by the EventTarget base class. This means attributes cannot be stored in
a Vector or HashMap as these containers assume copyability / movability.
So for now, the Vector holding attributes is changed to hold RefPtrs to
attributes instead. This might change when attribute storage is
implemented according to the spec (by way of NamedNodeMap).
This adds the ParameterizedType, as `Vector<String>` doesn't encode the
full type information. It is a separate struct as you can't have
`Vector<Type>` inside of `Type`. This also makes Type RefCounted
because I had to make parse_type return a pointer to make dynamic
casting work correctly.
The reason I made it RefCounted instead of using a NonnullOwnPtr is
because it causes compiler errors that I don't want to figure out right
now.
Apparently it breaks the fuzzer build. There's probably a better fix
for this, but for now just unbreak the fuzzer build.
Keep this for non-fuzzer builds though since it's apparently a 17%
speedup for running test262 tests :^)
Lagom: Build with -fno-semantic-interposition
We build with this in non-lagom builds, and serenity's gcc even adds it
to its CC1_SPEC. Let's use it for lagom too.
Reduces the number of dynamic relocations in liblagom-js.so.0.0.0 (per
`objdump -R`) from 15133 to 14534, and increases its size back to 91M
(95156800 bytes), probably due to more inlining being possible.
This might help perf of lagom binaries.
We build with this in non-lagom builds, so there's no reason not
to use it in lagom builds as well.
Reduces the size of liblagom-js.so.0.0.0 from 94M to 90M
(from 98352784 to 93831056 bytes to be exact).
Typically size_t is used for indices, but we can take advantage of the
knowledge that there is approximately only 46K unique strings in the
generated UnicodeLocale.cpp file. Therefore, we can get away with using
u16 to store indices. There is a VERIFY that will fail if we ever exceed
the limits of u16.
On x86_64 builds, this reduces libunicode.so from 9.2 MiB to 7.3 MiB.
On i686 builds, this reduces libunicode.so from 3.9 MiB to 3.3 MiB.
These savings are entirely in the .rodata section of the shared library.
Note there are a couple of type differences between the spec and the IDL
file added in this commit. For example, we will need to support a type
of Variant to handle spec types such as "(double or sequence<double>)".
But for now, this allows web pages to construct an IntersectionObserver
with any valid type.
The *_from_string() and resolve_*_alias() generated methods are the last
remaining users of HashMap in the LibUnicode generated files (read: the
last methods not using compile-time structures). This converts these
methods to use an array containing pairs of hash values to the desired
lookup value.
Because this code generation is the same between GenerateUnicodeData.cpp
and GenerateUnicodeLocale.cpp, this adds a GeneratorUtil.h header to the
LibUnicode generators to hold the code that generates these methods.
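The shape of the generated lookup, sketched in plain C++ with toy data;
the real generator emits its own hash function and types, and whether
lookup scans or binary-searches is a detail (a sorted array with binary
search is shown):

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <string_view>

struct HashValuePair {
    uint32_t hash;
    int value;
};

// Toy stand-in for the generated array, sorted by hash.
static constexpr HashValuePair s_values_from_string[] = {
    { 0x0b22a01au, 1 },
    { 0x8ee30fddu, 2 },
};

constexpr uint32_t string_hash(std::string_view input)
{
    // FNV-1a as a stand-in; the generators use their own string hash.
    uint32_t hash = 2166136261u;
    for (char c : input)
        hash = (hash ^ static_cast<uint8_t>(c)) * 16777619u;
    return hash;
}

std::optional<int> value_from_string(std::string_view name)
{
    auto hash = string_hash(name);
    auto const* end = std::end(s_values_from_string);
    auto const* it = std::lower_bound(std::begin(s_values_from_string), end, hash,
        [](HashValuePair const& pair, uint32_t h) { return pair.hash < h; });
    if (it != end && it->hash == hash)
        return it->value;
    return std::nullopt;
}
```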
This concept is not present in ECMAScript, and it bothers me every time
I see it.
It's only used by WrapperGenerator, and even there only relevant in two
places, so let's fully remove it from LibJS and use a simple ternary
expression instead:
cpp_name = js_name.is_null() && legacy_null_to_empty_string
? String::empty()
: js_name.to_string(global_object);
Previously this would generate the following code:
JS::Value foo_value;
if (!foo.is_undefined())
foo_value = foo;
Which is dangerous as we're passing an empty value around, which could
be exposed to user code again. This is fine with "= null", for which it
also generates:
else
foo_value = JS::js_null();
So, in summary: a value of type `any`, not `required`, with no default
value and no initializer from user code will now default to undefined
instead of an empty value.
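Hypothetically, the generated code for such a parameter now looks more
like:

    JS::Value foo_value = JS::js_undefined();
    if (!foo.is_undefined())
        foo_value = foo;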
Meta/Lagom/ReadMe.md never had any other name; not sure how that typo
happened.
The link to the non-existent directory is especially vexing because the
text goes on to explain that we don't want such a directory to exist.
Found by running markdown-checker, and 'wget'ing all external links.
The list-format strings used for Intl.ListFormat are small, but quite
heavily duplicated. For example, the string "{0}, {1}" appears 6,519
times. Generate unique strings for this data to avoid duplication.
In the generated UnicodeLocale.cpp file, there are 296,408 strings for
localizations of languages, territories, scripts, currencies & keywords.
Of these, only 43,848 (14.8%) are actually unique, so there are quite a
large number of duplicated strings.
This generates a single compile-time array to store these strings. The
arrays for the localizations now store an index into this single array
rather than duplicating any strings.
Some CLDR languages.json / territories.json files contain localizations
for some languages/territories that are otherwise not present in the CLDR
database. We already don't generate anything in UnicodeLocale.cpp for
these anomalies, but this will stop us from even storing that data in
the generator's memory.
This doesn't affect the output of the generator, but will have an effect
after an upcoming commit to unique-ify all of the strings in the CLDR.
There are only 112 code points with special casing rules, so this array
is quite small (compared to the size 34,626 UnicodeData hash map that is
also storing this data). Removing all casing rules from UnicodeData will
happen in a subsequent commit.
Currently, all casing information (simple and special) are stored in a
compile-time array of size 34,626, then statically copied to a hash map
at runtime. In an effort to reduce the resulting memory usage, store the
simple casing rules in standalone compile-time arrays. The uppercase map
is size 1,450 and the lowercase map is size 1,433. Any code point not in
a map will implicitly have an identity mapping.
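A hedged sketch of the lookup with the implicit identity fallback
(plain C++, toy data):

```cpp
#include <algorithm>
#include <cstdint>

struct CaseMapping {
    uint32_t code_point;
    uint32_t mapping;
};

// Toy stand-in for the generated, sorted simple-uppercase table.
static constexpr CaseMapping s_uppercase_mappings[] = {
    { 0x61, 0x41 }, // 'a' -> 'A'
    { 0x62, 0x42 }, // 'b' -> 'B'
};

uint32_t to_simple_uppercase(uint32_t code_point)
{
    auto const* end = std::end(s_uppercase_mappings);
    auto const* it = std::lower_bound(std::begin(s_uppercase_mappings), end, code_point,
        [](CaseMapping const& mapping, uint32_t value) { return mapping.code_point < value; });
    if (it != end && it->code_point == code_point)
        return it->mapping;
    return code_point; // not in the map: implicit identity mapping
}
```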
Having IDL constructors call FooWrapper::create(impl) directly was
creating a wrapper directly without telling the impl object about the
wrapper. This meant that we had wrapped C++ objects with a null
wrapper() pointer.
This introduces 3 classes: NodeList, StaticNodeList and LiveNodeList.
NodeList is the base of the static and live versions. Static is a
snapshot whereas live acts on the underlying data and thus exhibits
the same issues we currently have with HTMLCollection.
They were split into separate classes to avoid having them weirdly
mish-mashed together.
The create functions for static and live both return a NNRP to the base
class. This is to prevent having to do awkward casting at creation
and/or return, as the bindings expect to see the base NodeList only.
Instead of setting it to the default object prototype and then
immediately setting it again via internal_set_prototype_of, we can just
set it directly in the parent constructor call.
Since we don't support IDL typedefs or unions yet, the responsibility
of verifying the type of the argument is temporarily moved from the
generated Wrapper to the implementation.
This patch makes both of these classes inherit from RefCounted and
Bindings::Wrappable, plus some minimal rejigging to allow us to keep
using them internally while also exposing them to web content.
This adds support for the [Unscopable] extended attribute to attributes
and functions.
I believe it should be applicable to all interface members, but I
haven't done that here.
This currently only supports pair iterables (i.e. iterable<key, value>);
support for value iterables (i.e. iterable<value>) is left as TODO().
Since our CMake setup currently calls the WrapperGenerator separately
and unconditionally for each (hard-coded) output file, iterable wrappers
have to be explicitly marked as such in the CMakeLists.txt declaration.
We could likely improve this in the future by querying WrapperGenerator
for the outputs based on the IDL.
This patch essentially just splits the non return-specific logic from
generate_return_statement (i.e. the wrapping of the cpp value into
a javascript one) into a separate function generate_wrap_statement that
can be used to wrap any cpp value during wrapper generation.
This custom attribute will be used for objects that hold onto arbitrary
JS::Value's. This is needed as JS::Handle can only be constructed for
objects that implement JS::Cell, which JS::Value doesn't.
This works by overriding the `visit_edges` function in the wrapper.
This overridden function calls the base `visit_edges` and then forwards
it to the underlying implementation.
This will be used for CustomEvent, which must hold onto an arbitrary
JS::Value for its entire lifespan.
A legacy platform object is a non-global platform object that
implements a special operation. A special operation is a getter, setter
and/or deleter. This is particularly used for old collection types,
such as HTMLCollection, NodeList, etc.
This will be used to make these spec-compliant and remove their custom
wrappers. Additionally, it will be used to implement collections that
we don't have yet, such as DOMStringMap.
This does a few things that are hard to separate. For a while now, it's
been confusing what `StyleValue::is_foo()` actually means. It sometimes
was used to check the type, and sometimes to see if it could return a
certain value type. The new naming scheme is:
- `is_length()` - is it a LengthStyleValue?
- `as_length()` - casts it to LengthStyleValue
- `has_length()` - can it return a Length?
- `to_length()` - gets the internal value out (eg, Length)
This also means, no more `static_cast<LengthStyleValue const&>(*this)`
stuff when dealing with StyleValues. :^)
Hopefully this will be a bit clearer going forward. There are lots of
places using the original methods, so I'll be going through them to
hopefully catch any issues.
These now crash as VM::call() uses ThrowExceptionOr<T>, which refuses to
hold an empty JS::Value as its non-exception result.
We only need to return an empty value when should_return_empty() says
so for the return value of throw_dom_exception_if_needed().
Co-authored-by: Luke Wilde <lukew@serenityos.org>
For `number` and `integer` types, you can add a range afterwards to add
a range check, using similar syntax to that used in the CSS specs. For
example:
```json
"font-weight": {
...
"valid-types": [
"number [1,1000]"
],
...
}
```
This limits any numbers to the range `1 <= n <= 1000`.
Previously, we had not been validating the values for CSS declarations
inside the Parser. This causes issues, since we should be discarding
invalid style declarations, so that previous ones are used instead. For
example, in this code:
```css
.foo {
width: 2em;
width: orange;
}
```
... the `width: orange` declaration overwrites the `width: 2em` one,
even though it is invalid. According to the spec, `width: orange` should
be rejected at parse time, and discarded, leaving `width: 2em` as the
resulting value.
Many properties (mostly shorthands) are parsed specially, and so they
are already rejected if they are invalid. But for simple properties, we
currently accept any value. With `property_accepts_value()`, we can
check if the value is valid in `parse_css_value()`, and reject it if it
is not.
We already expand shorthands in the cascade, so there's no need to
preserve them in the output.
This patch reorganizes the CSS::PropertyID enum values so that we can
easily iterate over all shorthand or longhand properties.
This patch adds a basic initial implementation of these APIs.
Since LibWeb currently doesn't support workers, this implementation of
messaging doesn't bother with serializing and deserializing messages.
When parsing shorthand values, we'd like to use
`property_initial_value()` to get their longhand property values,
instead of hard-coding them as we currently do. That involves
recursively calling that function while the `initial_values` map is
being initialized, which causes problems because the shorthands appear
alphabetically before their longhand components, so the longhands aren't
initialized yet!
The solution here is to perform 2 passes when generating the code,
outputting properties without "longhands" first, and the rest after.
This could potentially cause issues when shorthands have multiple
levels, in particular `border` -> `border-color` -> `border-left-color`.
But, we do not currently define a default value for `border`, and
`border-color` takes only a single value, so it's fine for now. :^)
Multi-lib distros like Gentoo and Fedora install lagom-core.so into
lagom-install/lib64 rather than lib. Set the install RPATH based on
CMAKE_INSTALL_LIBDIR to avoid the wrong path being set in the binaries.
Also apply macOS specific RPATH rules to fix the build on that platform.
Replace the old logic where we would start with a host build, and swap
all the CMake compiler and target variables underneath it to trick
CMake into building for Serenity after we configured and built the Lagom
code generators.
The SuperBuild creates two ExternalProjects, one for Lagom and one for
Serenity. The Serenity project depends on the install stage for the
Lagom build. The SuperBuild also generates a CMakeToolchain file for the
Serenity build to use that replaces the old toolchain file that was only
used for Ports.
To ensure that code generators are rebuilt when core libraries such as
AK and LibCore are modified, developers will need to direct their manual
`ninja` invocations to the SuperBuild's binary directory instead of the
Serenity binary directory.
This commit includes warning coalescing and option style cleanup for the
affected CMakeLists in the Kernel, top level, and runtime support
libraries. A large part of the cleanup is replacing USE_CLANG_TOOLCHAIN
with the proper CMAKE_CXX_COMPILER_ID variable, which will no longer be
confused by a host clang compiler.
This common strategy of having a serenity_option() macro defined in
either the Lagom or top level CMakeLists.txt allows us to do two things:
First, we can more clearly see which options are Serenity-specific,
Lagom-specific, or common between the target and host builds.
Second, it enables the upcoming SuperBuild changes to set() the options
in the SuperBuild's CMake cache and forward each target's options to the
corresponding ExternalProject.
This makes it so we don't need to specify the full path to all the
helper scripts we include() from different places in the codebase and
feels a lot cleaner.
We'll use this to prevent repeating common tool dependencies. They all
depend on LibCore and AK only. We also want to encapsulate common
install rules for them.