In a bunch of cases, this actually ends up simplifying the code as
to_number will handle something such as:
```
Optional<I> opt;
if constexpr (IsSigned<I>)
opt = view.to_int<I>();
else
opt = view.to_uint<I>();
```
For us.
The main goal here however is to have a single generic number conversion
API between all of the String classes.
This patch adds two macros to declare per-type allocators:
- JS_DECLARE_ALLOCATOR(TypeName)
- JS_DEFINE_ALLOCATOR(TypeName)
When used, they add a type-specific CellAllocator that the Heap will
delegate allocation requests to.
The result of this is that GC objects of the same type always end up
within the same HeapBlock, drastically reducing the ability to perform
type confusion attacks.
It also improves HeapBlock utilization, since each block now has cells
sized exactly to the type used within that block. (Previously we only
had a handful of block sizes available, and most GC allocations ended
up with a large amount of slack in their tails.)
There is a small performance hit from this, but I'm sure we can make
up for it elsewhere.
Note that the old size-based allocators still exist, and we fall back
to them for any type that doesn't have its own CellAllocator.
These functions all have a very common case that can be dealt with a
very simple inline check, often avoiding the need to call an out-of-line
function. This patch moves the common case to inline functions in a new
ValueInlines.h header (necessary due to header dependency issues..)
8% speed-up on the entire Kraken benchmark :^)
When formatting a currency style pattern with compact notation, we were
(trying to) doubly insert the currency symbol into the formatted string.
We would first look up the currency pattern in GetNumberFormatPattern
(for the en locale, this is "¤#,##0.00", which our generator transforms
to "{currency}{number}").
When we hit the "{number}" field, NumberFormat will do a second lookup
for the compact pattern to use for the number being formatted. By using
the currency compact patterns, we receive a second pattern that also has
the currency symbol (for the en locale, if formatting the number 1000,
this is "¤0K", which our generator transforms to
"{currency}{number}{compactIdentifier:0}". This second lookup is not
supposed to have currency symbols (or any other symbols), thus we hit a
VERIFY_NOT_REACHED().
Instead, we are meant to use the decimal compact pattern, and allow the
currency symbol to be handled by only the outer currency pattern.
These APIs only perform small allocations, and are only used by LibJS.
Callers which could only have failed from these APIs are also made to
be infallible here.
These APIs only perform small allocations, and are only used by LibJS.
Callers which could only have failed from these APIs are also made to
be infallible here.
This proposal has been merged into the main ECMA-402 spec. See:
https://github.com/tc39/ecma402/commit/4257160
Note this includes some editorial and normative changes made when the
proposal was merged into the main spec, but are not in the proposal spec
itself. In particular, the following AOs were changed:
PartitionNumberRangePattern (normative)
SetNumberFormatDigitOptions (editorial)
To add grouping to a number, we take a string such as "123456.123" and
break it into integer and fraction parts. Then we take the integer part
and break it into locale-specific sized groups to inject the locale's
group separator (e.g. a comma in en-US). We currently create new strings
for each of these groups. Instead, we can use the shared superstring
method to avoid all of that string copying.
First, this adds an overload of PrimitiveString::create for StringView.
This overload will throw an OOM completion if creating a String fails.
This is not only a bit more convenient, but it also ensures at compile
time that all PrimitiveString::create(string_view) invocations will be
handled as String and OOM-aware.
Next, this wraps all invocations to PrimitiveString::create(string_view)
with MUST_OR_THROW_OOM.
A small PrimitiveString::create(DeprecatedFlyString) overload also had
to be added to disambiguate between the StringView and DeprecatedString
overloads.
This is a normative change in the Intl.NumberFormat V3 spec. See:
https://github.com/tc39/proposal-intl-numberformat-v3/commit/08f599b
Note that this didn't seem to actually affect our implementation. The
Unicode spec states:
https://www.unicode.org/reports/tr35/tr35-53/tr35-numbers.html#Plural_Ranges
"If there is no value for a <start,end> pair, the default result is end"
Therefore, our implementation did not have the behavior noted by the
issue this normative change addressed:
const pr = new Intl.PluralRules("en-US");
pr.selectRange(1, 1); // Is "other", should be "one"
Our implementation already returned "one" here because there is no such
<start=one, end=one> value in the CLDR for en-US. Thus, we already
returned the end value of "one".
This is a normative change in the Intl.NumberFormat V3 spec. See:
https://github.com/tc39/proposal-intl-numberformat-v3/commit/23e69cf
This isn't particularly testable because every locale in the CLDR has a
non-empty "approximatelySign" field in cldr-numbers-modern. The issue
for this change seems to be considering the "miscPatterns/approximately"
field instead, which has different semantics. But as noted on the CLDR
issue https://unicode-org.atlassian.net/browse/CLDR-14918, the ICU uses
the "approximatelySign" field (as do our implementation).
It turns out return a ThrowCompletionOr<T const&> is flawed, as the GCC
expansion trick used with TRY will always make a copy. PrimitiveString
is luckily the only such use case.
This makes construction of Utf16String fallible in OOM conditions. The
immediate impact is that PrimitiveString must then be fallible as well,
as it may either transcode UTF-8 to UTF-16, or create a UTF-16 string
from ropes.
There are a couple of places where it is very non-trivial to propagate
the error further. A FIXME has been added to those locations.
This constructor was easily confused with a copy constructor, and it was
possible to accidentally copy-construct Objects in at least one way that
we dicovered (via generic ThrowCompletionOr construction).
This patch adds a mandatory ConstructWithPrototypeTag parameter to the
constructor to disambiguate it.
Note that js_rope_string() has been folded into this, the old name was
misleading - it would not always create a rope string, only if both
sides are not empty strings. Use a three-argument create() overload
instead.
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.
One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
Intrinsics, i.e. mostly constructor and prototype objects, but also
things like empty and new object shape now live on a new heap-allocated
JS::Intrinsics object, thus completing the long journey of taking all
the magic away from the global object.
This represents the Realm's [[Intrinsics]] slot in the spec and matches
its existing [[GlobalObject]] / [[GlobalEnv]] slots in terms of
architecture.
In the majority of cases it should now be possibly to fully allocate a
regular object without the global object existing, and in fact that's
what we do now - the realm is allocated before the global object, and
the intrinsics between both :^)
- Prefer VM::current_realm() over GlobalObject::associated_realm()
- Prefer VM::heap() over GlobalObject::heap()
- Prefer Cell::vm() over Cell::global_object()
- Prefer Wrapper::vm() over Wrapper::global_object()
- Inline Realm::global_object() calls used to access intrinsics as they
will later perform a direct lookup without going through the global
object