Commit graph

49 commits

Author SHA1 Message Date
Timothy Flynn
5b68c1a06c LibJS: Use Intl.PluralRules within Intl.NumberFormat
This also allows removing a bit of a BigInt hack to resolve plurality of
BigInt numbers (because the AOs used in ResolvePlural support BigInt,
wherease the naive Unicode::select_pattern_with_plurality did not).

We use cardinal form here; the number format patterns in the CLDR align
with the cardinal form of the plural rules.
2022-07-08 20:33:52 +02:00
Timothy Flynn
2982aa0373 LibJS: Mark the NumberFormat parameter of FormatNumericToString as const
Not critical, but in subsequent commits this will be invoked from a
constant context.
2022-07-08 11:51:54 +02:00
Linus Groh
9f3f3b0864 LibJS: Remove implicit wrapping/unwrapping of completion records
This is an editorial change in the ECMA-262 spec, with similar changes
in some proposals.

See:
- https://github.com/tc39/ecma262/commit/7575f74
- https://github.com/tc39/proposal-array-grouping/commit/df899eb
- https://github.com/tc39/proposal-shadowrealm/commit/9eb5a12
- https://github.com/tc39/proposal-shadowrealm/commit/c81f527
2022-05-03 01:09:29 +02:00
Timothy Flynn
4fd463dae0 LibJS: Normalize mathematical references to negative zero
This is an editorial change in the Intl spec. See:
https://github.com/tc39/ecma402/commit/d4be24e
2022-03-30 14:24:32 +01:00
Timothy Flynn
1a76839e8d LibJS: Use consistent ASCII case-transformation and string language
Also update the incorrect spec link for IsWellFormedCurrencyCode.

These are editorial changes in the Intl spec. See:
https://github.com/tc39/ecma402/commit/6939b44
https://github.com/tc39/ecma402/commit/3a775eb
https://github.com/tc39/ecma402/commit/97a7940
https://github.com/tc39/ecma402/commit/129c790
https://github.com/tc39/ecma402/commit/42ec908
https://github.com/tc39/ecma402/commit/ea25c36
2022-03-30 14:24:32 +01:00
Timothy Flynn
7c41e6058a LibJS: Explicitly indicate infallible incovations
These are editorial changes in the Intl spec.

See:
https://github.com/tc39/ecma402/commit/6804096
https://github.com/tc39/ecma402/commit/6361167
https://github.com/tc39/ecma402/commit/8718171
https://github.com/tc39/ecma402/commit/fd37cb4
https://github.com/tc39/ecma402/commit/00fcfb0
https://github.com/tc39/ecma402/commit/913f832
2022-03-30 14:24:32 +01:00
Timothy Flynn
812d3a7ef8 LibJS: Reorganize spec steps for Intl.NumberFormat
This is an editorial change in the Intl spec:
https://github.com/tc39/ecma402/commit/110cb1f
2022-03-15 17:30:58 +01:00
Timothy Flynn
6efbafa6e0 Everywhere: Update copyrights with my new serenityos.org e-mail :^) 2022-01-31 18:23:22 +00:00
Timothy Flynn
d6e926e5b1 LibJS: Support BigInt number formatting with Intl.NumberFormat 2022-01-30 20:05:27 +00:00
Timothy Flynn
a0253af8c1 LibJS: Generalize Intl.NumberFormat to operate on Value types
Intl.NumberFormat is meant to format both Number and BigInt types. To
prepare for formatting BigInt types, this generalizes our NumberFormat
implementation to operate on Value instances rather than doubles. All
arithmetic is moved to static helpers that can now be updated with
BigInt semantics.
2022-01-30 20:05:27 +00:00
Timothy Flynn
ac3e42a8de LibJS: Move some Intl.NumberFormat fields into a NumberFormatBase class
Other Intl objects, such as PluralRules, are to be treated as a
NumberFormat object in some AOs. There's only a handful of fields which
are to be shared between those objects - move them to a base class for
shared reuse.

This also updates the couple of NumberFormat AOs that are meant to
operate on these NumberFormat-like objects.

Alternatively, we could just have objects like PluralRules inherit from
NumberFormat directly. But that messes up the is<NumberFormat> runtime
checks, so this feels safer.
2022-01-28 19:38:47 +00:00
Timothy Flynn
cf92bc42a2 LibJS: Respect per-locale minimum grouping digits when number formatting 2022-01-27 20:30:52 +00:00
Timothy Flynn
0865f71d37 LibJS: Convert Intl.NumberFormat to use Unicode::Style 2022-01-25 19:02:59 +00:00
Timothy Flynn
8126cb2545 LibJS+LibUnicode: Remove unnecessary locale currency mapping wrapper
Before LibUnicode generated methods were weakly linked, we had a public
method (get_locale_currency_mapping) for retrieving currency mappings.
That method invoked one of several style-specific methods that only
existed in the generated UnicodeLocale.

One caveat of weakly linked functions is that every such function must
have a public declaration. The result is that each of those styled
methods are declared publicly, which makes the wrapper redundant
because it is just as easy to invoke the method for the desired style.
2022-01-13 13:43:57 +01:00
Timothy Flynn
cc5e9f0579 LibJS+LibUnicode: Move replacement of number system digits to LibUnicode
There are a few algorithms in TR-35 that need to replace digits before
returning any results to callers. For example, when formatting time zone
offsets, a string like "GMT+12:34" must have its digits replaced with
the default numbering system for the desired locale.
2022-01-11 23:56:35 +01:00
mjz19910
10ec98dd38 Everywhere: Fix spelling mistakes 2022-01-07 15:44:42 +01:00
Timothy Flynn
f16f3c4677 LibJS: Update ToRawPrecision / ToRawFixed AO spec comments
This is a normative change in the Intl spec:
https://github.com/tc39/ecma402/commit/f0f66cf

There are two main changes here:
1. Converting BigInt/Number objects to mathematical values.
2. A change in how ToRawPrecision computes its exponent and significant
   digits.

For (1), we do not yet support BigInt number formatting, thus already
have coerced Number objects to a double. When BigInt is supported, the
number passed into these methods will likely still be a Value, thus can
be coereced then.

For (2), our implementation already returns the expected edge-case
results pointed out on the spec PR.
2022-01-02 20:07:03 +01:00
Timothy Flynn
a3149c11e5 LibJS: Explicitly handle postive/negative infinity in Intl.NumberFormat
This is a normative change in the Intl spec:
https://github.com/tc39/ecma402/commit/f0f66cf

Our implementation is unaffected by this change. LibUnicode pre-computes
positive, negative, and signless format patterns, so we already format
negative infinity correctly. Also, the CLDR does not contain specific
locale-dependent strings for negative infinity anyways.
2022-01-02 20:07:03 +01:00
Timothy Flynn
9ce4ff4265 LibJS: Avoid crashing when the Unicode data generators are disabled
The general idea when ENABLE_UNICODE_DATABASE_DOWNLOAD is OFF has been
that the Intl APIs will provide obviously incorrect results, but should
not crash. This regressed a bit with NumberFormat and DateTimeFormat.
2021-12-22 17:30:43 +01:00
Timothy Flynn
2a7f36b392 LibJS+LibUnicode: Generate unique numeric symbol lists
There are 443 number system objects generated, each of which held an
array of number system symbols. Of those 443 arrays, only 39 are unique.

To uniquely store these, this change moves the generated NumericSymbol
enumeration to the public LibUnicode/NumberFormat.h header with a pre-
defined set of symbols that we need. This is to ensure the generated,
unique arrays are created in a known order with known symbols. While it
is unfortunate to no longer discover these symbols at generation time,
it does allow us to ignore unwanted symbols and perform less string-to-
enumeration conversions at lookup time.
2021-12-11 14:17:47 +00:00
Timothy Flynn
d2588d852b LibJS: Change all [[RelevantExtensionKeys]] to return constexpr arrays
There's no need to allocate a vector for this internal slot. Similar to
commit: bb11437792
2021-12-01 16:36:26 +00:00
Timothy Flynn
914675e826 LibJS+LibUnicode: Separate number formatting methods from Locale.h
Currently, we generate separate data files for locale and number format
related tables/methods, but provide public accessors for all of the data
in one Locale.h file. Rather than continuing this trend for date-time,
relative time, etc. formatting, it's a bit easier to reason about if the
public accessors are also in separate files.
2021-11-29 22:48:46 +00:00
Timothy Flynn
251f692440 LibJS: Re-implement SetNumberFormatDigitOptions AO
This is an editorial change in the Intl spec.

See: https://github.com/tc39/ecma402/commit/d89c84f
2021-11-24 14:17:15 +00:00
Timothy Flynn
a1d5849e67 LibJS: Implement unit number formatting 2021-11-16 23:14:09 +00:00
Timothy Flynn
04b8b87c17 LibJS+LibUnicode: Support multiple identifiers within format pattern
This wasn't the case for compact patterns, but unit patterns can contain
multiple (up to 2, really) identifiers that must each be recognized by
LibJS.

Each generated NumberFormat object now stores an array of identifiers
parsed. The format pattern itself is encoded with the index into this
array for that identifier, e.g. the compact format string "0K" will
become "{number}{compactIdentifier:0}".
2021-11-16 23:14:09 +00:00
Timothy Flynn
3b68370212 LibJS+LibUnicode: Rename the generated compact_identifier to identifier
This field is currently used to store the StringView into the compact
name/symbol in the format string. Units will need to store a similar
field, so rename the field to be more generic, and extract the parser
for it.
2021-11-16 23:14:09 +00:00
Timothy Flynn
6d34a0b4e8 LibJS+LibUnicode: Rename method to select a NumberFormat plurality
Instead of currency pattern lookups within select_currency_unit_pattern,
rename the method to select_pattern_with_plurality and accept any list
of patterns. This method will be needed for units.
2021-11-16 23:14:09 +00:00
Timothy Flynn
99c15741ba LibJS: Conditionally ignore [[UseGrouping]] in compact notation 2021-11-16 00:56:55 +00:00
Timothy Flynn
14aca03161 LibJS: Remove FIXME comment from PartitionNotationSubPattern AO
All possible patterns generated by LibUnicode are now handled. We have a
similar VERIFY_NOT_REACHED in PartitionNumberPattern.
2021-11-16 00:56:55 +00:00
Timothy Flynn
fdae323401 LibJS: Implement compact formatting for Intl.NumberFormat 2021-11-16 00:56:55 +00:00
Timothy Flynn
80b86d20dc LibJS: Cache the number format used for compact notation
Finding the best number format to use for compact notation involves
creating a Vector of all compact formats for the locale and looking for
the one that best matches the number's magnitude. ECMA-402 wants this
number format to be found multiple times, so cache the result for future
use.
2021-11-16 00:56:55 +00:00
Timothy Flynn
1f546476d5 LibJS+LibUnicode: Fix computation of compact pattern exponents
The compact scale of each formatting rule was precomputed in commit:
be69eae651

Using the formula: compact scale = magnitude - pattern scale

This computation was off-by-one.

For example, consider the format key "10000-count-one", which maps to
"00 thousand" in en-US. What we are really after is the exponent that
best represents the string "thousand" for values greater than 10000
and less than 100000 (the next format key). We were previously doing:

    log10(10000) - "00 thousand".count("0") = 2

Which clearly isn't what we want. Instead, if we do:

    log10(10000) + 1 - "00 thousand".count("0") = 3

We get the correct exponent for each format key for each locale.

This commit also renames the generated variable from "compact_scale" to
"exponent" to match the terminology used in ECMA-402.
2021-11-16 00:56:55 +00:00
Timothy Flynn
4d79ab6866 LibJS: Implement engineering and scientific number formatting 2021-11-14 17:00:35 +00:00
Timothy Flynn
15c5fbd9e9 LibJS: Implement number grouping for Intl.NumberFormat
For example, in en-US, the number 123456 should be formatted as the
string "123,456". In en-IN, it should be formatted as "1,23,456".
2021-11-14 10:35:19 +00:00
Timothy Flynn
3450def494 LibJS: Implement Intl.NumberFormat.prototype.formatToParts 2021-11-13 19:01:25 +00:00
Timothy Flynn
c65dea64bd LibJS+LibUnicode: Don't remove {currency} keys in GetNumberFormatPattern
In order to implement Intl.NumberFormat.prototype.formatToParts, do not
replace {currency} keys in the format pattern before ECMA-402 tells us
to. Otherwise, the array return by formatToParts will not contain the
expected currency key.

Early replacement was done to avoid resolving the currency display more
than once, as it involves a couple of round trips to search through
LibUnicode data. So this adds a non-standard method to NumberFormat to
do this resolution and cache the result.

Another side effect of this change is that LibUnicode must replace unit
format patterns of the form "{0} {1}" during code generation. These were
previously skipped during code generation because LibJS would just
replace the keys with the currency display at runtime. But now that the
currency display injection is delayed, any {0} or {1} keys in the format
pattern will cause PartitionNumberPattern to abort.
2021-11-13 19:01:25 +00:00
Timothy Flynn
d872d030f1 LibJS: Avoid potential for dangling string views in partition patterns
There aren't any dangling views in as of yet, but a subsequent commit
will cause the "part" variable to be a view into an internally generated
string. Therefore, after returning from PartitionNumberPattern, that
view will be pointed at freed memory.

This commit is to set the precendence of not returning a view to "part".
2021-11-13 19:01:25 +00:00
Timothy Flynn
a701ed52fc LibJS+LibUnicode: Fully implement currency number formatting
Currencies are a bit strange; the layout of currency data in the CLDR is
not particularly compatible with what ECMA-402 expects. For example, the
currency format in the "en" and "ar" locales for the Latin script are:

    en: "¤#,##0.00"
    ar: "¤\u00A0#,##0.00"

Note how the "ar" locale has a non-breaking space after the currency
symbol (¤), but "en" does not. This does not mean that this space will
appear in the "ar"-formatted string, nor does it mean that a space won't
appear in the "en"-formatted string. This is a runtime decision based on
the currency display chosen by the user ("$" vs. "USD" vs. "US dollar")
and other rules in the Unicode TR-35 spec.

ECMA-402 shies away from the nuances here with "implementation-defined"
steps. LibUnicode will store the data parsed from the CLDR however it is
presented; making decisions about spacing, etc. will occur at runtime
based on user input.
2021-11-13 11:52:45 +00:00
Timothy Flynn
89523f70cf LibJS: Begin implementing Intl.NumberFormat.prototype.format
There is quite a lot to be done here so this is just a first pass at
number formatting. Decimal and percent formatting are mostly working,
but only for standard and compact notation (engineering and scientific
notation are not implemented here). Currency formatting is parsed, but
there is more work to be done to handle e.g. using symbols instead of
currency codes ("$" instead of "USD"), and putting spaces around the
currency symbol ("USD 2.00" instead of "USD2.00").
2021-11-12 09:17:08 +00:00
Linus Groh
b7e5f08e56 LibJS: Convert Object::get() to ThrowCompletionOr
To no one's surprise, this patch is pretty big - this is possibly the
most used AO of all of them. Definitely worth it though.
2021-10-03 20:14:03 +01:00
Idan Horowitz
768009e005 LibJS: Convert NumberFormat AOs to ThrowCompletionOr 2021-09-18 22:59:15 +03:00
Idan Horowitz
407cf04884 LibJS: Convert get_number_option() to ThrowCompletionOr 2021-09-18 22:21:15 +03:00
Idan Horowitz
6d3de03549 LibJS: Convert default_number_option() to ThrowCompletionOr 2021-09-18 22:21:15 +03:00
Idan Horowitz
b9c7a629f8 LibJS: Convert coerce_options_to_object() to ThrowCompletionOr 2021-09-18 22:21:15 +03:00
Idan Horowitz
de9785b71b LibJS: Convert Intl::get_option() to ThrowCompletionOr 2021-09-18 22:21:15 +03:00
Idan Horowitz
3758e65293 LibJS: Convert canonicalize_locale_list() to ThrowCompletionOr 2021-09-18 22:21:15 +03:00
Timothy Flynn
7769cd2cab LibJS: Move number_format_relevant_extension_keys to Intl.NumberFormat
This method represents the Intl.NumberFormat's [[RelevantExtensionKeys]]
internal slot, so it makes more sense for this to be directly in the
class itself.
2021-09-12 12:57:17 +01:00
Timothy Flynn
94a5a0437c LibJS: Move Intl.NumberFormat's AOs to its object file 2021-09-12 12:57:17 +01:00
Timothy Flynn
07f12b108b LibJS: Implement a nearly empty Intl.NumberFormat object
This adds plumbing for the Intl.NumberFormat object, constructor, and
prototype.
2021-09-11 11:05:50 +01:00