0ct0pu5/ladybird

Author	SHA1	Message	Date
Timothy Flynn	4f2bcebe74	LibUnicode+LibJS: Store locale keyword values as a single string Previously, LibUnicode would store the values of a keyword as a Vector. For example, the locale "en-u-ca-abc-def" would have its keyword "ca" stored as {"abc", "def"}. Then, canonicalization would occur on each of the elements in that Vector. This is incorrect because, for example, the keyword value "true" should only be dropped if that is the entire value. That is, the canonical form of "en-u-kb-true" is "en-u-kb", but "en-u-kb-abc-true" is unchanged by canonicalization. However, we would canonicalize that locale as "en-u-kb-abc".	2021-09-08 21:08:48 +01:00
Timothy Flynn	a05419db55	LibUnicode: Add lexer to test if a string matches the "type" production	2021-09-02 17:56:42 +01:00
Timothy Flynn	fd0011989a	LibUnicode: Resolve the most likely territory alias when there are many	2021-09-01 14:14:47 +01:00
Timothy Flynn	72f49e42b4	LibUnicode: Perform complex Unicode locale alias substitution	2021-09-01 14:14:47 +01:00
Timothy Flynn	da89cf9afb	LibUnicode: Canonicalize calendar subtags Calendar subtags are a bit of an odd-man-out in that we must match the variants "ethiopic-amete-alem" in that order, without any other variant in the locale. So a separate method is needed for this, and we now defer sorting the variant list until after other canonicalization is done.	2021-09-01 14:14:47 +01:00
Timothy Flynn	8458f477a4	LibUnicode: Canonicalize timezone subtags	2021-09-01 14:14:47 +01:00
Timothy Flynn	335f985b31	LibUnicode: Canonicalize the subtag "imperial" to "uksystem"	2021-09-01 14:14:47 +01:00
Timothy Flynn	2d90144888	LibUnicode: Canonicalize the subtags "primary" and "tertiary" to "levelN"	2021-09-01 14:14:47 +01:00
Timothy Flynn	409f39b336	LibUnicode: Canonicalize the subtag "names" to "prprname"	2021-09-01 14:14:47 +01:00
Timothy Flynn	f907a7dc38	LibUnicode: Canonicalize the subtag "yes" to "true"	2021-09-01 14:14:47 +01:00
Timothy Flynn	556374a904	LibUnicode: Substitute Unicode locale aliases during canonicalization Unicode TR35 defines how locale subtag aliases should be emplaced when converting a locale to canonical form. For most subtags, it is a simple substitution. Language subtags depend on context; for example, the language "sh" should become "sr-Latn", but if the original locale has a script subtag already ("sh-Cyrl"), then only the language subtag of the alias should be taken ("sr-Cyrl"). To facilitate this, we now make two passes when canonicalizing a locale. In the first pass, we convert the LocaleID structure to canonical syntax (where the conversions all happen in-place). In the second pass, we form the canonical string based on the canonical syntax.	2021-09-01 14:14:47 +01:00
Timothy Flynn	d13142f015	LibJS+LibUnicode: Store parsed Unicode locale data as full strings Originally, it was convenient to store the parsed Unicode locale data as views into the original string being parsed. But to implement locale aliases will require mutating the data that was parsed. To prepare for that, store the parsed data as proper strings.	2021-09-01 14:14:47 +01:00
Timothy Flynn	f897c2edb3	LibUnicode: Canonicalize locale private use extensions	2021-08-30 19:42:40 +01:00
Timothy Flynn	6f0cb52dc4	LibUnicode: Canonicalize locale extensions	2021-08-30 19:42:40 +01:00
Timothy Flynn	30855e6663	LibUnicode: Parse locale private use extensions	2021-08-30 19:42:40 +01:00
Timothy Flynn	29f76ef7c8	LibUnicode: Parse locale extensions of the other extension form	2021-08-30 19:42:40 +01:00
Timothy Flynn	d2d304fcf8	LibUnicode: Parse locale extensions of the transformed extension form	2021-08-30 19:42:40 +01:00
Timothy Flynn	eda92d15e4	LibUnicode: Parse locale extensions of the Unicode locale extension form	2021-08-30 19:42:40 +01:00
Timothy Flynn	b7a95cba65	LibUnicode: Implement grammar validators for Unicode TR-35 ECMA-402 requires validating user input against the EBNF grammar for Unicode locales described in TR-35: https://www.unicode.org/reports/tr35 This commit adds validators for that grammar, as well as other helpers, e.g. to canonicalize a locale string.	2021-08-26 22:04:09 +01:00
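
The context-dependent language alias rule described in commit 556374a904 can be sketched roughly as follows. This is an illustrative Python sketch, not the LibUnicode code (which is C++): the `LANGUAGE_ALIASES` table and both function names are hypothetical, and the table holds only the single "sh" example from the commit message.

```python
# Hypothetical alias table: language subtag -> (replacement language, replacement script).
# Only the "sh" example from the commit message is included here.
LANGUAGE_ALIASES = {"sh": ("sr", "Latn")}


def substitute_language_alias(language, script=None):
    """Return (language, script) after applying a language alias, if one exists."""
    alias = LANGUAGE_ALIASES.get(language)
    if alias is None:
        # No alias known for this language: leave both subtags untouched.
        return (language, script)
    alias_language, alias_script = alias
    # A script already present in the original locale wins over the alias's script.
    chosen_script = script if script is not None else alias_script
    return (alias_language, chosen_script)


def format_locale(language, script=None):
    """Join the subtags that are present into a locale string."""
    return "-".join(part for part in (language, script) if part)
```

With this sketch, `format_locale(*substitute_language_alias("sh"))` gives "sr-Latn", while `format_locale(*substitute_language_alias("sh", "Cyrl"))` keeps the original script and gives "sr-Cyrl".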

19 commits
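
The difference that commit 4f2bcebe74 describes, between canonicalizing a keyword value subtag by subtag and canonicalizing it as one string, can be illustrated with a small sketch. It is written in Python for brevity and is not the LibUnicode implementation; all function names are hypothetical, and only the "true" rule from the commit message is modelled.

```python
def canonicalize_value_per_subtag(value):
    """Old behaviour as described: each subtag is examined on its own."""
    subtags = value.lower().split("-")
    # Every "true" subtag is dropped, even when it is only part of the value.
    return "-".join(subtag for subtag in subtags if subtag != "true")


def canonicalize_value_as_whole(value):
    """New behaviour as described: the value is treated as a single string."""
    value = value.lower()
    # "true" is dropped only when it is the entire value.
    return "" if value == "true" else value


def format_keyword(key, value, canonicalize):
    """Build the "key[-value]" portion of a Unicode locale extension."""
    canonical = canonicalize(value)
    return f"{key}-{canonical}" if canonical else key
```

For the keyword "kb", both functions reduce the value "true" to an empty string, so the result is "kb". For the value "abc-true", the per-subtag version produces "kb-abc", while the whole-string version leaves it as "kb-abc-true", matching the behaviour the commit message calls correct.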