0ct0pu5/ladybird

Author	SHA1	Message	Date
Timothy Flynn	dd89901b07	LibUnicode: Use GenericLexer to parse Unicode language IDs. This is preparatory work to read locale extensions. The parser currently enforces that the entire string is consumed. But to parse extensions, parse_unicode_locale_id() will need parse_unicode_language_id() to just stop parsing on the first segment that does not match the language ID grammar. It will also need to know where the parsing stopped. Both of these needs are fulfilled by GenericLexer. The caveat is that we can no longer simply split the parsed string on separator characters. So parse_unicode_language_id() now operates as a small state machine.	2021-08-30 19:42:40 +01:00
Timothy Flynn	8b93d51212	LibUnicode: Parse Unicode CLDR currencies and generate locale mappings	2021-08-27 12:32:24 +01:00
Timothy Flynn	0f02def3c2	LibUnicode: Parse Unicode CLDR scripts and generate locale mappings	2021-08-27 12:32:24 +01:00
Timothy Flynn	ab7a1dd89e	LibUnicode: Parse Unicode CLDR languages and generate locale mappings	2021-08-27 12:32:24 +01:00
Timothy Flynn	6719e5cb17	LibUnicode: Generate locale subtag data as multiple smaller tables. This commit is preemptive to upcoming commits which add more subtags to the CLDR generator. Rather than generating a giant HashMap containing all data, generate more (smaller) Array-based tables. This mimics the UCD generator. This also allows simpler lookups at runtime since we can generate index-based lookups into the smaller tables rather easily. Without this change, adding the remaining locale subtags would result in the generation and compilation of UnicodeLocale.cpp taking about 30s on my machine. With this change, it takes about half that. Additionally, the size of the generated file reduces by about 1.5MB.	2021-08-27 12:32:24 +01:00
Timothy Flynn	137e98cb6f	LibUnicode: Add public accessors to generated locale data	2021-08-26 22:04:09 +01:00
Timothy Flynn	b7a95cba65	LibUnicode: Implement grammar validators for Unicode TR-35. ECMA-402 requires validating user input against the EBNF grammar for Unicode locales described in TR-35: https://www.unicode.org/reports/tr35 This commit adds validators for that grammar, as well as other helpers to, e.g., canonicalize a locale string.	2021-08-26 22:04:09 +01:00

7 commits
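
The state-machine approach described in dd89901b07 can be illustrated with a rough, standalone sketch. This is not the LibUnicode code: it uses std::string_view instead of GenericLexer, and the names LanguageID, consumed, and parse_language_id_prefix are hypothetical. It also assumes a simplified form of the TR-35 unicode_language_id grammar (a language subtag must come first; the "root" and script-first forms are omitted). It shows the two properties the commit message asks for: parsing stops at the first segment that does not match, and the caller learns where it stopped.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical result type; not the structure used by LibUnicode.
struct LanguageID {
    std::string language;
    std::optional<std::string> script;
    std::optional<std::string> region;
    std::vector<std::string> variants;
    std::size_t consumed { 0 }; // Offset in the input where parsing stopped.
};

static bool all_of_class(std::string_view s, int (*predicate)(int))
{
    for (unsigned char c : s) {
        if (!predicate(c))
            return false;
    }
    return true;
}

// Subtag shapes from TR-35: language = alpha{2,3} | alpha{5,8}, script = alpha{4},
// region = alpha{2} | digit{3}, variant = alphanum{5,8} | digit alphanum{3}.
static bool is_language(std::string_view s)
{
    auto n = s.size();
    return ((n >= 2 && n <= 3) || (n >= 5 && n <= 8)) && all_of_class(s, std::isalpha);
}
static bool is_script(std::string_view s) { return s.size() == 4 && all_of_class(s, std::isalpha); }
static bool is_region(std::string_view s)
{
    return (s.size() == 2 && all_of_class(s, std::isalpha)) || (s.size() == 3 && all_of_class(s, std::isdigit));
}
static bool is_variant(std::string_view s)
{
    if (s.size() >= 5 && s.size() <= 8)
        return all_of_class(s, std::isalnum);
    return s.size() == 4 && std::isdigit(static_cast<unsigned char>(s[0])) && all_of_class(s, std::isalnum);
}

// Parses the longest prefix of `input` that matches the simplified grammar.
// Returns nullopt if the input does not start with a language subtag.
std::optional<LanguageID> parse_language_id_prefix(std::string_view input)
{
    enum class State { Language, Script, Region, Variant };

    LanguageID result;
    State state = State::Language;
    std::size_t pos = 0;

    while (pos < input.size()) {
        std::size_t start = pos;
        if (state != State::Language) {
            // Every segment after the first is introduced by a separator.
            if (input[start] != '-' && input[start] != '_')
                break;
            ++start;
        }

        std::size_t end = start;
        while (end < input.size() && input[end] != '-' && input[end] != '_')
            ++end;
        std::string_view segment = input.substr(start, end - start);

        if (state == State::Language) {
            if (!is_language(segment))
                return std::nullopt;
            result.language = std::string(segment);
            state = State::Script;
        } else if (state == State::Script && is_script(segment)) {
            result.script = std::string(segment);
            state = State::Region;
        } else if ((state == State::Script || state == State::Region) && is_region(segment)) {
            result.region = std::string(segment);
            state = State::Variant;
        } else if (is_variant(segment)) {
            result.variants.emplace_back(segment);
            state = State::Variant;
        } else {
            break; // First non-matching segment: stop without consuming it.
        }

        pos = end; // Commit the segment only after it matched.
    }

    if (result.language.empty())
        return std::nullopt;
    result.consumed = pos;
    return result;
}
```

With an input such as "en-Latn-US-u-ca-gregory", this sketch stops before "-u", so a caller in the role of parse_unicode_locale_id() could continue parsing extensions from the reported offset.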