0ct0pu5/ladybird

Author	SHA1	Message	Date
Timothy Flynn	89d1813b5d	LibUnicode: Move CLDR data generators to a LibLocale subfolder To prepare for placing all CLDR generated data in a new library, LibLocale, this moves the code generators for the CLDR data to the LibLocale subfolder.	2022-09-05 14:37:16 -04:00
Timothy Flynn	becec3578f	LibTimeZone+LibUnicode: Generate string data with run-length encoding Currently, the unique string lists are stored in the initialized data sections of their shared libraries. In order to move the data to the read-only section, generate the strings using RLE arrays. We generate two arrays: the first is the RLE data itself, the second is a list of indices into the RLE array for each string. We then generate a decoding method to convert an RLE string to a StringView.	2022-08-16 16:56:17 +02:00
Timothy Flynn	ae2acc8cdf	LibJS+LibUnicode: Generate a set of default DateTimeFormat patterns This isn't called out in TR-35, but before ICU even looks at CLDR data, it adds a hard-coded set of default patterns to each locale's calendar. It has done this since 2006 when its DateTimeFormat feature was first created. Several test262 tests depend on this, which, under ECMA-402, falls into "implementation defined" behavior. For compatibility, we can do the same in LibUnicode.	2022-07-22 23:51:56 +01:00
Timothy Flynn	32c07bc6c3	LibUnicode: Generate per-locale data for the "noon" fixed day period Note that not all locales have this day period.	2022-07-21 20:36:03 +01:00
Timothy Flynn	16b673eaa9	LibUnicode: Check whether a calendar symbol for a locale actually exists In the generated unique string list, index 0 is the empty string, and is used to indicate a value doesn't exist in the CLDR. Check for this before returning an empty calendar symbol. For example, an upcoming commit will add the fixed day period "noon", which not all locales support.	2022-07-21 20:36:03 +01:00
Timothy Flynn	0f26ab89ae	LibJS+LibUnicode: Handle flexible day periods on both sides of midnight Commit `ec7d535` only partially handled the case of flexible day periods rolling over midnight, in that it only worked for hours after midnight. For example, the en locale defines a day period range of [21:00, 06:00). The previous method of adding 24 hours to the given hour would change e.g. 23:00 to 47:00, which isn't valid.	2022-07-21 20:36:03 +01:00
Timothy Flynn	b24b9c0a65	LibUnicode: Fall back to per-locale default calendars When patterns, symbols, etc. for a requested calendar are not found, use the locale's default calendar.	2022-07-15 12:31:43 +02:00
sin-ack	3f3f45580a	Everywhere: Add sv suffix to strings relying on StringView(char const*) Each of these strings would previously rely on StringView's char const* constructor overload, which would call __builtin_strlen on the string. Since we now have operator ""sv, we can replace these with much simpler versions. This opens the door to being able to remove StringView(char const*). No functional changes.	2022-07-12 23:11:35 +02:00
sin-ack	7456904a39	Meta+Userland: Simplify some formatters These are mostly minor mistakes I've encountered while working on the removal of StringView(char const*). The usage of builder.put_string over Format<FormatString>::format is preferable as it will avoid the indirection altogether when there's no formatting to be done. Similarly, there is no need to do format(builder, "{}", number) when builder.put_u64(number) works equally well. Additionally, a few Strings where only constant strings were used are replaced with StringViews.	2022-07-12 23:11:35 +02:00
Timothy Flynn	12e7c0808a	LibUnicode: Generate per-region week data This includes: * The minimum number of days in a week for that week to count as the first week of a new year. * The day to be shown as the first day of the week in a calendar. * The start/end days of the weekend. Like the existing hour cycle data, week data is presented per-region in the CLDR, rather than per-locale. The method to add likely subtags to a locale to perform region lookups is the same. The list of regions in the CLDR for hour cycle, minimum days, first day, and weekend days are quite different. So rather than changing the existing HourCycleRegion enum to a generic Region enum, we generate separate enums for each of the week data fields. This allows each lookup into these fields to remain simple array-based index access, without any "jumps" for regions that don't have CLDR data for a field.	2022-07-06 16:56:42 +02:00
DexesTTP	7ceeb74535	AK: Use an enum instead of a bool for String::replace(all_occurences) This commit has no behavior changes. In particular, this does not fix any of the wrong uses of the previous default parameter (which used to be 'false', meaning "only replace the first occurrence in the string"). It simply replaces the default uses by String::replace(..., ReplaceMode::FirstOnly), leaving them incorrect.	2022-07-06 11:12:45 +02:00
Timothy Flynn	70ede2825e	LibUnicode: Use BCP 47 data to filter valid calendar names	2022-02-16 07:23:07 -05:00
Timothy Flynn	63c3437274	LibUnicode: Use BCP 47 data to generate available calendars and numbers BCP 47 will be the single source of truth for known calendar and number system keywords, and their aliases (e.g. "gregory" is an alias for "gregorian"). Move the generation of available keywords to where we parse the BCP 47 data, so that hard-coded aliases may be removed from other generators.	2022-02-16 07:23:07 -05:00
Timothy Flynn	ca3bcf201f	LibUnicode: Port the CLDR date format generator to the stream API	2022-02-14 11:39:46 -05:00
Timothy Flynn	6efbafa6e0	Everywhere: Update copyrights with my new serenityos.org e-mail :^)	2022-01-31 18:23:22 +00:00
Timothy Flynn	ebd33e580b	LibUnicode: Generate a list of available calendars	2022-01-31 00:32:41 +00:00
Timothy Flynn	589e7354fb	LibUnicode: Remove extraneous semi-colons at end of generator functions	2022-01-27 21:16:44 +00:00
Timothy Flynn	4400150cd2	LibJS+LibUnicode: Return the appropriate time zone name depending on DST	2022-01-19 21:20:41 +00:00
Timothy Flynn	bf677eb485	LibUnicode: Generate both standard and daylight time zone names While LibTimeZone didn't support DST, we only generated one of them, preferring the standard name. Now that DST can be tested, generate both names.	2022-01-19 21:20:41 +00:00
Timothy Flynn	bdf02c21e1	LibUnicode: Swap the preferred order of standard time zone display names Our generator is currently preferring the DST variant of the time zone display names over the non-DST variant. LibTimeZone currently does not have DST support, and operates in a mode that basically assumes DST does not exist. Swap the display names for now just to be consistent until we have DST support. Note we will need to generate both of these variants and select the appropriate one at runtime once we have DST support.	2022-01-12 15:43:12 +01:00
Timothy Flynn	e2dfbe8f67	LibUnicode: Parse and generate long and short generic time zone names This implements the CalendarPatternStyle::{Long,Short}Generic styles of time zone name formatting.	2022-01-11 23:56:35 +01:00
Timothy Flynn	8d35563f28	LibUnicode: Implement TR-35's localized GMT offset formatting This adds an API to use LibTimeZone to convert a time zone such as "America/New_York" to a GMT offset string like "GMT-5" (short form) or "GMT-05:00" (long form).	2022-01-11 23:56:35 +01:00
Timothy Flynn	b543c3e490	Meta: Don't assume how each generator wants to generate keyed map names The generate_mapping helper generates a series of structs like: Array<SomeType, 1> s_mapping_key_0 {}; Array<SomeType, 2> s_mapping_key_1 {}; Array<SomeType, 3> s_mapping_key_2 {}; Array<Span<SomeType const>> s_mapping { { s_mapping_key_0.span(), s_mapping_key_1.span(), s_mapping_key_2.span(), } }; Where the names of the struct were generated by the format_mapping_name lambda inside the helper. Rather than this lambda making assumptions on how each generator wants to name its structs, add a parameter for the caller to provide a naming formatter. This is because the TimeZoneData generator will want pretty specific identifier formatting rules.	2022-01-11 00:36:45 +01:00
Timothy Flynn	498b741434	LibUnicode: Use LibTimeZone's list of time zone names LibUnicode no longer needs to generate a list of time zone names that it parsed from metaZones.json. We can defer to the TZDB for a golden list of time zones.	2022-01-08 12:45:34 +01:00
Timothy Flynn	ca9123f66f	LibUnicode: Rename DateTimeFormat's generator's TimeZone struct Before using LibTimeZone within LibUnicode, rename this structure to avoid naming conflicts with the TimeZone namespace.	2022-01-08 12:45:34 +01:00
Timothy Flynn	6d7d9dd324	LibUnicode: Do not assume time zones & meta zones have a 1-to-1 mapping The generator parses metaZones.json to form a mapping of meta zones to time zones (AKA "golden zone" in TR-35). This parser errantly assumed this was a 1-to-1 mapping.	2022-01-06 22:28:01 +01:00
Timothy Flynn	62d8d1fdfd	LibUnicode: Move UTC verification to the scope that requires it In Unicode::get_time_zone_name(), we don't need to require that the time zone is UTC for long- and short-style name lookups. This is required for other styles, because they will depend on TZDB data - so move the VERIFY to that scope.	2022-01-06 22:28:01 +01:00
Timothy Flynn	ec7d5351ed	LibJS+LibUnicode: Handle flexible day periods that roll over midnight When searching for the locale-specific flexible day period for a given hour, we were neglecting to handle cases where the period crosses 00:00. For example, the en locale defines a day period range of [21:00, 06:00). When given the hour of 05:00, we were checking if (21 <= 5 && 5 < 6), thus not recognizing that the hour falls in that period.	2022-01-05 16:22:55 +01:00
Timothy Flynn	ba4cdf34f8	LibUnicode: Convert UnicodeDateTimeFormat to link with weak symbols	2022-01-04 22:49:43 +00:00
Timothy Flynn	126a3fe180	LibUnicode: Add minimal support for generic & offset-based time zones ECMA-402 now supports short-offset, long-offset, short-generic, and long-generic time zone name formatting. For example, in the en-US locale the America/Eastern time zone would be formatted as: short-offset: GMT-5 long-offset: GMT-05:00 short-generic: ET long-generic: Eastern Time We currently only support the UTC time zone, however. Therefore, this very minimal implementation does not consider GMT offset or generic display names. Instead, the CLDR defines specific strings for UTC.	2022-01-03 15:11:59 +01:00
Timothy Flynn	15e1498419	LibUnicode: Dynamically load the generated UnicodeDateTimeFormat symbols	2021-12-21 13:09:49 -08:00
Michel Hermier	060e5ccbbc	Lagom: Bind `time_zone_list_index_type` in the generator The variable `s_time_zone_list_index_type` seems to be unused (detected when compiling with clang), and it seems logical to bind it even if it is not used for now.	2021-12-18 21:01:10 -08:00
Timothy Flynn	6e5f0b139b	LibUnicode: Remove unused fields from generated structures A couple of structures held a string index that is unused. Removing them also removes the string values from the unique string list.	2021-12-13 21:28:56 -08:00
Timothy Flynn	77fc877c04	LibUnicode: Generate unique lists of hour cycles	2021-12-13 21:28:56 -08:00
Timothy Flynn	6f17696176	LibUnicode: Generate unique lists of time zone structures	2021-12-13 21:28:56 -08:00
Timothy Flynn	df33156462	LibUnicode: Generate unique lists of day period structures	2021-12-13 21:28:56 -08:00
Timothy Flynn	265785e847	LibUnicode: Generate unique day period structures	2021-12-13 21:28:56 -08:00
Timothy Flynn	7af1818e76	LibUnicode: Generate unique time zone structures Each of the 374 locales contain 156 time zone structures. Of these 58,344 structures, 13,578 are unique.	2021-12-13 21:28:56 -08:00
Timothy Flynn	b14b37f386	LibUnicode: Generate unique calendar structures Of the 374 generated calendars, 173 are unique.	2021-12-13 21:28:56 -08:00
Timothy Flynn	4b721597d7	LibUnicode: Generate unique lists of calendar range patterns Of the 374 range pattern lists and 374 range12 pattern lists, 230 are unique.	2021-12-13 21:28:56 -08:00
Timothy Flynn	9fc2442e7d	LibUnicode: Generate unique lists of calendar patterns Of the 374 generated lists, 152 are unique. These lists have upwards of 1000 entries as well, so the de-duplication is particularly nice.	2021-12-13 21:28:56 -08:00
Timothy Flynn	09547f4084	LibUnicode: Generate unique lists of calendar symbols structures Of the 374 generated lists, 120 are unique.	2021-12-13 21:28:56 -08:00
Timothy Flynn	f681ec9d98	LibUnicode: Generate unique calendar symbols structures Each of the 374 generated calendars include 4 symbols structures. Of these 1496 structures, only 386 are unique.	2021-12-13 21:28:56 -08:00
Timothy Flynn	62ff029890	LibUnicode: Generate CalendarSymbols in a predetermined order Similar to commit `2a7f36b392`, this change moves the generated CalendarSymbol enumeration to the public LibUnicode/NumberFormat.h header with a pre-defined set of symbols that we need. This is to prepare for uniquely generating the CalendarSymbols structure.	2021-12-13 21:28:56 -08:00
Timothy Flynn	cf8ef954e5	LibUnicode: Generate unique lists of calendar symbols Each of the 374 generated calendars include 4 sets of symbols, each of which have 3 lists of symbols (narrow, short, long). Of these 4488 lists, only 819 are unique.	2021-12-13 21:28:56 -08:00
Timothy Flynn	af7caa97c8	LibUnicode: Generate unique calendar format structures There are currently 374 calendars generated, each of which include 3 CalendarFormat structures. Of these 1122 instances, only 167 are unique.	2021-12-13 21:28:56 -08:00
Timothy Flynn	a417c23de0	LibUnicode: Parse and generate per-locale day period ranges	2021-12-10 21:27:24 +00:00
Timothy Flynn	fa8e881cfa	LibUnicode: Parse and generate secondary day period symbols Generate morning2, afternoon2, evening2, and night2 symbols.	2021-12-10 21:27:24 +00:00
Timothy Flynn	76aab821f4	LibJS+LibUnicode: Rename some Unicode::DayPeriod values In the CLDR, there aren't "night" values, there are "night1" & "night2" values. This is for locales which use a different name for nighttime depending on the hour. For example, the ja locale uses "夜" between the hours of 19:00 and 23:00, and "夜中" between the hours of 23:00 and 04:00. Our CLDR parser is currently ignoring "night2", so this rename is to prepare for that. We could probably come up with better names, but in the end, the API in LibUnicode will be such that outside callers won't even see Night1, etc.	2021-12-10 21:27:24 +00:00
Timothy Flynn	9d4c4303fd	LibUnicode: Parse and generate date time range format patterns	2021-12-09 23:43:04 +00:00
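
Commits `ec7d5351ed` and `0f26ab89ae` describe matching an hour against a flexible day period whose range may cross midnight, such as the en locale's [21:00, 06:00). The following is a minimal sketch of one way to do that check without adding 24 to the hour. The `DayPeriodRange` struct and `contains_hour` function are hypothetical names used for illustration and are not taken from LibUnicode.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical half-open hour range [begin, end); not LibUnicode's actual type.
struct DayPeriodRange {
    uint8_t begin;
    uint8_t end;
};

// Returns true if `hour` (0-23) falls inside `range`.
// A range with begin > end is treated as wrapping past midnight.
inline bool contains_hour(DayPeriodRange range, uint8_t hour)
{
    assert(hour < 24);

    // Ordinary case, e.g. [06:00, 12:00).
    if (range.begin <= range.end)
        return range.begin <= hour && hour < range.end;

    // Wrapping case, e.g. [21:00, 06:00): the hour matches if it is on
    // either side of midnight, so no out-of-range hour like 47 is produced.
    return hour >= range.begin || hour < range.end;
}
```

With the range `{ 21, 6 }`, both 23 and 5 are inside the period, while 6 and 12 are not.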

68 commits
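
Commit `becec3578f` mentions two generated arrays (the encoded string data and a list of per-string indices into it) plus a decoding method that yields a string view, and commit `16b673eaa9` notes that index 0 is the empty string. The sketch below illustrates that two-array shape under a loud assumption: each string is stored as a two-byte big-endian length followed by its bytes. The commit messages do not spell out the layout, so this encoding, the sample strings, and every name here are hypothetical, and `std::string_view` stands in for AK's `StringView`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical encoded data: 2-byte big-endian length, then the string bytes.
// Encodes "", "noon", and "am" in that order.
static constexpr std::array<uint8_t, 12> s_encoded_strings { {
    0, 0,
    0, 4, 'n', 'o', 'o', 'n',
    0, 2, 'a', 'm',
} };

// Offset of each string's length prefix inside s_encoded_strings.
static constexpr std::array<uint32_t, 3> s_string_offsets { { 0, 2, 8 } };

// Returns a view into the constant data; nothing is copied or allocated.
inline std::string_view decode_string(size_t index)
{
    assert(index < s_string_offsets.size());

    auto offset = s_string_offsets[index];
    size_t length = (static_cast<size_t>(s_encoded_strings[offset]) << 8)
        | s_encoded_strings[offset + 1];

    auto const* data = reinterpret_cast<char const*>(s_encoded_strings.data() + offset + 2);
    return { data, length };
}

// Index 0 is the empty string, which signals "no value for this locale".
inline bool has_value(size_t index)
{
    return !decode_string(index).empty();
}
```

Because both arrays are `constexpr` and hold only integers, a compiler can place them in a read-only section, which is the motivation the commit message gives for moving away from initialized string lists.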