beenull/ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2024-11-25 17:10:23 +00:00

Author	SHA1	Message	Date
Andreas Kling	ecccd511fa	Meta: Run QEMU with QMP socket This allows external connections to the QEMU monitor via QMP.	2021-12-11 20:13:36 +01:00
Timothy Flynn	1e95e7716b	LibUnicode: Generate unique units	2021-12-11 14:17:47 +00:00
Timothy Flynn	4c2c8b8e33	LibUnicode: Generate unique number systems	2021-12-11 14:17:47 +00:00
Timothy Flynn	2a7f36b392	LibJS+LibUnicode: Generate unique numeric symbol lists There are 443 number system objects generated, each of which held an array of number system symbols. Of those 443 arrays, only 39 are unique. To uniquely store these, this change moves the generated NumericSymbol enumeration to the public LibUnicode/NumberFormat.h header with a pre-defined set of symbols that we need. This is to ensure the generated, unique arrays are created in a known order with known symbols. While it is unfortunate to no longer discover these symbols at generation time, it does allow us to ignore unwanted symbols and perform fewer string-to-enumeration conversions at lookup time.	2021-12-11 14:17:47 +00:00
Timothy Flynn	9cc323b0b0	LibUnicode: Generate unique NumberFormat lists for each Unit	2021-12-11 14:17:47 +00:00
Timothy Flynn	cdbfe01827	LibUnicode: Generate unique NumberFormat lists for each NumberSystem	2021-12-11 14:17:47 +00:00
Timothy Flynn	76af9fae63	LibUnicode: Support storing lists in UniqueStorage for code generators The evolution of UniqueStorage has been as follows: 1. It was created as UniqueStringStorage to ensure only one copy of each unique string is generated. Interested parties stored an index into a unique string list, rather than the string itself. Commits: `f9e605397c` and `04e6b43f05` 2. It became apparent that non-string structures could also be de-duplicated to reduce the size of libunicode.so. UniqueStringStorage was generalized to UniqueStorage for this purpose. Commit: `d8e6beb14f` It's now also apparent that there's heavy duplication of lists of structures. For example, the NumberFormat generator stores 4 lists of NumberFormat objects. In total, we currently generate nearly 2,000 lists of these objects, of which 275 are unique. This change updates UniqueStorage to support storing lists. The only change is how the storage is generated - we generate each stored list individually, then an array storing spans of those lists.	2021-12-11 14:17:47 +00:00
Timothy Flynn	a417c23de0	LibUnicode: Parse and generate per-locale day period ranges	2021-12-10 21:27:24 +00:00
Timothy Flynn	fa8e881cfa	LibUnicode: Parse and generate secondary day period symbols Generate morning2, afternoon2, evening2, and night2 symbols.	2021-12-10 21:27:24 +00:00
Timothy Flynn	76aab821f4	LibJS+LibUnicode: Rename some Unicode::DayPeriod values In the CLDR, there aren't "night" values, there are "night1" & "night2" values. This is for locales which use a different name for nighttime depending on the hour. For example, the ja locale uses "夜" between the hours of 19:00 and 23:00, and "夜中" between the hours of 23:00 and 04:00. Our CLDR parser is currently ignoring "night2", so this rename is to prepare for that. We could probably come up with better names, but in the end, the API in LibUnicode will be such that outside callers won't even see Night1, etc.	2021-12-10 21:27:24 +00:00
Timothy Flynn	9d4c4303fd	LibUnicode: Parse and generate date time range format patterns	2021-12-09 23:43:04 +00:00
Timothy Flynn	fe84a365c2	LibUnicode: Parse and generate format pattern skeletons Pattern skeletons are more or less the "key" of format patterns. Every format pattern is assigned a skeleton. Interval patterns (which are not yet parsed) are also assigned a skeleton - this is used to match them to an "owning" format pattern. So we will use the skeleton generated here to match format patterns at runtime with their available interval patterns. An alternative approach would be to append interval patterns directly to their owning format pattern, but this has some drawbacks: 1. Skeletons aren't totally unique. A skeleton may appear in both the "dateFormats" and "availableFormats" objects, in which case the same interval formats would be generated more than once. 2. Otherwise unique format patterns may only differ by the interval patterns assigned to them. This would cause the UniqueStorage for the format patterns to increase in size, impacting both compile times and libunicode.so size.	2021-12-09 23:43:04 +00:00
Timothy Flynn	b17c6ab661	LibUnicode: Fix typo in format pattern parser See: https://unicode.org/reports/tr35/tr35-dates.html#dfst-day	2021-12-09 23:43:04 +00:00
Sam Atkins	c9062b4ed5	LibWeb: Remove now-unused CustomStyleValue	2021-12-09 21:30:31 +01:00
Timothy Flynn	b76e44f66f	LibUnicode: Parse and generate time zone names in long and short form	2021-12-08 11:29:36 +00:00
Timothy Flynn	2bbf8aa24c	LibUnicode: Generate era, month, weekday and day period calendar symbols The parsing in parse_calendar_symbols() might be a bit more verbose than it really needs to be, but it is to ensure the symbols are generated in a known order that we can control with enumerations.	2021-12-08 11:29:36 +00:00
Timothy Flynn	9f7c727720	LibJS+LibUnicode: Generate missing patterns with fractionalSecondDigits TR-35's Matching Skeleton algorithm dictates how user requests including fractional second digits should be handled when the CLDR format pattern does not include that field. When the format pattern contains {second}, but does not contain {fractionalSecondDigits}, generate a second pattern which appends "{decimal}{fractionalSecondDigits}" to the {second} field.	2021-12-08 11:29:36 +00:00
Timothy Flynn	6ace4000bf	LibJS+LibUnicode: Supply field type in CalendarPattern's for-each method Some callers will want different behavior depending on what field is being provided to the callback.	2021-12-08 11:29:36 +00:00
Timothy Flynn	80ea6e664d	LibUnicode: Do not set day period format length for {ampm} segments TR-35 does define lengths for {ampm}, but they are unused by ECMA-402. To the contrary, defining the day_period length for this segment will prevent BasicFormatMatcher from ever selecting a pattern that contains this segment. Instead, ECMA-402 will only use the short length for {ampm} segments.	2021-12-08 11:29:36 +00:00
Timothy Flynn	dfe8d02482	LibUnicode: Generate missing format patterns TR-35 describes how to combine date, time, and available formats with date-time format patterns to generate more available format patterns: https://unicode.org/reports/tr35/tr35-dates.html#Missing_Skeleton_Fields Use these steps to generate ~400 new patterns for each calendar. These are required for ECMA-402's BasicFormatMatcher to produce reasonable results.	2021-12-06 15:46:34 +01:00
Timothy Flynn	439b06bf0f	LibUnicode: Fully parse date-time formatting patterns Similar to NumberFormat, replace the segments of date-time patterns with partitions that can be split at runtime. Also generate the pattern style fields for e.g. era, day, hour, etc.	2021-12-06 15:46:34 +01:00
Timothy Flynn	2772606527	LibUnicode: Generate unique calendar pattern structures Add unique storage for parsed CalendarPattern structures to ensure only one copy of each structure is generated. This doesn't have any impact on libunicode.so with the current generated data. Rather, this prevents the amount of generated data from needlessly growing astronomically once date-time patterns are fully parsed. There will be 173,459 patterns parsed, of which only 22,495 (about 12%) are unique. This change will save a few MB, and will also help compilation times.	2021-12-06 15:46:34 +01:00
Timothy Flynn	1d735105c3	LibUnicode: Generate per-locale, per-calendar formats out of line Currently, there's only a handful of entries in these arrays, so it is not a huge deal to generate them inline with the struct that holds them. But they will each soon contain a few hundred entries. Generate them out of line for easier viewing in the generated code.	2021-12-06 15:46:34 +01:00
Timothy Flynn	945ca81dd7	LibUnicode: Generate unique number format structures Add unique storage for parsed NumberFormat structures to ensure only one copy of each structure is generated. Reduces libunicode.so on x86 from 13.2 MB to 11.4 MB.	2021-12-06 15:46:34 +01:00
Timothy Flynn	d8e6beb14f	LibUnicode: Generalize the generators' unique string storage UniqueStringStorage is used to ensure only one copy of a string will be generated, and interested parties store just an index into the generated storage. Generalize this class to allow any* type to be stored uniquely. * To actually be storable, the type must have both an AK::Format and an AK::Traits overload available.	2021-12-06 15:46:34 +01:00
Sam Atkins	16e5f24e64	Fuzzers: Cast unused smart-pointer return values to void	2021-12-05 15:31:03 +01:00
Sam Atkins	f3d8f80e9c	IPCCompiler: Cast return value of synchronous void IPC calls to void The synchronous call returns a NonnullOwnPtr that we don't use, so we have to cast to prevent a compiler warning once smart pointers become [[nodiscard]].	2021-12-05 15:31:03 +01:00
Idan Horowitz	a0e2fedc20	Kernel: Stub out the SO_DEBUG SOL_SOCKET-level option	2021-12-05 12:53:29 +01:00
Timothy Flynn	bf79c73158	LibUnicode: Do not generate data for "generic" calendars This is not a calendar supported by ECMA-402, so let's not waste space with its data. Further, don't generate "gregorian" as a valid Unicode locale extension keyword. It's an invalid type identifier, thus cannot be used in locales such as "en-u-ca-gregorian".	2021-12-01 16:36:26 +00:00
Timothy Flynn	7e6ad172a4	LibUnicode: Support code point names that apply to ranges of code points For example, consider the following adjacent entries in UnicodeData.txt: 3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;; 4DBF;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;; Our current implementation would assign the display name "CJK Ideograph Extension A" to code points U+3400 & U+4DBF, but not to the code points in between. Not only should those code points be assigned a name, but the Unicode spec also has formatting rules on what the names should be (the names for these ranged code points are not as they appear in UnicodeData.txt). The spec also defines names for code point ranges that actually are listed individually in UnicodeData.txt. For example: 2F800;CJK COMPATIBILITY IDEOGRAPH-2F800;Lo;0;L;4E3D;;;;N;;;;; 2F801;CJK COMPATIBILITY IDEOGRAPH-2F801;Lo;0;L;4E38;;;;N;;;;; 2F802;CJK COMPATIBILITY IDEOGRAPH-2F802;Lo;0;L;4E41;;;;N;;;;; Code points are only coalesced into a range if all fields after the name are equivalent. Our parser will insert the range and its name formatting pattern when it comes across the first code point in that range, then ignore other code points in that range. This reduces the number of names we generated by nearly 2,000.	2021-11-30 11:24:02 +01:00
Timothy Flynn	f2f4980f15	LibUnicode: Remove unused field from UnicodeData generator	2021-11-30 11:24:02 +01:00
Timothy Flynn	71903ea7e1	LibUnicode: Parse and generate calendar (ca) Unicode keywords Also removes a few fly-by "StringView x = nullptr;" unnecessary initializers.	2021-11-29 22:48:46 +00:00
Timothy Flynn	48ce72e472	LibUnicode: Parse and generate regional hour cycles Unlike most data in the CLDR, hour cycles are not stored on a per-locale basis. Instead, they are keyed by a string that is usually a region, but sometimes is a locale. Therefore, given a locale, to determine the hour cycles for that locale, we: 1. Check if the locale itself is assigned hour cycles. 2. If the locale has a region, check if that region is assigned hour cycles. 3. Otherwise, maximize that locale, and if the maximized locale has a region, check if that region is assigned hour cycles. 4. If the above all fail, fallback to the "001" region. Further, each locale's default hour cycle is the first assigned hour cycle.	2021-11-29 22:48:46 +00:00
Timothy Flynn	15fc03ef34	LibUnicode: Sort generated enums case-insensitively This hasn't mattered yet by chance, because the source for all enums contains names of the same case. But the enum generated for hour cycle regions will have mixed case. Sort them case-insensitively in order to traverse these names in the same order in both generate_enum and generate_mapping.	2021-11-29 22:48:46 +00:00
Timothy Flynn	7872934861	LibUnicode: Parse and generate available candidate format patterns These formats are used by ECMA-402 when neither a date nor time style is specified. In that case, these patterns are searched for a best match.	2021-11-29 22:48:46 +00:00
Timothy Flynn	287d43f4be	LibUnicode: Hard-code an alias from the Gregorian calendar to Gregory This alias exists because the name "Gregorian" is too long to be used in a locale identifier, i.e. "en-u-ca-gregorian" is invalid. Aliases for calendars are defined here: https://github.com/unicode-org/cldr-json/blob/main/cldr-json/cldr-bcp47/bcp47/calendar.json However, CLDR version 40 neglected to actually include the cldr-bcp47 package in its release, so we don't have access to this data. So for now hard-code this alias so that JavaScript can actually access it. See: https://unicode-org.atlassian.net/browse/CLDR-15158	2021-11-29 22:48:46 +00:00
Timothy Flynn	f471ecdbe9	LibUnicode: Parse and generate date, time, and date-time format patterns	2021-11-29 22:48:46 +00:00
Timothy Flynn	5c57341672	LibUnicode: Create a nearly empty generator for date-time formatting Similar to number formatting, the data for date-time formatting will be located in its own generated file. This extracts the cldr-dates package from the CLDR and sets up the generator plumbing to create the date-time data files.	2021-11-29 22:48:46 +00:00
Timothy Flynn	914675e826	LibJS+LibUnicode: Separate number formatting methods from Locale.h Currently, we generate separate data files for locale and number format related tables/methods, but provide public accessors for all of the data in one Locale.h file. Rather than continuing this trend for date-time, relative time, etc. formatting, it's a bit easier to reason about if the public accessors are also in separate files.	2021-11-29 22:48:46 +00:00
Hendiadyoin1	7a27ecc135	Tests: Add a simple LibGL render-test At the moment we just check if we can render a simple triangle, we do not yet actually test if the image is indeed the triangle we wanted. This test also outputs the rendered image when GL_DEBUG is enabled to a file called "picture.bmp" for manual verification. Co-authored-by: sunverwerth <s.unverwerth@serenityos.org>	2021-11-29 23:17:05 +03:30
Hendiadyoin1	3a4dd5ff87	Lagom: Add LibGL to the libraries	2021-11-29 23:17:05 +03:30
Hendiadyoin1	849089c406	Lagom: Disable implicit-const-int-float-conversion warnings	2021-11-29 23:17:05 +03:30
Andreas Kling	cb9cac4e40	LibIPC+IPCCompiler+AK: Make IPC value decoders return ErrorOr<void> This allows us to use TRY() in decoding helpers, leading to a nice reduction in line count.	2021-11-28 23:14:19 +01:00
Andreas Kling	8d76eb773f	LibIPC: Make IPC::Connection::post_message() return ErrorOr	2021-11-28 23:14:18 +01:00
kleines Filmröllchen	96d02a3e75	LibAudio: New error propagation API in Loader and Buffer Previously, a libc-like out-of-line error information was used in the loader and its plugins. Now, all functions that may fail to do their job return some sort of Result. The universally-used error type is the new LoaderError, which can contain information about the general error category (such as file format, I/O, unimplemented features), an error description, and location information, such as file index or sample index. Additionally, the loader plugins try to do as little work as possible in their constructors. Right after being constructed, a user should call initialize() and check the errors returned from there. (This is done transparently by Loader itself.) If a constructor caused an error, the call to initialize should check and return it immediately. This opportunity was used to rework a lot of the internal error propagation in both loader classes, especially FlacLoader. Therefore, a couple of other refactorings may have sneaked in as well. The adoption of LibAudio users is minimal. Piano's adoption is not important, as the code will receive major refactoring in the near future anyways. SoundPlayer's adoption is also less important, as changes to refactor it are in the works as well. aplay's adoption is the best and may serve as an example for other users. It also includes new buffering behavior. Buffer also gets some attention, making it OOM-safe and thereby also propagating its errors to the user.	2021-11-28 13:33:51 -08:00
Ben Wiederhake	7ba7668fbb	Meta: Allow overlong 'fixup!' commit titles in pre-commit hook	2021-11-28 11:49:13 -08:00
Daniel Bertalan	f29f9762a2	Meta: Copy libstdc++ into the disk image With this, we can now compile C++ programs with the LLVM port without having to jump through hoops to build libc++ because it can't be cross-compiled with our GNU toolchain.	2021-11-28 09:38:57 -08:00
Daniel Bertalan	c4707ed0d9	Meta: Copy libc++ headers into the disk image If we do this, the LLVM port's Clang will pick up these paths, so we won't have to compile libc++ twice. This does increase the size of _disk_image by 5 MB, but that shouldn't be a problem.	2021-11-28 09:38:57 -08:00
Jelle Raaijmakers	689ad0752c	Kernel: Add AC97_DEBUG macro	2021-11-28 19:26:22 +02:00
Itamar	58746a08a1	CMake: Also install the source files of userland programs Previously, we only copied the source files of libraries to `/usr/src/serenity`. We now also install the source files of userland programs.	2021-11-26 11:17:11 -08:00
Itamar	3b5eeb7fdd	CMake: Simplify serenity_install_sources by inferring installation path The serenity_install_sources function now infers the path under `/usr/src/serenity` in which to install the source files according to the relative path of the source files in the repository. For example `Userland/Libraries/LibGUI/Widget.h` gets installed at `/usr/src/serenity/Userland/Libraries/LibGUI/Widget.h`. This fixes cases where the source files of libraries are not under `Userland/Libraries` (for example LibShell & LibLanguageServer).	2021-11-26 11:17:11 -08:00
Timothy Flynn	0aa3e5c2ea	LibUnicode: Port generator utility methods to ErrorOr Most of these were VERIFY-ing for success, but propagating an error message up to serenity_main() is much nicer than just a SIGABRT.	2021-11-23 22:58:05 +01:00
Timothy Flynn	55e0b91d8d	LibUnicode: Port GenerateUnicodeNumberFormat to ErrorOr and LibMain	2021-11-23 22:58:05 +01:00
Timothy Flynn	8c5f19f7c8	LibUnicode: Port GenerateUnicodeLocale to ErrorOr and LibMain	2021-11-23 22:58:05 +01:00
Timothy Flynn	88dbf3c348	LibUnicode: Port GenerateUnicodeData to ErrorOr and LibMain Also store command line arguments as StringViews rather than pointers.	2021-11-23 22:58:05 +01:00
Timothy Flynn	4c4b752ab8	Meta: Allow lagom_tool invocations to specify libraries to link	2021-11-23 22:58:05 +01:00
Timothy Flynn	a2ea704d21	Meta: Define LagomMain outside of the BUILD_LAGOM branch This allows code generators to use LagomMain. Otherwise, during CI, they are built during the superbuild without BUILD_LAGOM=ON.	2021-11-23 22:58:05 +01:00
Timothy Flynn	1539ed12f1	LibUnicode: Functionalize the Unicode generator CMake commands Makes it a bit easier to add a new generator.	2021-11-23 22:58:05 +01:00
Timothy Flynn	0e80c1ee6b	LibUnicode: Invoke lagom_tool() with SOURCES inline	2021-11-23 22:58:05 +01:00
Jelle Raaijmakers	69a7ffa174	Meta: Increase PulseAudio timer period to 2ms This seems to prevent crackling audio when starting up Qemu whenever there is audio already playing.	2021-11-23 10:35:00 +01:00
Jelle Raaijmakers	9d8a566d83	Meta: Use 1ms timer period for Qemu Pulse Audio backend The default seems to be 10ms and can result in a lot of crackling noises in the output. A value of 1ms works well on my machine.	2021-11-23 10:06:24 +01:00
Jelle Raaijmakers	03329bbfa5	Meta: Use AC97 device in Qemu by default	2021-11-23 10:06:24 +01:00
Linus Groh	cfecfbb214	js: Port to LibMain :^) This wasn't particularly difficult, and there's not much use for the nicer interface yet either. While unveil() is of limited use in js(1) as it should be able to open arbitrary files, I feel like we should be able to add a pledge() call.	2021-11-22 23:07:43 +01:00
Linus Groh	ba0f89a4d1	Lagom: Add LibMain as a lagom_lib()	2021-11-22 23:07:43 +01:00
Andreas Kling	5a79c69b02	LibGfx: Make ImageDecoderPlugin::frame() return ErrorOr<> This is a first step towards better error propagation from image codecs.	2021-11-21 20:22:48 +01:00
Ben Wiederhake	b06b54772e	Meta+LibUnicode: Provide code point names through library	2021-11-20 00:31:55 +01:00
Timothy Flynn	93ee922027	LibUnicode: Support locales-without-script aliases for ECMA-402 As noted by ECMA-402, if a supported locale contains all of a language, script, and region subtag, then the implementation must also support the locale without the script subtag. The most complicated example of this is the zh-TW locale. The list of locales in the CLDR database does not include zh-TW or its maximized zh-Hant-TW variant. Instead, it includes the zh-Hant locale. However, zh-Hant-TW is listed in the default-content locale list in the cldr-core package. This defines an alias from zh-Hant-TW to zh-Hant. We must then also support the zh-Hant-TW alias without the script subtag: zh-TW. This transitively maps zh-TW to zh-Hant, which is a case quite heavily tested by test262.	2021-11-19 11:45:35 +01:00
Timothy Flynn	4b535ce1c8	LibUnicode: Stop passing the cldr-core package to UnicodeNumberFormat This is no longer needed now that this generator isn't parsing the default-content locales.	2021-11-19 11:45:35 +01:00
Timothy Flynn	a13fa15a30	LibUnicode: Generate default-content locales as aliases Previously, we were just copying the locale data into default-content locales (for example, copying the "en" data into "en-US"). Instead, we can just define the default-content locales as aliases to their main locales.	2021-11-19 11:45:35 +01:00
Timothy Flynn	9d1519e21c	LibUnicode: Move GenerateUnicodeData's Alias struct to generator header This will be used for locale aliases as well. Also rename the "property" field in this struct to "name", as it no longer is only used for property aliases.	2021-11-19 11:45:35 +01:00
Andreas Kling	2b866e3c9b	LibGfx: Remove ImageDecoderPlugin::bitmap() in favor of frame(index) To encourage proper support for multi-frame images throughout the system, get rid of the single-frame convenience bitmap() API.	2021-11-18 21:11:30 +01:00
Andreas Kling	750f1d580a	Fuzzers: Use ImageDecoderPlugin::frame() in image decoder fuzzers Let's work towards getting rid of the first-frame-only bitmap() API.	2021-11-18 21:11:30 +01:00
Andreas Kling	587f9af960	AK: Make JSON parser return ErrorOr<JsonValue> (instead of Optional) Also add slightly richer parse errors now that we can include a string literal with returned errors. This will allow us to use TRY() when working with JSON data.	2021-11-17 00:21:10 +01:00
Timothy Flynn	cafb717486	LibUnicode: Parse and generate CLDR unit data for Intl.NumberFormat The units data is in another CLDR package, cldr-units.	2021-11-16 23:14:09 +00:00
Timothy Flynn	c24a350a18	LibUnicode: Ignore U+200F when parsing format identifiers Noticed this while implementing multiple identifier support. We were errantly parsing U+200F as a lone identifier in some Hebrew formats.	2021-11-16 23:14:09 +00:00
Timothy Flynn	04b8b87c17	LibJS+LibUnicode: Support multiple identifiers within format pattern This wasn't the case for compact patterns, but unit patterns can contain multiple (up to 2, really) identifiers that must each be recognized by LibJS. Each generated NumberFormat object now stores an array of identifiers parsed. The format pattern itself is encoded with the index into this array for that identifier, e.g. the compact format string "0K" will become "{number}{compactIdentifier:0}".	2021-11-16 23:14:09 +00:00
Timothy Flynn	3b68370212	LibJS+LibUnicode: Rename the generated compact_identifier to identifier This field is currently used to store the StringView into the compact name/symbol in the format string. Units will need to store a similar field, so rename the field to be more generic, and extract the parser for it.	2021-11-16 23:14:09 +00:00
Timothy Flynn	1f546476d5	LibJS+LibUnicode: Fix computation of compact pattern exponents The compact scale of each formatting rule was precomputed in commit: `be69eae651` Using the formula: compact scale = magnitude - pattern scale This computation was off-by-one. For example, consider the format key "10000-count-one", which maps to "00 thousand" in en-US. What we are really after is the exponent that best represents the string "thousand" for values greater than 10000 and less than 100000 (the next format key). We were previously doing: log10(10000) - "00 thousand".count("0") = 2 Which clearly isn't what we want. Instead, if we do: log10(10000) + 1 - "00 thousand".count("0") = 3 We get the correct exponent for each format key for each locale. This commit also renames the generated variable from "compact_scale" to "exponent" to match the terminology used in ECMA-402.	2021-11-16 00:56:55 +00:00
Timothy Flynn	48d5684780	LibUnicode: Parse compact identifiers and replace them with a format key For example, in en-US, the decimal, long compact pattern for numbers between 10,000 and 100,000 is "00 thousand". In that pattern, "thousand" is the compact identifier, and the generated format pattern is now "{number} {compactIdentifier}". This also generates that identifier as its own field in the NumberFormat structure.	2021-11-16 00:56:55 +00:00
Timothy Flynn	30fbb7d9cd	LibUnicode: Parse and generate scientific formatting rules	2021-11-14 17:00:35 +00:00
Timothy Flynn	3645f6a0fc	LibUnicode: Fix typo in percent format parser Just by sheer luck this had no actual effect because the decimal format prefix has the same length as the percent format prefix.	2021-11-14 17:00:35 +00:00
Timothy Flynn	3b7f5af042	LibUnicode: Generate primary and secondary number grouping sizes Most locales have a single grouping size (the number of integer digits to be written before inserting a grouping separator). However some have a primary and secondary size. We parse the primary size as the size used for the least significant integer digits, and the secondary size for the most significant.	2021-11-14 10:35:19 +00:00
Timothy Flynn	c65dea64bd	LibJS+LibUnicode: Don't remove {currency} keys in GetNumberFormatPattern In order to implement Intl.NumberFormat.prototype.formatToParts, do not replace {currency} keys in the format pattern before ECMA-402 tells us to. Otherwise, the array returned by formatToParts will not contain the expected currency key. Early replacement was done to avoid resolving the currency display more than once, as it involves a couple of round trips to search through LibUnicode data. So this adds a non-standard method to NumberFormat to do this resolution and cache the result. Another side effect of this change is that LibUnicode must replace unit format patterns of the form "{0} {1}" during code generation. These were previously skipped during code generation because LibJS would just replace the keys with the currency display at runtime. But now that the currency display injection is delayed, any {0} or {1} keys in the format pattern will cause PartitionNumberPattern to abort.	2021-11-13 19:01:25 +00:00
Timothy Flynn	a701ed52fc	LibJS+LibUnicode: Fully implement currency number formatting Currencies are a bit strange; the layout of currency data in the CLDR is not particularly compatible with what ECMA-402 expects. For example, the currency format in the "en" and "ar" locales for the Latin script are: en: "¤#,##0.00" ar: "¤\u00A0#,##0.00" Note how the "ar" locale has a non-breaking space after the currency symbol (¤), but "en" does not. This does not mean that this space will appear in the "ar"-formatted string, nor does it mean that a space won't appear in the "en"-formatted string. This is a runtime decision based on the currency display chosen by the user ("$" vs. "USD" vs. "US dollar") and other rules in the Unicode TR-35 spec. ECMA-402 shies away from the nuances here with "implementation-defined" steps. LibUnicode will store the data parsed from the CLDR however it is presented; making decisions about spacing, etc. will occur at runtime based on user input.	2021-11-13 11:52:45 +00:00
Timothy Flynn	e9493a2cd5	LibUnicode: Ensure UnicodeNumberFormat is aware of default content For example, there isn't a unique set of data for the en-US locale; rather, it defaults to the data for the en locale. See this commit for much more detail: `357c97dfa8`	2021-11-13 11:52:45 +00:00
Timothy Flynn	9421d5c0cf	LibUnicode: Generate currency unit-pattern number formats These are used when formatting a number as currency with a display option of "name" (e.g. for USD, the name is "US Dollars" in en-US). These patterns appear in the CLDR in a different manner than other number formats that are pluralized. They are of the form "{0} {1}", therefore do not undergo subpattern replacements.	2021-11-13 11:52:45 +00:00
Timothy Flynn	39e031c4dd	LibJS+LibUnicode: Generate all styles of currency localizations Currently, LibUnicode is only parsing and generating the "long" style of currency display names. However, the CLDR contains "short" and "narrow" forms as well that need to be handled. Parse these, and update LibJS to actually respect the "style" option provided by the user for displaying currencies with Intl.DisplayNames. Note: There are some discrepancies between the engines on how style is handled. In particular, running: new Intl.DisplayNames('en', {type:'currency', style:'narrow'}).of('usd') Gives: SpiderMonkey: "USD" V8: "US Dollar" LibJS: "$" And running: new Intl.DisplayNames('en', {type:'currency', style:'short'}).of('usd') Gives: SpiderMonkey: "$" V8: "US Dollar" LibJS: "$" My best guess is V8 isn't handling style, and just returning the long form (which is what LibJS did before this commit). And SpiderMonkey can handle some styles, but if they don't have a value for the requested style, they fall back to the canonicalized code passed into of().	2021-11-13 11:52:45 +00:00
Timothy Flynn	6cfd63e5bd	LibUnicode: Parse numbers in number formats a bit more leniently The parser was previously expecting number sections within a pattern to start with "#", but they may also begin with "0".	2021-11-13 11:52:45 +00:00
Daniel Bertalan	fe1726521a	Meta: Resolve cyclic dependency between LibPthread and libc++ libc++ uses a Pthread condition variable in one of its initialization functions. This means that Pthread forwarding has to be set up in LibC before libc++ can be initialized. Also, because LibPthread is written in C++, (at least some) parts of the C++ standard library have to be linked against it. This is a circular dependency, which means that the order in which these two libraries' initialization functions are called is undefined. In some cases, libc++ will come first, which will then trigger an assert due to the missing Pthread forwarding. This issue isn't necessarily unique to LibPthread, as all libraries that libc++ depends on exhibit the same circular dependency issue. The reason why this issue didn't affect the GNU toolchain is that libstdc++ is always linked statically. If we were to change that, I believe that we would run into the same issue.	2021-11-13 11:15:33 +00:00
Andreas Kling	b189c88ec2	Fuzzers: Use ImageDecoders instead of load_FORMAT_from_memory() wrappers	2021-11-13 00:55:07 +01:00
Timothy Flynn	1f2ac0ab41	LibUnicode: Move number formatting code generator to UnicodeNumberFormat	2021-11-12 20:46:38 +00:00
Timothy Flynn	04e6b43f05	LibUnicode: Move (soon-to-be) common code out of GenerateUnicodeLocale The data used for number formatting is going to grow quite a bit when the cldr-units package is parsed. To prevent the generated UnicodeLocale file from growing outrageously large, the number formatting data can go into its own file. To prepare for this, move code that will be common between the generators for UnicodeLocale and UnicodeNumberFormat to the utility header.	2021-11-12 20:46:38 +00:00
Ali Mohammad Pur	c08bfd450b	Meta: Update the gdb script for the new RefPtr layout	2021-11-12 13:01:59 +00:00
Timothy Flynn	be69eae651	LibUnicode: Precompute the compact scale of each number formatting rule This will be needed for the ComputeExponentForMagnitude AO for compact formatting, namely step 5b: Let exponent be an implementation- and locale-dependent (ILD) integer by which to scale a number of the given magnitude in compact notation for the current locale.	2021-11-12 09:17:08 +00:00
Timothy Flynn	230b133ee3	LibUnicode: Parse number formats into zero/positive/negative patterns A number formatting pattern in the CLDR contains one or two entries, delimited by a semi-colon. Previously, LibUnicode was just storing the entire pattern as one string. This changes the generator to split the pattern on that delimiter and generate the 3 unique patterns expected by ECMA-402. The rules for generating the 3 patterns are as follows: * If the pattern contains 1 entry, it is the zero pattern. The positive pattern is the zero pattern prepended with {plusSign}. The negative pattern is the zero pattern prepended with {minusSign}. * If the pattern contains 2 entries, the first is the zero pattern, and the second is the negative pattern. The positive pattern is the zero pattern prepended with {plusSign}.	2021-11-12 09:17:08 +00:00
Timothy Flynn	1244ebcd4f	LibUnicode: Parse and generate standard accounting formatting rules Also known as "currency-accounting" in some CLDR documentation.	2021-11-12 09:17:08 +00:00
Timothy Flynn	967afc1b84	LibUnicode: Parse and generate standard currency formatting rules	2021-11-12 09:17:08 +00:00
Timothy Flynn	bffd73e0d4	LibUnicode: Parse and generate standard decimal formatting rules	2021-11-12 09:17:08 +00:00
Timothy Flynn	feb8c22a62	LibUnicode: Parse and generate standard percentage formatting rules	2021-11-12 09:17:08 +00:00
Timothy Flynn	4317a1b552	LibUnicode: Parse and generate compact currency formatting rules	2021-11-12 09:17:08 +00:00
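
The rules listed in commit `230b133ee3` for deriving the zero, positive, and negative patterns can be sketched in a few lines of Python. This is only an illustration of the rules as the commit message states them: the function name `split_number_pattern` and the returned dictionary are hypothetical, and the real LibUnicode generator is written in C++ and does additional parsing that is not shown here.

```python
def split_number_pattern(pattern: str) -> dict[str, str]:
    """Split a CLDR number pattern into zero/positive/negative patterns.

    Hypothetical helper; follows the rules in commit 230b133ee3:
    - one entry: it is the zero pattern; positive and negative are the
      zero pattern prepended with {plusSign} and {minusSign}.
    - two entries: the first is the zero pattern, the second is the
      negative pattern; positive is still {plusSign} + zero.
    """
    entries = pattern.split(";")
    if len(entries) not in (1, 2):
        raise ValueError(f"expected one or two entries, got {len(entries)}")

    zero = entries[0]
    positive = "{plusSign}" + zero
    # Use the explicit negative entry when present, otherwise derive it.
    negative = entries[1] if len(entries) == 2 else "{minusSign}" + zero

    return {"zero": zero, "positive": positive, "negative": negative}


if __name__ == "__main__":
    print(split_number_pattern("#,##0.00"))
    print(split_number_pattern("#,##0.00;(#,##0.00)"))
```

The second call shows an accounting-style pattern whose second entry supplies the negative form directly, so only the positive pattern is derived.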

1023 commits
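
The off-by-one fix described in commit `1f546476d5` can be checked with a small Python sketch. It assumes format keys of the form `"10000-count-one"` where the leading number is a power of ten, as in the commit's example; the function name `compact_exponent` is hypothetical and is not the generator's API.

```python
def compact_exponent(format_key: str, pattern: str) -> int:
    """Compute the exponent for a compact format rule.

    Hypothetical helper implementing the corrected formula from commit
    1f546476d5: log10(magnitude) + 1 - (number of "0" in the pattern).
    """
    magnitude = int(format_key.split("-")[0])
    # For a power of ten, the decimal digit count equals log10(magnitude) + 1.
    # Counting digits avoids floating-point rounding in log10.
    digits = len(str(magnitude))
    return digits - pattern.count("0")


if __name__ == "__main__":
    # The example from the commit message: "00 thousand" should map to 3.
    print(compact_exponent("10000-count-one", "00 thousand"))
```

With the earlier formula, `log10(10000) - 2`, the same input yields 2, which is the error the commit describes.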