ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2024-11-22 15:40:19 +00:00

Timothy Flynn 7e6ad172a4 LibUnicode: Support code point names that apply to ranges of code points		2021-11-30 11:24:02 +01:00

For example, consider the following adjacent entries in UnicodeData.txt:

    3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;
    4DBF;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;

Our current implementation would assign the display name "CJK Ideograph
Extension A" to code points U+3400 & U+4DBF, but not to the code points
in between. Not only should those code points be assigned a name, but
the Unicode spec also has formatting rules on what the names should be
(the names for these ranged code points are not as they appear in
UnicodeData.txt).

The spec also defines names for code point ranges that actually are
listed individually in UnicodeData.txt. For example:

    2F800;CJK COMPATIBILITY IDEOGRAPH-2F800;Lo;0;L;4E3D;;;;N;;;;;
    2F801;CJK COMPATIBILITY IDEOGRAPH-2F801;Lo;0;L;4E38;;;;N;;;;;
    2F802;CJK COMPATIBILITY IDEOGRAPH-2F802;Lo;0;L;4E41;;;;N;;;;;

Code points are only coalesced into a range if all fields after the name
are equivalent. Our parser will insert the range and its name formatting
pattern when it comes across the first code point in that range, then
ignore other code points in that range.

This reduces the number of names we generated by nearly 2,000.
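The range handling described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the LibUnicode generator itself: `CodePointRange`, `collect_ranges`, and `format_ranged_name` are hypothetical names, only the `<..., First>` / `<..., Last>` pairs are handled, and the "CJK UNIFIED IDEOGRAPH-" prefix is taken from the Unicode name derivation rules rather than from this commit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type for illustration; not LibUnicode's actual data structure.
struct CodePointRange {
    uint32_t first;
    uint32_t last;
    std::string label; // e.g. "CJK Ideograph Extension A"
};

static bool ends_with(std::string const& text, std::string const& suffix)
{
    return text.size() >= suffix.size()
        && text.compare(text.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Extracts the code point (field 0) and the name (field 1) from one
// semicolon-separated UnicodeData.txt line.
static std::pair<uint32_t, std::string> parse_line(std::string const& line)
{
    auto first = line.find(';');
    assert(first != std::string::npos);
    auto second = line.find(';', first + 1);
    assert(second != std::string::npos);

    auto code_point = static_cast<uint32_t>(std::stoul(line.substr(0, first), nullptr, 16));
    return { code_point, line.substr(first + 1, second - first - 1) };
}

// Opens a range at each "<..., First>" entry and closes it at the matching
// "<..., Last>" entry, so every code point in between is covered.
std::vector<CodePointRange> collect_ranges(std::vector<std::string> const& lines)
{
    std::string const first_suffix = ", First>";
    std::string const last_suffix = ", Last>";

    std::vector<CodePointRange> ranges;
    std::optional<CodePointRange> pending;

    for (auto const& line : lines) {
        auto [code_point, name] = parse_line(line);

        if (ends_with(name, first_suffix)) {
            // Strip the leading '<' and the trailing ", First>".
            auto label = name.substr(1, name.size() - 1 - first_suffix.size());
            pending = CodePointRange { code_point, code_point, label };
        } else if (pending.has_value() && ends_with(name, last_suffix)) {
            pending->last = code_point;
            ranges.push_back(*pending);
            pending.reset();
        }
    }

    return ranges;
}

// Builds a per-code-point name from a prefix plus the code point in
// uppercase hexadecimal (at least four digits).
std::string format_ranged_name(std::string const& prefix, uint32_t code_point)
{
    char hex[16];
    std::snprintf(hex, sizeof(hex), "%04X", static_cast<unsigned>(code_point));
    return prefix + hex;
}
```

Feeding the two Extension A lines quoted above into `collect_ranges` yields a single range from U+3400 to U+4DBF, and `format_ranged_name("CJK UNIFIED IDEOGRAPH-", 0x3401)` produces a name for a code point that UnicodeData.txt never lists individually. Storing one range plus a pattern, instead of one string per code point, is what lets a generator drop the individually listed names.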
CMakeLists.txt	Tests: Remove all file(GLOB) from CMakeLists in Tests	2021-09-02 09:08:23 +02:00
TestUnicodeCharacterTypes.cpp	LibUnicode: Support code point names that apply to ranges of code points	2021-11-30 11:24:02 +01:00
TestUnicodeLocale.cpp	LibUnicode: Support locales-without-script aliases for ECMA-402	2021-11-19 11:45:35 +01:00