mirror of
https://github.com/LadybirdBrowser/ladybird.git
synced 2024-11-27 01:50:24 +00:00
0652cc48c0
We currently produce a single table for all categories of code point properties (GeneralCategory, Script, etc.). Each row contains a field indicating the range of code points to which that property applies. At runtime, we then do a binary search through that table to decide if a code point has a property.

This changes our approach to generate a 2-stage lookup table for each of those categories. There is an in-depth explanation of these tables above the new `create_code_point_tables` method. The end effect is that code point property lookup is reduced from a binary search to constant-time array lookups.

In total, this change:

* Increases the size of libunicode.so from 2.7 MB to 2.9 MB.
* Reduces the runtime of the new benchmark test case added here from 3.576s to 1.020s (a 3.5x speedup).
* Reduces the share of runtime spent checking if a code point has a word break property from ~81% to ~56%, in a profile of resizing a TextEditor window with a 3MB file open.

||
---|---|---|
.. | ||
CMakeLists.txt | ||
TestEmoji.cpp | ||
TestSegmentation.cpp | ||
TestUnicodeCharacterTypes.cpp | ||
TestUnicodeNormalization.cpp |
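
The 2-stage lookup described in the commit message above can be illustrated with a minimal sketch. This is not the code generated by `create_code_point_tables`. Every name here (`TwoStageTable`, `build_table`, `BLOCK_BITS`) is hypothetical, the 256-entry block size is an assumption, and the table stores a single boolean per code point rather than a property value. The idea is that the first stage maps each block of code points to a deduplicated block in the second stage, so a lookup is two array indexing operations regardless of how many ranges the property covers.

```cpp
#include <array>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical names and sizes; not the generated LibUnicode tables.
constexpr std::uint32_t BLOCK_BITS = 8;
constexpr std::uint32_t BLOCK_SIZE = 1u << BLOCK_BITS; // 256 code points per block (assumed)
constexpr std::uint32_t MAX_CODE_POINT = 0x10FFFF;

using Block = std::array<bool, BLOCK_SIZE>;

struct TwoStageTable {
    // Stage 1: one entry per block of code points, holding an index into stage 2.
    std::vector<std::uint16_t> stage1;
    // Stage 2: the unique blocks; identical blocks are stored only once.
    std::vector<Block> stage2;

    // Two array lookups, independent of the number of ranges in the property.
    bool contains(std::uint32_t code_point) const
    {
        if (code_point > MAX_CODE_POINT)
            return false;
        auto block_index = stage1[code_point >> BLOCK_BITS];
        return stage2[block_index][code_point & (BLOCK_SIZE - 1)];
    }
};

// Builds the tables from any predicate answering "does this code point have the property?".
template<typename Predicate>
TwoStageTable build_table(Predicate has_property)
{
    TwoStageTable table;
    std::map<Block, std::uint16_t> seen; // block contents -> index in stage 2

    for (std::uint32_t start = 0; start <= MAX_CODE_POINT; start += BLOCK_SIZE) {
        Block block {};
        for (std::uint32_t i = 0; i < BLOCK_SIZE; ++i)
            block[i] = has_property(start + i);

        // Reuse an existing identical block if there is one; otherwise append it.
        auto next_index = static_cast<std::uint16_t>(table.stage2.size());
        auto [it, inserted] = seen.try_emplace(block, next_index);
        if (inserted)
            table.stage2.push_back(block);
        table.stage1.push_back(it->second);
    }
    return table;
}
```

Under these assumptions, calling `build_table` with a predicate and then `table.contains(code_point)` replaces a binary search over ranges. The deduplication in stage 2 is what keeps the size cost modest: most blocks of the code point space are identical for any given property, so they share one stored block.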