22 commits

Author SHA1 Message Date
Tim Schumacher
8032724574 CodeGenerators: Ensure that we always print the entire generated output 2023-03-13 15:16:20 +00:00
Tim Schumacher
d5871f5717 AK: Rename Stream::{read,write} to Stream::{read_some,write_some}
Similar to POSIX read, the basic read and write functions of AK::Stream
do not have a lower limit of how much data they read or write (apart
from "none at all").

Rename the functions to "read some [data]" and "write some [data]" (with
"data" being omitted, since everything here is reading and writing data)
to make them sufficiently distinct from the functions that ensure the
entire buffer is used (which should be the go-to functions for most
usages).

No functional changes, just a lot of new FIXMEs.
2023-03-13 15:16:20 +00:00
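The distinction drawn in the message above can be sketched in standard C++ rather than AK. This is only an illustration: `ChunkedStream` and `read_fully` are hypothetical names, not part of AK::Stream, and the chunk limit merely simulates a source that returns short reads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical in-memory stream that hands out at most `chunk` bytes per call,
// mimicking a source that may return fewer bytes than requested.
class ChunkedStream {
public:
    ChunkedStream(std::vector<unsigned char> data, size_t chunk)
        : m_data(std::move(data))
        , m_chunk(chunk)
    {
    }

    // "Read some": copies up to `size` bytes, possibly fewer. Returns 0 at end of data.
    size_t read_some(unsigned char* buffer, size_t size)
    {
        size_t n = std::min({ size, m_chunk, m_data.size() - m_offset });
        if (n > 0)
            std::memcpy(buffer, m_data.data() + m_offset, n);
        m_offset += n;
        return n;
    }

private:
    std::vector<unsigned char> m_data;
    size_t m_chunk { 0 };
    size_t m_offset { 0 };
};

// Hypothetical helper: keeps calling read_some() until the buffer is full.
// Returns false if the stream ends before `size` bytes were read.
bool read_fully(ChunkedStream& stream, unsigned char* buffer, size_t size)
{
    size_t filled = 0;
    while (filled < size) {
        size_t n = stream.read_some(buffer + filled, size - filled);
        if (n == 0)
            return false;
        filled += n;
    }
    return true;
}
```

A caller that needs exactly `size` bytes would use the looping helper; calling `read_some` directly is only appropriate when a short read is acceptable.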
Timothy Flynn
ca2b030336 LibUnicode: Use binary search for lookups into the generated emoji data
This sorts the array of generated emoji data by code point (first by
code point length, then by code point value). This lets us use a binary
search to find emoji data, rather than the current linear search.

In a profile of scrolling around /home/anon/Documents/emoji.txt, this
reduces the runtime of Gfx::Emoji::emoji_for_code_points from 69.03% to
28.42%. Within that, Unicode::find_emoji_for_code_points reduces from
28.42% to just 1.95%.
2023-03-05 16:44:20 +01:00
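A minimal sketch of the ordering and lookup described above, written against the standard library rather than AK. The `EmojiEntry` layout and the function names are assumptions for illustration; the actual generated data structure is not shown in this log.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical record; stands in for one row of the generated emoji data.
struct EmojiEntry {
    std::vector<uint32_t> code_points;
    std::string_view name;
};

// Order first by number of code points, then by code point values.
static bool code_points_less(std::vector<uint32_t> const& a, std::vector<uint32_t> const& b)
{
    if (a.size() != b.size())
        return a.size() < b.size();
    return std::lexicographical_compare(a.begin(), a.end(), b.begin(), b.end());
}

// Done once (in the real change, at generation time) so lookups can bisect.
static void sort_emoji(std::vector<EmojiEntry>& table)
{
    std::sort(table.begin(), table.end(), [](EmojiEntry const& lhs, EmojiEntry const& rhs) {
        return code_points_less(lhs.code_points, rhs.code_points);
    });
}

// Binary search over the sorted table; returns nullptr when there is no match.
static EmojiEntry const* find_emoji(std::vector<EmojiEntry> const& sorted_table, std::vector<uint32_t> const& needle)
{
    auto it = std::lower_bound(sorted_table.begin(), sorted_table.end(), needle,
        [](EmojiEntry const& entry, std::vector<uint32_t> const& value) {
            return code_points_less(entry.code_points, value);
        });
    if (it == sorted_table.end() || code_points_less(needle, it->code_points))
        return nullptr;
    return &*it;
}
```

Comparing lengths before values keeps the comparator cheap for the common case where two sequences differ in length, and the lookup cost falls from linear to logarithmic in the number of entries.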
Timothy Flynn
03f32bdf86 LibUnicode: Validate that all emoji images in /res/emoji actually exist
This raises a compile error if an emoji image has not been added to
e.g. emoji-serenity.txt, or if its code points are not correct.
2023-03-03 17:09:58 +00:00
Timothy Flynn
fd1fbad1d2 LibGfx+LibUnicode: Support specifying the path to search for emoji
Similar to the FontDatabase, this will be needed for Ladybird to find
emoji images. We now generate just the file name of the emoji image in
LibUnicode, and look for that file in the specified path (defaulting to
/res/emoji) at runtime.
2023-03-01 14:54:16 +00:00
MacDue
01fa3bb788 LibUnicode: Propagate try_append() errors when building emoji data 2023-02-24 22:18:25 +01:00
Timothy Flynn
8c38d46c1a LibUnicode: Generate the path to emoji images alongside emoji data
This will provide for quicker emoji lookups, rather than having to
discover and allocate these paths at runtime before we find out if they
even exist.
2023-02-24 19:48:47 +01:00
Tim Schumacher
874c7bba28 LibCore: Remove Stream.h 2023-02-13 00:50:07 +00:00
Tim Schumacher
606a3982f3 LibCore: Move Stream-based file into the Core namespace 2023-02-13 00:50:07 +00:00
Tim Schumacher
d43a7eae54 LibCore: Rename File to DeprecatedFile
As usual, this removes many unused includes and moves used includes
further down the chain.
2023-02-13 00:50:07 +00:00
MacDue
63b11030f0 Everywhere: Use ReadonlySpan<T> instead of Span<T const> 2023-02-08 19:15:45 +00:00
Linus Groh
6e7459322d AK: Remove StringBuilder::build() in favor of to_deprecated_string()
Having an alias function that only wraps another one is silly, and
keeping the more obvious name should flush out more uses of deprecated
strings.
No behavior change.
2023-01-27 20:38:49 +00:00
Tim Schumacher
2fc2025f49 LibCore: Move Core::Stream::File::exists() to Core::File
`Core::Stream::File` shouldn't hold any utility methods that are
unrelated to constructing a `Core::Stream`, so let's just replace the
existing `Core::File::exists` with the nicer looking implementation.
2022-12-08 12:52:14 +00:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
Timothy Flynn
b2164ad979 Meta: Do not hard-code index types for UCD/CLDR/TZDB code generators
Hand-picking the smallest index type that fits a particular generated
array started with commit 3ad159537e. This
was to reduce the size of the generated library.

Since then, the number of types using UniqueStorage has grown a ton,
creating a long list of types for which index types are manually picked.
When a new UCD/CLDR/TZDB is released, and the current index type no
longer fits the generated data, we fail to generate. Tracking down which
index caused the failure is a pretty annoying process.

Instead, we can just use size_t while in the generators themselves, then
automatically pick the size needed for the generated code.
2022-11-18 17:00:51 +00:00
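The idea of choosing the index width automatically can be sketched as below. This is an assumption-laden illustration, not the generator's code: `smallest_index_type_for` is a hypothetical helper, and it returns standard type names where the real generators may emit different spellings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string_view>

// Hypothetical helper: given the largest index the generated data needs to
// store, return the name of the narrowest unsigned type that can hold it.
inline std::string_view smallest_index_type_for(size_t largest_index)
{
    if (largest_index <= std::numeric_limits<uint8_t>::max())
        return "uint8_t";
    if (largest_index <= std::numeric_limits<uint16_t>::max())
        return "uint16_t";
    if (largest_index <= std::numeric_limits<uint32_t>::max())
        return "uint32_t";
    return "uint64_t";
}
```

A generator would track indices as `size_t` while collecting data, then call a helper like this once per table when writing out the declarations, so a new data release that grows a table only changes the emitted type.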
Gunnar Beutner
4e406b0730 Meta+LibUnicode: Avoid relocations for emoji data
Previously each emoji had its own symbol in the library which was then
referred to by another symbol. This caused thousands of avoidable data
relocations at load time.

This saves about 122kB RAM for each process which uses LibUnicode.
2022-11-06 17:34:06 +01:00
Timothy Flynn
b820b9b2ff LibUnicode: Make the generated .h and .cpp paths for emoji data optional
This is to allow people making emoji to run the generator to create the
expected commit message format.
2022-11-03 16:37:04 +00:00
Timothy Flynn
bd592480e4 Meta: Replace Bash script for generating emoji.txt with C++ generator
We currently have two build-time parsers for the UCD's emoji-test.txt
file. To prepare for future changes, this removes the Bash parser and
moves its functionality to the newer C++ parser.
2022-10-27 12:59:56 +02:00
demostanis
3e8b5ac920 AK+Everywhere: Turn bool keep_empty to an enum in split* functions 2022-10-24 23:29:18 +01:00
Timothy Flynn
b7ef36aa36 LibUnicode: Parse and generate custom emoji added for SerenityOS
Parse emoji from emoji-serenity.txt to allow displaying their names and
grouping them together in the EmojiInputDialog.

This also adds an "Unknown" value to the EmojiGroup enum. This will be
useful for emoji that aren't found in the UCD, or for when UCD downloads
are disabled.
2022-09-11 20:33:57 +01:00
Timothy Flynn
0aadd4869d LibUnicode: Generate emoji data for non-fully-qualified emoji
This allows us to find emoji data for files such as /res/emoji/U+A9.png.
U+00A9 is not fully-qualified (its full form is U+00A9 U+FE0F). But the
UCD has unqualified data for this code point; generating it allows us to
categorize these emoji appropriately in the EmojiInputDialog.
2022-09-11 20:33:57 +01:00
Timothy Flynn
b61eca0a1e LibUnicode: Parse and generate emoji code point data
According to TR #51, the "best definition of the full set [of emojis] is
in the emoji-test.txt file". This defines not only the emoji themselves,
but the order in which they should be displayed, and what "group" of
emojis they belong to.
2022-09-08 23:12:31 +01:00