Instead of polluting global namespace with definitions from
libkern/OSByteOrder.h and machine/endian.h on MacOS, just use AK
functions for conversions.
On argument swapping to put positional ones toward the end,
m_arg_index was pointing at "last arg index" + "skipped args" +
"consumed args" and thus was pointing ahead of the skipped ones.
m_arg_index now points after the current parsed option arguments.
Now it actually only exposes methods to allocate uninitialized storage
and to create substring with a shared superstring. All the details of
the memory layout are fully encapsulated.
The idea is to eventually get rid of protected state in StringBase. To
do this, we first need to remove all references to m_data and
m_short_string from String.
This starts separating memory management of string data and string
utilities like `String::formatted`. This would also allow to reuse the
same storage in `DeprecatedString` in the future.
This method (unlike can_read_line) ensures that the delimiter is present
in the buffer, and doesn't return true after eof when the delimiter is
absent.
`JsonValue::to_byte_string` has peculiar type-erasure semantics which is
not usually intended. Unfortunately, it also has a very stereotypical
name which does not warn about unexpected behavior. So let's prefix it
with `deprecated_` to make new code use `as_string` if it just wants to
get string value or `serialized<StringBuilder>` if it needs to do proper
serialization.
A bunch of users used consume_specific with a constant ByteString
literal, which can be replaced by an allocation-free StringView literal.
The generic consume_while overload gains a requires clause so that
consume_specific("abc") causes a more understandable and actionable
error.
The current algorithm is currently O(N^2) because we forward-search an
ever-increasing substring of the haystack. This implementation reduces
the search time of a 500,000-length string (where the desired needle is
at index 0) from 72 seconds to 2-3 milliseconds.
Caught by clang-format-17. Note that clang-format-16 is fine with this
as well (it leaves the const placement alone), it just doesn't perform
the formatting to east-const itself.
Using a vector to represent a list of painting commands results in many
reallocations, especially on pages with a lot of content.
This change addresses it by introducing a SegmentedVector, which allows
fast appending by representing a list as a sequence of fixed-size
vectors. Currently, this new data structure supports only the
operations used in RecordingPainter, which are appending and iterating.
Much of the UTF-8 data that we'll iterate over will be ASCII only,
and we can get a significant speed-up by simply having a fast path
when the iterator points at a byte that is obviously an ASCII character
(<= 0x7F).
Since we're already building up a percent-encoded ASCII-only string
in the internal parser buffer, there's no need to do a second UTF-8
validation pass before assigning each part of the parsed URL.
This makes URL parsing signficantly faster.
Instead of do a wrappy MUST(try_append_code_point()), we now inline
the UTF-8 encoding logic. This allows us to grow the buffer by the
right increment up front, and also removes a bunch of ErrorOr ceremony
that we don't care about.
Once we know that a code point must be a valid ASCII character,
we now cast it to `char` and avoid the expensive generic
StringView::contains(u32 code_point) checks.
This dramatically speeds up URL parsing.