There's no need to copy the result. We can also avoid increasing the
size of the output buffer by 1 for each written byte.
This reduces the runtime of `./bin/base64 -d enwik8.base64 >/dev/null`
from 0.917s to 0.632s.
(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
We don't really need the features provided by StringBuilder here, since
we know the exact size of the output. Avoiding StringBuilder avoids the
recurring capacity/size checks both within StringBuilder itself and its
internal ByteBuffer.
This reduces the runtime of `./bin/base64 enwik8 >/dev/null` from
0.976s to 0.428s.
(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
We know we are only appending ASCII characters to the StringBuilder, so
do not bother validating the result.
This reduces the runtime of `./bin/base64 enwik8 >/dev/null` from
1.192s to 0.976s.
(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
This encoding scheme comes from section 5 of RFC 4648, as an
alternative to the standard base64 encode/decode methods.
The only difference is that the last two characters are replaced
with '-' and '_', as '+' and '/' are not safe in URLs or filenames.
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.
This change has two main benefits:
* Moving AK back more towards being an agnostic library that can
be used between the kernel and userspace. URL has never really fit
that description - and is not used in the kernel.
* URL _should_ depend on LibUnicode, as it needs punnycode support.
However, it's not really possible to do this inside of AK as it can't
depend on any external library. This change brings us a little closer
to being able to do that, but unfortunately we aren't there quite
yet, as the code generators depend on LibCore.
This was copying the vector behind our backs, let's remove it and make
the copying explicit by putting it behind COWVector::mutable_at().
This is a further 64% performance improvement on Wasm validation.
Otherwise, we percent-encode negative signed chars incorrectly. For
example, https://www.strava.com/login contains the following hidden
<input> field:
<input name="utf8" type="hidden" value="✓" />
On submitting the form, we would percent-encode that field as:
utf8=%-1E%-64%-6D
Which would cause us to receive an HTTP 500 response. We now properly
percent-encode that field as:
utf8=%E2%9C%93
And can login to Strava :^)
We already have a helper to split a StringView by line while considering
"\n", "\r", and "\r\n". Add an analagous method to just count the number
of lines in the same manner.
If the BufferedStream is able to fill its entire circular buffer in
populate_read_buffer() and is later asked to read a line or read until
a delimiter, it could erroneously return EMSGSIZE if the caller's buffer
was smaller than the internal buffer. In this case, all we really care
about is whether the caller's buffer is big enough for however much data
we're going to copy into it. Which needs to take into account the
candidate.
The main difference between them is that IntrusiveBinaryHeap can
optionally maintain an index inside every stored node that allows
arbitrary nodes to be deleted.
This is a simple extension of GenericLexer, and is used in more than
just LibXML, so let's move it into AK.
The move also resolves a FIXME, which is removed in this commit.
This makes it possible to use MakeIndexSequqnce in functions like:
template<typename T, size_t N>
constexpr auto foo(T (&a)[N])
This means AK/StdLibExtraDetails.h must now include AK/Types.h
for size_t, which means AK/Types.h can no longer include
AK/StdLibExtras.h (which arguably it shouldn't do anyways),
which requires rejiggering some things.
(IMHO Types.h shouldn't use AK::Details metaprogramming at all.
FlatPtr doesn't necessarily have to use Conditional<> and ssize_t could
maybe be in its own header or something. But since it's tangential to
this PR, going with the tried and true "lift things that cause the
cycle up to the top" approach.)
Instead of polluting global namespace with definitions from
libkern/OSByteOrder.h and machine/endian.h on MacOS, just use AK
functions for conversions.
On argument swapping to put positional ones toward the end,
m_arg_index was pointing at "last arg index" + "skipped args" +
"consumed args" and thus was pointing ahead of the skipped ones.
m_arg_index now points after the current parsed option arguments.
Now it actually only exposes methods to allocate uninitialized storage
and to create substring with a shared superstring. All the details of
the memory layout are fully encapsulated.