The one behavior difference is that we will now actually fail on invalid
code units with Utf16View::to_utf8(AllowInvalidCodeUnits::No). It was
arguably a bug that this wasn't already the case.
If no header includes the prototype of a function, then it cannot be
used from outside the translation unit it was defined in. In that case,
it should be marked as `static`, in order to avoid possible ODR
problems, unnecessary exported symbols, and allow the compiler to better
optimize those.
If this warning triggers in a function defined in a header, `inline`
needs to be added, otherwise if the header is included in more than one
TU, it will fail to link with a duplicate definition error.
The reason this diff got so big is that Lagom-only code wasn't built
with this flag even in Serenity times.
This necessitates marking bit_cast as ALWAYS_INLINE since emitting it as
a function call there will create an unnecessary potential SSE
registers -> plain registers/memory round-trip.
We currently have 2 base64 coders: one in AK, another in LibWeb for a
"forgiving" implementation. ECMA-262 has an upcoming proposal which will
require a third implementation.
Instead, let's use the base64 implementation that is used by Node.js and
recommended by the upcoming proposal. It handles forgiving decoding as
well.
Our users of AK's implementation should be fine with the forgiving
implementation. The AK impl originally had naive forgiving behavior, but
that was removed solely for performance reasons.
Using http://mattmahoney.net/dc/enwik8.zip (100MB unzipped) as a test,
performance of our old home-grown implementations vs. the simdutf
implementation (on Linux x64):
Encode Decode
AK base64 0.226s 0.169s
LibWeb base64 N/A 1.244s
simdutf 0.161s 0.047s
We no longer have multiple locations including AK (e.g. LibC). So let's
avoid awkwardly defining the AK library across multiple CMake files.
This is to allow more easily adding third-party dependencies to AK in
the future.
As recommended by the CMake docs, let's tolerate systems or setups that
don't have backtrace(3) in the `<execinfo.h>` header file, such as those
using libbacktrace directly.
This large commit also refactors LibWebView's process handling to use
a top-level Application class that uses a new WebView::Process class to
encapsulate the IPC-centric nature of each helper process.
This fixes some errors where too many bytes were allowed to be read for
signed integers of a smaller size (e.g. i32). The new parser doesn't
make 64-bit assumptions and now matches the generality of its unsigned
counterpart.
The [UTF-8](https://datatracker.ietf.org/doc/html/rfc3629#page-5)
standard says to reject strings with upper or lower surrogates. However,
in many standards, ECMAScript included, unpaired surrogates (and
therefore UTF-8 surrogates) are allowed in strings. So, this commit
extends the UTF-8 validation API with `AllowSurrogates`, which will
reject upper and lower surrogate characters.