`STDERR_FILENO` is pretty common but not part of the standard. On
Windows, in order to get a file number, one must use `_fileno` to
convert the `FILE *` to a file number. Adopt this pattern similar
to the Android path which uses platform specific operations.
We had numerous NiH-based implementations of audio formats and metadata
that we now no longer need because we either don't make use of the code,
or we replaced its implementation by FFmpeg.
First, this isn't actually helpful, as we no longer store 32-bit values
in JsonValue. They are stored as 64-bit values anyways.
But more imporatantly, there was a bug here when trying to coerce an i64
to an i32. All negative values were cast to an i32, without checking if
the value is below NumericLimits<i32>::min.
You can now build with STYLE_INVALIDATION_DEBUG and get a debug stream
of reasons why style invalidations are happening and where.
I've rewritten this code many times, so instead of throwing it away once
again, I figured we should at least have it behind a flag.
This change should move us forward toward emoji support, as we are no
longer limited by our own OpenType implementation, which was failing
to parse the TrueType Collection format used to store emoji fonts
(at least on macOS).
Some callers (LibJS) will want to control the size of the output buffer,
to decode up to a maximum length. They will also want to receive partial
results in the case of an error. This patch adds a method to provide
those capabilities, and makes the existing implementation use it.
Also give the Swift.String init routines an explict label when
constructing from AK String types, as this caused issues in a later
commit to have them both with `_ data`.
At the same time, simplify CMakeLists magic for libraries that want to
include Swift code in the library. The Lib-less name of the library is
now always the module name for the library with any Swift additions,
extensions, etc. All vfs overlays now live in a common location to make
finding them easier from CMake functions. A new pattern is needed for
the Lib-less modules to re-export their Cxx counterparts.
When using a configuration without a swift compiler, we need to no-op
the swift annotations. Other, cleverer solutions beyond the has include
all fell flat in the face of the clang modules implementation used by
swift to parse-once use-everywhere each module.
This requires pulling in some of the STL, but the result is that our
iterator is now STL Approved ™️ and our containers can be
auto-conformed to Swift protocols.
Otherwise, the following code would not compile:
constexpr Array<int, 3> array { 4, 5, 6 };
Vector<int> vector { 4, 5, 6 };
if (array == vector.span()) { }
We do such comparisons in tests quite a bit. But it currently doesn't
become an issue because of the way EXPECT_EQ copies its input parameters
to non-const locals. In a future patch, that copying will be removed,
and the compiler would otherwise complain about not finding a suitable
comparison operator.
The script pulls in a dependency on the `yaml` python package. Instead
of updating all the docs and CI jobs to account for this, let's guard
calling the script behind our experimental flag instead.
`swizzle` had the wrong operands, and the vector masking boolean logic
was incorrect in the internal `shuffle_or_0` implementation. `shuffle`
was previously implemented as a dynamic swizzle, when it uses an
immediate operand for lane indices in the spec.
The underlying CPU-specific instructions for operating on UTF-16 strings
behave differently for null inputs. Add an explicit check for this state
for consistency.
The underlying CPU-specific instructions for operating on UTF-8 strings
behave differently for null inputs. Add an explicit check for this state
for consistency.
Currently, invoking StringBuilder::to_string will re-allocate the string
data to construct the String. This is wasteful both in terms of memory
and speed.
The goal here is to simply hand the string buffer over to String, and
let String take ownership of that buffer. To do this, StringBuilder must
have the same memory layout as Detail::StringData. This layout is just
the members of the StringData class followed by the string itself.
So when a StringBuilder is created, we reserve sizeof(StringData) bytes
at the front of the buffer. StringData can then construct itself into
the buffer with placement new.
Things to note:
* StringData must now be aware of the actual capacity of its buffer, as
that can be larger than the string size.
* We must take care not to pass ownership of inlined string buffers, as
these live on the stack.
Utf16View currently assumes host endianness. Add support for specifying
either big or little endianness (which we mostly just pipe through to
simdutf). This will allow using simdutf facilities with LibTextCodec.
The one behavior difference is that we will now actually fail on invalid
code units with Utf16View::to_utf8(AllowInvalidCodeUnits::No). It was
arguably a bug that this wasn't already the case.
If no header includes the prototype of a function, then it cannot be
used from outside the translation unit it was defined in. In that case,
it should be marked as `static`, in order to avoid possible ODR
problems, unnecessary exported symbols, and allow the compiler to better
optimize those.
If this warning triggers in a function defined in a header, `inline`
needs to be added, otherwise if the header is included in more than one
TU, it will fail to link with a duplicate definition error.
The reason this diff got so big is that Lagom-only code wasn't built
with this flag even in Serenity times.
This necessitates marking bit_cast as ALWAYS_INLINE since emitting it as
a function call there will create an unnecessary potential SSE
registers -> plain registers/memory round-trip.
We currently have 2 base64 coders: one in AK, another in LibWeb for a
"forgiving" implementation. ECMA-262 has an upcoming proposal which will
require a third implementation.
Instead, let's use the base64 implementation that is used by Node.js and
recommended by the upcoming proposal. It handles forgiving decoding as
well.
Our users of AK's implementation should be fine with the forgiving
implementation. The AK impl originally had naive forgiving behavior, but
that was removed solely for performance reasons.
Using http://mattmahoney.net/dc/enwik8.zip (100MB unzipped) as a test,
performance of our old home-grown implementations vs. the simdutf
implementation (on Linux x64):
Encode Decode
AK base64 0.226s 0.169s
LibWeb base64 N/A 1.244s
simdutf 0.161s 0.047s
We no longer have multiple locations including AK (e.g. LibC). So let's
avoid awkwardly defining the AK library across multiple CMake files.
This is to allow more easily adding third-party dependencies to AK in
the future.
As recommended by the CMake docs, let's tolerate systems or setups that
don't have backtrace(3) in the `<execinfo.h>` header file, such as those
using libbacktrace directly.
This large commit also refactors LibWebView's process handling to use
a top-level Application class that uses a new WebView::Process class to
encapsulate the IPC-centric nature of each helper process.
This fixes some errors where too many bytes were allowed to be read for
signed integers of a smaller size (e.g. i32). The new parser doesn't
make 64-bit assumptions and now matches the generality of its unsigned
counterpart.
The [UTF-8](https://datatracker.ietf.org/doc/html/rfc3629#page-5)
standard says to reject strings with upper or lower surrogates. However,
in many standards, ECMAScript included, unpaired surrogates (and
therefore UTF-8 surrogates) are allowed in strings. So, this commit
extends the UTF-8 validation API with `AllowSurrogates`, which will
reject upper and lower surrogate characters.