Commit graph

3669 commits

Author SHA1 Message Date
Dan Klishch
7e9dc9c1fd AK: Use bit_cast in SIMDExtras.h/AK::Detail::byte_reverse_impl
This necessitates marking bit_cast as ALWAYS_INLINE since emitting it as
a function call there will create an unnecessary potential SSE
registers -> plain registers/memory round-trip.
2024-07-17 09:56:13 -06:00
Hendiadyoin1
9c583154b0 AK: Add generic SIMD shuffle/reverse functions
(cherry picked from commit 1b8fd5c35afda8f797f1e8a39c332fa14950006e)
2024-07-17 09:56:13 -06:00
Hendiadyoin1
873b03f661 AK: Add generic SIMD vector load/store functions
(cherry picked from commit 27c386797df64b9c4dcbe6a27e57d9f54837e9b4)
2024-07-17 09:56:13 -06:00
Hendiadyoin1
9ee334e970 AK: Add introspection helpers to SIMD.h
(cherry picked from commit 8d6028d366c918b3656c0a4c6808a570dcecf8f4)
2024-07-17 09:56:13 -06:00
Timothy Flynn
f29c3684a6 AK: Enable ASSERT in debug builds only
NDEBUG is defined in release builds. So we want to enable the ASSERT
macro when it *isn't* defined.
2024-07-17 09:45:43 -06:00
Timothy Flynn
bfc9dc447f AK+LibWeb: Replace our home-grown base64 encoder/decoders with simdutf
We currently have 2 base64 coders: one in AK, another in LibWeb for a
"forgiving" implementation. ECMA-262 has an upcoming proposal which will
require a third implementation.

Instead, let's use the base64 implementation that is used by Node.js and
recommended by the upcoming proposal. It handles forgiving decoding as
well.

Our users of AK's implementation should be fine with the forgiving
implementation. The AK impl originally had naive forgiving behavior, but
that was removed solely for performance reasons.

Using http://mattmahoney.net/dc/enwik8.zip (100MB unzipped) as a test,
performance of our old home-grown implementations vs. the simdutf
implementation (on Linux x64):

                Encode    Decode
AK base64       0.226s    0.169s
LibWeb base64   N/A       1.244s
simdutf         0.161s    0.047s
2024-07-16 10:27:39 +02:00
Timothy Flynn
58dfe5424f AK: Make the AK library's CMake a bit more standard
We no longer have multiple locations including AK (e.g. LibC). So let's
avoid awkwardly defining the AK library across multiple CMake files.

This is to allow more easily adding third-party dependencies to AK in
the future.
2024-07-16 10:27:39 +02:00
Andreas Kling
df18a76ad2 AK: Add ASSERT() and ASSERT_NOT_REACHED() for debug-only assertions
Let's move towards using these for things that are "nice to check in
debug builds, but not essential".
2024-07-10 07:03:20 +02:00
Diego
aee2f25929 AK: Add remaining method to ConstrainedStream
Simply returns how many bytes can be read from the stream.
2024-07-09 14:22:31 +02:00
Tim Ledbetter
634f2f655b AK: Allow escaping of keys in SourceGenerator
This allows the opening and closing characters of the SourceGenerator
to be used in the source text to be used for purposes other than keys.
2024-07-09 11:21:07 +02:00
Andrew Kaster
fc7af577fc AK: Ignore -Wstring-op-overflow in another ByteBuffer instance
gcc 14.1 from Fedora 40 likes to warn on this on aarch64.
2024-07-07 15:56:59 +02:00
Salem Yaslem
ab82fc8993 LibCore: Support IPv6 for TCP and UDP connection 2024-07-05 14:26:22 -06:00
Dennis Camera
186057bf92 AK: Add TODO_PPC* assertions 2024-07-05 09:50:13 -06:00
Dennis Camera
ffe2f16c58 AK: Add AK_IS_ARCH defines for PowerPC CPU architecture 2024-07-05 09:50:13 -06:00
Dennis Camera
b54a1c6284 AK: Implement ShortString for big-endian 2024-07-05 09:49:23 -06:00
Dennis Camera
b4d13d060a AK: Fix {:c} formatter for big-endian 2024-07-05 09:48:15 -06:00
Dennis Camera
1bc44376c0 AK: Implement floating-point conversions for big-endian 2024-07-05 09:47:08 -06:00
Timothy Flynn
698a95d2de AK: Decode paired UTF-16 surrogates in a JSON string
For example, such use is seen on Twitter.
2024-07-04 14:16:16 +02:00
Timothy Flynn
c39a3fef17 AK: Make a couple of GenericLexer helper methods protected
We will want to use the exact behavior of these methods in JsonParser.
2024-07-04 14:16:16 +02:00
Andrew Kaster
002bef8635 AK+CMake: Use the find module to find the correct backtrace(3) header
As recommended by the CMake docs, let's tolerate systems or setups that
don't have backtrace(3) in the `<execinfo.h>` header file, such as those
using libbacktrace directly.
2024-07-01 10:15:24 -06:00
Andrew Kaster
4cc3d598f9 LibWebView+LibCore: Manage process lifecycle using a SIGCHLD handler
This large commit also refactors LibWebView's process handling to use
a top-level Application class that uses a new WebView::Process class to
encapsulate the IPC-centric nature of each helper process.
2024-07-01 18:10:56 +02:00
Ali Mohammad Pur
58fc901578 AK: Add a formatter for OwnPtr<T>
This formatter just prints the object out as a pointer.
2024-06-26 05:47:16 +02:00
Zaggy1024
bbd8a218a5 AK: Prevent overflow of the min when clamping unsigned values to signed
Also, add some tests for the cases that were broken before.
2024-06-24 12:41:32 -06:00
circl
9f7f6aa80c LibTLS: Remove key-logging debug feature
This attempted to save data into /home/anon even on Linux
2024-06-24 09:45:41 -06:00
Diego
596dd5252d AK: Read signed LEB128 integers without 64-bit assumptions
This fixes some errors where too many bytes were allowed to be read for
signed integers of a smaller size (e.g. i32). The new parser doesn't
make 64-bit assumptions and now matches the generality of its unsigned
counterpart.
2024-06-18 16:58:33 +02:00
Andreas Kling
b88e0eb50a AK: Remove unused Complex.h 2024-06-18 12:00:14 +02:00
Andreas Kling
fe9af7c972 AK: Remove unused StackUnwinder.h 2024-06-18 12:00:14 +02:00
Andreas Kling
fe1aec124e AK: Remove unused ArbitrarySizedEnum class 2024-06-18 12:00:14 +02:00
Andreas Kling
d8f2a885f9 AK: Remove unused JsonPath class 2024-06-18 12:00:14 +02:00
Andreas Kling
7f5e960b72 AK: Remove unused UUID class 2024-06-18 12:00:14 +02:00
Andreas Kling
47287d2cf1 AK: Remove kstdio.h and dbgputstr()
We can just write directly to stderr in the one place this was used.
2024-06-18 12:00:14 +02:00
Andreas Kling
6df5785fc4 AK: Remove unused PrintfImplementation.h 2024-06-18 12:00:14 +02:00
Tim Ledbetter
5ca2f4dfd7 Everywhere: Remove all KERNEL #defines 2024-06-18 09:36:25 +02:00
Andreas Kling
1039acca8c LibGfx: Remove JPEG2000 image format support
This format is not supported by other browsers.
2024-06-17 21:57:35 +02:00
Andreas Kling
a34a5af939 LibGfx: Remove ILBM image format support
This format is not supported by other browsers.
2024-06-17 21:57:35 +02:00
Andreas Kling
b6daddb088 LibGfx: Remove JBIG2 image format support
This format is not supported by other browsers.
2024-06-17 21:57:35 +02:00
Andreas Kling
681a2ac14e LibGfx: Remove support for the various "portable" image formats
These formats are not supported by other browsers.
2024-06-17 21:57:35 +02:00
Andreas Kling
7141319a7c LibGfx: Remove DDS image format support
This format is not supported by other browsers.
2024-06-17 21:57:35 +02:00
Andreas Kling
2a888ca626 LibGfx: Remove home-grown JPEG codec in favor of libjpeg-turbo 2024-06-17 17:59:54 +02:00
Daniel Bertalan
397774d422 Everywhere: Remove usages of template keyword with no parameter list
These were made invalid with P1787, and Clang (19) trunk started warning
on them with https://github.com/llvm/llvm-project/pull/80801.
2024-06-16 07:19:56 -04:00
Diego
7560b640f3 AK: Add AllowSurrogates to UTF-8 validator
The [UTF-8](https://datatracker.ietf.org/doc/html/rfc3629#page-5)
standard says to reject strings with upper or lower surrogates. However,
in many standards, ECMAScript included, unpaired surrogates (and
therefore UTF-8 surrogates) are allowed in strings. So, this commit
extends the UTF-8 validation API with `AllowSurrogates`, which will
reject upper and lower surrogate characters.
2024-06-09 12:16:32 +02:00
circl
666f7338a0 Meta+AK: Clear out unused debug macro definitions 2024-06-09 10:48:19 +02:00
Timothy Flynn
8362c073f3 Everywhere: Remove LibSQL, SQLServer, and the sql REPL :^)
It is now entirely unused and replaced by sqlite3.
2024-06-06 11:27:03 -04:00
Andreas Kling
6321e97b09 AK: Remove various unused things 2024-06-04 09:19:39 +02:00
Andreas Kling
e70d96e4e7 Everywhere: Remove a lot more things we don't need 2024-06-03 10:53:53 +02:00
Tim Ledbetter
1a4fbfe495 Everywhere: Remove references to the kernel 2024-06-03 10:53:53 +02:00
Timothy Flynn
fe3fde2411 AK+LibUnicode: Implement a case-insensitive variant of find_byte_offset
The existing String::find_byte_offset is case-sensitive. This variant
allows performing searches using Unicode-aware case folding.
2024-06-01 07:37:54 +02:00
Daniel Bertalan
637ccacce5 AK: Enable format string checking in Clang builds
Format string checking was disabled in Clang-based builds due to a
compiler bug: https://github.com/llvm/llvm-project/issues/51182. Now
that the requirement has been raised to Clang 17, that is no longer
necessary.

This has been tested to work correctly with Apple Clang 15.0.0 (which is
the *least modern* supported compiler), as well as CLion 2024.1's
bundled Clangd.
2024-05-29 13:34:15 -06:00
Matthew Olsson
e0d6afbabe ClangPlugins: Invert the lambda detection escape mechanism
Instead of being opt-out with NOESCAPE, it is now opt-in with ESCAPING.
Opt-out is ideal, but unfortunately this was extremely noisy when
compiling the entire codebase. Escaping functions are rarer than non-
escaping ones, so let's just go with that for now.

This also allows us to gradually add heuristics for detecting missing
ESCAPING annotations and emitting them as errors. It also nicely matches
the spelling that Swift uses (@escaping), which is where this idea
originally came from.
2024-05-22 21:55:34 -06:00
Matthew Olsson
a5f4c9a632 AK+Userland: Remove NOESCAPE
See the next commit for an explanation
2024-05-22 21:55:34 -06:00
Dan Klishch
38b51b791e AK+Kernel+LibVideo: Include workarounds for missing P0960 only in Xcode
With this change, ".*make.*" function family now does error checking
earlier, which improves experience while using clangd. Note that the
change also make them instantiate classes a bit more eagerly, so in
LibVideo/PlaybackManager, we have to first define SeekingStateHandler
and only then make() it.

Co-Authored-By: stelar7 <dudedbz@gmail.com>
2024-05-21 14:24:59 +02:00
Tim Ledbetter
d0d81e470e AK: Fix off by one error in integral ceil_log2()
Previously, certain values of `ceil_log2(x)` would be 1 smaller than
`ceil(log2(x))`.
2024-05-21 09:31:17 +02:00
Dan Klishch
be36dbce7d AK: Don't put element count next to heap-allocated data in FixedArray
This not only makes code easier to follow but also makes it faster.
2024-05-18 18:30:42 +02:00
Lucas CHOLLET
c6e4563489 AK: Export Statistics to the global namespace 2024-05-18 18:30:07 +02:00
Andreas Kling
b2e6843055 LibJS+AK: Fix integer overflow UB on (any Int32 - -2147483648)
It wasn't safe to use addition_would_overflow(a, -b) to check if
subtraction (a - b) would overflow, since it doesn't cover this case.

I don't know why we didn't have subtraction_would_overflow(), so this
patch adds it. :^)
2024-05-18 18:11:50 +02:00
Sönke Holz
b6cc95c38e AK: Add a function for frame pointer-based stack unwinding
Instead of duplicating stack unwinding code everywhere, introduce a new
AK helper to unwind the stack in a generic way.
2024-05-14 14:02:06 -06:00
ptrcnull
13e44ab035 AK: Add stack size fixup for musl libc
Fixes #16681
2024-05-14 13:56:45 -06:00
Andreas Kling
6b2b90d2b0 AK: Remove AK_HAS_CONDITIONALLY_TRIVIAL
Code behind this appears to compile nicely with Clang 17 and later.
2024-05-10 15:03:24 +00:00
implicitfield
f923016e0b AK: Add reinterpret_as_octal()
This is useful for parsing user-provided integers that should be
interpreted as octals.
2024-05-07 16:54:27 -06:00
Abuneri
b5bed37074 AK: Replace FP math in is_power_of with a purely integral algorithm
The previous naive approach was causing test failures because of
rounding issues in some exotic environments. In particular, MSVC
via MSBuild
2024-05-07 16:43:34 -06:00
Andreas Kling
ebe6ec6069 AK: Check for u32 overflow in String::repeated()
I don't know why this was checking for size_t overflow, but it was
tripping up ASAN malloc() checks by passing a way-too-large size.
2024-05-07 09:15:40 +02:00
Nico Weber
c421a3d7ce AK: Add missing using statements to Find.h 2024-05-06 17:32:19 +02:00
Sergey Bugaev
0bb37f9c0e AK: Include <features.h> before checking for platform macros
AK/Platform.h did not include any other header file, but expected
various macros to be defined. While many of the macros checked here are
predefined by the compiler (i.e. GCC's TARGET_OS_CPP_BUILTINS), some
may be defined by the system headers instead. In particular, so is
__GLIBC__ on glibc-based systems.

We have to include some system header for getting __GLIBC__ (or not).
It could be possible to include something relatively small and
innocuous, like <string.h> for example, but that would still clutter
the name space and make other code that would use <string.h>
functionality, but forget to include it, build on accident; we wouldn't
want that. At the end of the day, the header that actually defines
__GLIBC__ (or not) is <features.h>. It's typically included from other
glibc headers, and not by user code directly, which makes it unlikely
to mask other code accidentlly forgetting to include it, since it
wouldn't include it in the first place.

<features.h> is not defined by POSIX and could be missing on other
systems (but it seems to be present at least when using either glibc or
musl), so guard its inclusion with __has_include().

Specifically, this fixes AK/StackInfo.cpp not picking up the glibc code
path in the cross aarch64-gnu (GNU/Hurd on 64-bit ARM) Lagom build.
2024-05-02 07:46:53 -06:00
Tim Ledbetter
8b01abf9f7 AK: Don't move trivially copyable types in BufferedStream methods 2024-04-30 13:22:56 +02:00
Liav A.
122c82a2a1 AK: Add the SetOnce class
The SetOnce class is meant to be used as one-time set boolean flag,
which is useful for flags that change only once and then stay immutable
forever.
2024-04-26 23:46:23 -06:00
Nico Weber
88d0702763 AK: Make ceil_div() handle one argument being negative correctly
`ceil_div(-1, 2)` used to return -1.
Now it returns 0, which is the correct ceil(-0.5).

(C++'s division semantics have floor semantics for numbers > 0,
but ceil semantics for numbers < 0.)

This will be important for the JPEG2000 decoder eventually.
2024-04-27 07:09:08 +02:00
Timothy Flynn
fecd08ce64 Everywhere: Remove 'clang-format off' comments that are no longer needed 2024-04-24 16:50:01 -04:00
Timothy Flynn
ec492a1a08 Everywhere: Run clang-format
The following command was used to clang-format these files:

    clang-format-18 -i $(find . \
        -not \( -path "./\.*" -prune \) \
        -not \( -path "./Base/*" -prune \) \
        -not \( -path "./Build/*" -prune \) \
        -not \( -path "./Toolchain/*" -prune \) \
        -not \( -path "./Ports/*" -prune \) \
        -type f -name "*.cpp" -o -name "*.mm" -o -name "*.h")

There are a couple of weird cases where clang-format now thinks that a
pointer access in an initializer list, e.g. `m_member(ptr->foo)`, is a
lambda return statement, and it puts spaces around the `->`.
2024-04-24 16:50:01 -04:00
kleines Filmröllchen
8443d0a74d AK: Use common ComponentType integer type for float bitfields
This allows us to easily use an appropriate integer type when performing
float bitfield operations.

This change also adds a comment about the technically-incorrect 80-bit
extended float mantissa field.
2024-04-23 19:18:09 -06:00
Andrew Kaster
913cffe928 AK: Add workaround for faulty Sanitizer warning on gcc 13+ in Atomic
gcc can't seem to figure out that the address of a member variable of
AK::Atomic<u32> in AtomicRefCounted cannot be null when fetch_sub-ing.
Add a bogus condition to convince the compiler that it can't be null.
2024-04-23 15:37:07 -06:00
dgaston
08aaf4fb07 AK: Add methods to BufferedStream to resize the user supplied buffer
These changes allow lines of arbitrary length to be read with
BufferedStream. When the user supplied buffer is smaller than
the line, it will be resized to fit the line. When the internal
buffer in BufferedStream is smaller than the line, it will be
read into the user supplied buffer chunk by chunk with the
buffer growing accordingly.

Other behaviors match the behavior of the existing read_line method.
2024-04-21 11:46:55 +02:00
Jess
ecb7d4b40f LibJS: Throw RangeError in StringPrototype::repeat if OOM
currently crashes with an assertion failure in `String::repeated` if
malloc can't serve a `count * input_size` sized request, so add
`String::repeated_with_error` to propagate the error.
2024-04-20 19:23:46 -04:00
Andrew Kaster
1e749d023a AK: Add fallible dequeue method to Queue 2024-04-19 16:38:55 -04:00
Dan Klishch
5ed7cd6e32 Everywhere: Use east const in more places
These changes are compatible with clang-format 16 and will be mandatory
when we eventually bump clang-format version. So, since there are no
real downsides, let's commit them now.
2024-04-19 06:31:19 -04:00
implicitfield
1159cd9390 AK+Kernel+LibSanitizer: Implement __ubsan_handle_function_type_mismatch 2024-04-18 13:14:33 -06:00
Space Meyer
fdc0328ce3 Kernel: Exclude individual functions from coverage instrumentation
Sticking this to the function source has multiple benefits:
- We instrument more code, by not excluding entire files.
- NO_SANITIZE_COVERAGE can be used in Header files.
- Keeping the info with the source code, means if a function or
  file is moved around, the NO_SANITIZE_COVERAGE moves with it.
2024-04-15 21:16:22 -06:00
Space Meyer
7d8431dcfc AK: Toolchain dependend instrumentation __attribute__
GCC sometimes complains about the The `no_sanitize("address")` syntax,
and clang sometimes complains abouth the `no_sanitize_address` syntax.
Both claim to support both, so that's neat!
2024-04-15 21:16:22 -06:00
Andrew Kaster
8c5e64e686 Ladybird+LibWebView: Add mechanism to get Mach task port for helpers
On macOS, it's not trivial to get a Mach task port for your children.
This implementation registers the chrome process as a well-known
service with launchd based on its pid, and lets each child process
send over a reference to its mach_task_self() back to the chrome.

We'll need this Mach task port right to get process statistics.
2024-04-09 16:43:27 -06:00
Andrew Kaster
4a9546a7c8 AK: Add platform macro for Mach-based operating system environments 2024-04-09 16:43:27 -06:00
Matthew Olsson
76fa127cbf LibJSGCVerifier: Detect stack-allocated ref captures in lambdas
For example, consider the following code snippet:

    Vector<Function<void()>> m_callbacks;
    void add_callback(Function<void()> callback)
    {
    	m_callbacks.append(move(callback));
    }

    // Somewhere else...
    void do_something()
    {
    	int a = 10;
    	add_callback([&a] {
            dbgln("a is {}", a);
    	});
    } // Oops, "a" is now destroyed, but the callback in m_callbacks
      // has a reference to it!

We now statically detect the capture of "a" in the lambda above and flag
it as incorrect. Note that capturing the value implicitly with a capture
list of `[&]` would also be detected.

Of course, many functions that accept Function<...> don't store them
anywhere, instead immediately invoking them inside of the function. To
avoid a warning in this case, the parameter can be annotated with
NOESCAPE to indicate that capturing stack variables is fine:

    void do_something_now(NOESCAPE Function<...> callback)
    {
    	callback(...)
    }

Lastly, there are situations where the callback does generally escape,
but where the caller knows that it won't escape long enough to cause any
issues. For example, consider this fake example from LibWeb:

    void do_something()
    {
    	bool is_done = false;
    	HTML::queue_global_task([&] {
            do_some_work();
            is_done = true;
        });
    	HTML::main_thread_event_loop().spin_until([&] {
            return is_done;
        });
    }

In this case, we know that the lambda passed to queue_global_task will
be executed before the function returns, and will not persist
afterwards. To avoid this warning, annotate the type of the capture
with IGNORE_USE_IN_ESCAPING_LAMBDA:

    void do_something()
    {
   	IGNORE_USE_IN_ESCAPING_LAMBDA bool is_done = false;
    	// ...
    }
2024-04-09 09:10:44 +02:00
stelar7
3f1019b089 AK: Add XOR method to ByteBuffer 2024-04-08 09:34:49 -06:00
Shannon Booth
8c34842962 AK: Simplify and optimize ASCIICaseInsensitiveFlyStringTraits::equals
The member function `equals_ignoring_ascii_case` has a fast path which
will return early if it is the same FlyString instance.
2024-04-06 09:17:51 -04:00
Timothy Flynn
c5c5e52c24 AK: Disallow calling ByteString methods that return a view on rvalues
This prevents, for example:

    StringView view = ByteString { "foo" }.view();

This prevents a class of potential UAF.
2024-04-04 11:23:21 +02:00
Timothy Flynn
de80f544d8 AK: Disallow calling String methods that return a view on rvalues
This prevents, for example:

    StringView view = "foo"_string.bytes_as_string_view();

This prevents a class of potential UAF.
2024-04-04 11:23:21 +02:00
Timothy Flynn
b5f22b6e90 AK+Userland: Remove some needlessly explicit conversions to StringView 2024-04-04 11:23:21 +02:00
Timothy Flynn
e0bddbb65e AK: Add a Stream::write_until_depleted overload for string types
All string types currently have to invoke this function as:

    stream.write_until_depleted("foo"sv.bytes());

This isn't very ergonomic, but more importantly, this overload will
allow String/ByteString instances to be written in this manner once
e.g. `ByteString::view() &&` is deleted.
2024-04-04 11:23:21 +02:00
Timothy Flynn
c7ea710b55 AK: Return a constant reference from JsonValue::as_string
Rather than making a copy of the held string, this returns a reference
so that expressions like the following:

    do_something(json.as_string().view());

are not disallowed once `ByteString::view() &&` is deleted.
2024-04-04 11:23:21 +02:00
Andreas Kling
3881717103 LibJS+AK: Register GC memory as root regions for LeakSanitizer
This should fix the gigantic list of false positives dumped by
LeakSanitizer on exit .
2024-04-03 12:41:02 +02:00
Hendiadyoin1
877cfe1890 AK: Move generalized internals of UFixedBigIntDivision to BigIntBase
We will reuse this in LibCrypto

Co-Authored-By: Dan Klishch <danilklishch@gmail.com>
2024-03-25 14:26:29 -06:00
Hendiadyoin1
9045840e33 AK: Use correct wide integer type for qhat check in UFixedBigIntDivision
Previously, we were assuming that were always on a 64-bit platform,
which is not 100% correct
2024-03-25 14:26:29 -06:00
Hendiadyoin1
f95abe8c0e AK: Make BigIntBase more agnostic to non native word sizes
This will allow us to use it in Crypto::UnsignedBigInteger, which always
uses 32 bit words
2024-03-25 14:26:29 -06:00
Nico Weber
1ab28276f6 LibGfx: Add the start of a JPEG2000 loader
JPEG2000 is the last image format used in PDF filters that we
don't have a loader for. Let's change that.

This adds all the scaffolding, but no actual implementation yet.
2024-03-25 20:35:00 +01:00
Nico Weber
07750774cf AK: Allow creating a MaybeOwned<Superclass> from a MaybeOwned<Subclass> 2024-03-25 20:35:00 +01:00
Andreas Kling
2b8a920a7c AK: Don't blindly use SipHash as default hash function
Although it has some interesting properties, SipHash is brutally slow
compared to our previous hash function. Since its introduction, it has
been highly visible in every profile of doing anything interesting with
LibJS or LibWeb.

By switching back, we gain a 10x speedup for 32-bit hashes, and "only"
a 3x speedup for 64-bit hashes.

This comes out to roughly 1.10x faster HashTable insertion, and roughly
2.25x faster HashTable lookup. Hashing is no longer at the top of
profiles and everything runs measurably faster.

For security-sensitive hash tables with user-controlled inputs, we can
opt into SipHash selectively on a case-by-case basis. The vast majority
of our uses don't fit that description though.
2024-03-25 12:39:23 +01:00
Timothy Flynn
7e38653492 AK: Reject invalid Base64 encoded string lengths 2024-03-25 08:13:27 +01:00
Timothy Flynn
4ecf4c7617 AK: Compute the exact size of decoded Base64 strings 2024-03-25 08:13:27 +01:00
Timothy Flynn
754ff41b9c AK: Remove whitespace skipping feature from AK's Base64 decoder
This was added in commit f2663f477f as a
partial implementation of what is now LibWeb's forgiving Base64 decoder.
All use cases within LibWeb that require whitespace skipping now use
that implementation instead.

Removing this feature from AK allows us to know the exact output size of
a decoded Base64 string. We can still trim whitespace at the start and
end of the input though; for example, this is useful when reading from a
file that may have a newline at the end of the file.
2024-03-25 08:13:27 +01:00
Timothy Flynn
690db10463 AK: Convert Base64 template parameters to regular function parameters
The generated function name is otherwise very long, which makes stack
traces a bit more difficult to sift through.
2024-03-25 08:13:27 +01:00
Timothy Flynn
f292746134 AK: Convert some west-consts to east-const in Base64.cpp
Caught by clang-format-17. Note that clang-format-16 is fine with this
as well (it leaves the const placement alone), it just doesn't perform
the formatting to east-const itself.
2024-03-25 08:13:27 +01:00
Andreas Kling
3bdfca1119 AK: Make FlyString::from_utf8*() avoid allocation if possible
If we already have a FlyString instantiated for the given string,
look that up and return it instead of making a temporary String just to
use as a key into the FlyString table.
2024-03-24 13:28:24 +01:00
Andreas Kling
8d7a1e5654 LibWeb: Skip some redundant UTF-8 validation in CSS tokenizer
If we're just adding code points to a StringBuilder, there's no need to
revalidate the result.
2024-03-24 13:28:24 +01:00
Andreas Kling
a88799c032 AK: Remove excessive hashing caused by FlyString table
Before this change, the global FlyString table looked like this:

    HashMap<StringView, Detail::StringBase>

After this change, we have:

    HashTable<Detail::StringData const*, FlyStringTableHashTraits>

The custom hash traits are used to extract the stored hash from
StringData which avoids having to rehash the StringView repeatedly like
we did before.

This necessitated a handful of smaller changes to make it work.
2024-03-24 13:28:24 +01:00
Andreas Kling
8bfad24708 AK: Move AK::Detail::StringData to its own header file
This will allow us to access it from FlyString.cpp
2024-03-24 13:28:24 +01:00
Dan Klishch
45a0ba2167 AK: Introduce AK::enumerate
Co-Authored-By: Tim Flynn <trflynn89@pm.me>
2024-03-23 09:02:58 -04:00
Stanisław Wiśniewski
994fe0b89f AK: Use else if constexpr in explode_byte() 2024-03-21 14:35:20 -06:00
Timothy Flynn
81ad6de41b AK: Avoid creating an intermediate buffer when decoding a Base64 string
There's no need to copy the result. We can also avoid increasing the
size of the output buffer by 1 for each written byte.

This reduces the runtime of `./bin/base64 -d enwik8.base64 >/dev/null`
from 0.917s to 0.632s.

(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
2024-03-21 15:53:46 +01:00
Timothy Flynn
0fd7ad09a0 AK: Avoid StringBuilder when creating a Base64-encoded string
We don't really need the features provided by StringBuilder here, since
we know the exact size of the output. Avoiding StringBuilder avoids the
recurring capacity/size checks both within StringBuilder itself and its
internal ByteBuffer.

This reduces the runtime of `./bin/base64 enwik8 >/dev/null` from
0.976s to 0.428s.

(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
2024-03-21 15:53:46 +01:00
Timothy Flynn
5f5b8ee9bb AK: Do not perform UTF-8 validation on Base64-encoded strings
We know we are only appending ASCII characters to the StringBuilder, so
do not bother validating the result.

This reduces the runtime of `./bin/base64 enwik8 >/dev/null` from
1.192s to 0.976s.

(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
2024-03-21 15:53:46 +01:00
Andrew Kaster
e9b16970fe AK: Add base64url encoding and decoding methods
This encoding scheme comes from section 5 of RFC 4648, as an
alternative to the standard base64 encode/decode methods.

The only difference is that the last two characters are replaced
with '-' and '_', as '+' and '/' are not safe in URLs or filenames.
2024-03-20 12:18:57 -04:00
Shannon Booth
e800605ad3 AK+LibURL: Move AK::URL into a new URL library
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.

This change has two main benefits:
 * Moving AK back more towards being an agnostic library that can
   be used between the kernel and userspace. URL has never really fit
   that description - and is not used in the kernel.
 * URL _should_ depend on LibUnicode, as it needs punnycode support.
   However, it's not really possible to do this inside of AK as it can't
   depend on any external library. This change brings us a little closer
   to being able to do that, but unfortunately we aren't there quite
   yet, as the code generators depend on LibCore.
2024-03-18 14:06:28 -04:00
Andreas Kling
6724f840cd AK: Early return from empty hash table lookups to avoid hashing
When calling get() or find() on an empty HashTable or HashMap, we can
avoid hashing the sought-after key.
2024-03-16 14:27:59 +01:00
Timothy Flynn
e4213f5767 AK: Generalize Span::contains_slow to use the Traits infrastructure
This allows, for example, checking if a Span<String> contains a value
without having to allocate a String.
2024-03-16 08:42:33 +01:00
Timothy Flynn
faf4ba63c2 AK: Don't use east-constexpr in Span methods 2024-03-16 08:42:33 +01:00
Ali Mohammad Pur
d451f84f31 LibCrypto: Add a minimal DER encoder
Progress towards #23562.
2024-03-16 01:17:02 -06:00
Andreas Kling
d125a76f85 AK: Make FlyString-to-FlyString comparison inline & trivial
This should never boil down to more than a machine word comparison.
2024-03-14 12:42:08 +01:00
Ali Mohammad Pur
8003bde03d AK+LibRegex+LibWasm: Remove the non-const COWVector::operator[]
This was copying the vector behind our backs, let's remove it and make
the copying explicit by putting it behind COWVector::mutable_at().
This is a further 64% performance improvement on Wasm validation.
2024-03-12 17:10:47 +01:00
Ali Mohammad Pur
cefe177a56 AK+LibRegex: Move COWVector to AK
This is about to gain a new user, so move it to AK.
2024-03-12 17:10:47 +01:00
Timothy Flynn
e3b5e24ce0 AK: Iterate the bytes of a URL query with an unsigned type
Otherwise, we percent-encode negative signed chars incorrectly. For
example, https://www.strava.com/login contains the following hidden
<input> field:

    <input name="utf8" type="hidden" value="✓" />

On submitting the form, we would percent-encode that field as:

    utf8=%-1E%-64%-6D

Which would cause us to receive an HTTP 500 response. We now properly
percent-encode that field as:

    utf8=%E2%9C%93

And can login to Strava :^)
2024-03-10 15:17:31 +01:00
Nico Weber
58838db445 LibGfx: Add the start of a JBIG2 loader
JBIG2 is infamous for two things:

1. It's used in xerox scanners were it falsifies scanned numbers:

https://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_are_switching_written_numbers_when_scanning

2. It was allegedly used in an iOS zero day, in a very cool way:

https://googleprojectzero.blogspot.com/2021/12/a-deep-dive-into-nso-zero-click.html

Needless to say, we need support for it in Serenity. (...because it's
used in PDF files.)

This adds all the scaffolding, but no actual implementation yet.

It's enough for `file` to print the mime type of .jb2 files, but `image`
can't do anything with the files yet.
2024-03-09 16:01:22 +01:00
Timothy Flynn
82ea53cf10 AK: Add a StringView method to count the number of lines in a string
We already have a helper to split a StringView by line while considering
"\n", "\r", and "\r\n". Add an analagous method to just count the number
of lines in the same manner.
2024-03-08 14:43:33 -05:00
Timothy Flynn
07a27b2ec0 AK: Replace the boolean parameter of StringView::lines with a named enum 2024-03-08 14:43:33 -05:00
Matthew Olsson
a511f1ef85 AK: Add HashMap::ensure_capacity 2024-03-06 07:45:56 +01:00
Filiph Siitam Sandström
fd694e8672 AK+Lagom: Make it possible to build for iOS
This commit makes it possible to build AK and most of Lagom for iOS,
based on the work for the Ladybird build demoed on discord:
https://discord.com/channels/830522505605283862/830525031720943627/1211987732646068314
2024-03-03 13:13:42 -07:00
Hendiadyoin1
79fd8eb28d AK/HashMap: Use structured bindings when iterating over itself 2024-03-01 14:05:53 -07:00
Nico Weber
f8b8d1b3be AK: Add is_ascii_uppercase_hex_digit() 2024-03-01 14:17:42 +01:00
Timothy Flynn
d878975f95 AK+LibJS: Remove OFFSET_OF and its users
With the LibJS JIT removed, let's not expose pointers to internal
members.
2024-02-29 09:00:00 +01:00
Andrew Kaster
21ac431fac AK: Allow reading from EOF buffered streams better in read_line()
If the BufferedStream is able to fill its entire circular buffer in
populate_read_buffer() and is later asked to read a line or read until
a delimiter, it could erroneously return EMSGSIZE if the caller's buffer
was smaller than the internal buffer. In this case, all we really care
about is whether the caller's buffer is big enough for however much data
we're going to copy into it. Which needs to take into account the
candidate.
2024-02-26 13:16:27 -07:00
Dan Klishch
ba24e86fdd AK: Introduce IntrusiveBinaryHeap and reimplement BinaryHeap using it
The main difference between them is that IntrusiveBinaryHeap can
optionally maintain an index inside every stored node that allows
arbitrary nodes to be deleted.
2024-02-25 17:24:36 -07:00
Hendiadyoin1
38cb5444d9 AK: Make StringView::for_each_split_view() aware of IterationDecision 2024-02-24 16:43:44 -07:00
Dan Klishch
8ac0e3f0e5 AK+LibJS: Remove null state from DeprecatedFlyString :^) 2024-02-24 15:06:52 -07:00
Dan Klishch
061f902f95 AK+Userland: Introduce ByteString::create_and_overwrite
And replace two users of raw StringImpl with it.
2024-02-24 15:06:52 -07:00
Ali Mohammad Pur
bc301b6f40 AK+LibXML+JSSpecCompiler: Move LineTrackingLexer to AK
This is a simple extension of GenericLexer, and is used in more than
just LibXML, so let's move it into AK.
The move also resolves a FIXME, which is removed in this commit.
2024-02-16 15:26:43 +01:00
Lucas CHOLLET
cbfea68ed8 AK: Add BigEndianInputBitStream::bits_until_next_byte_boundary() 2024-02-12 14:08:56 +01:00
Nico Weber
d84b69ace9 AK: Add to_array()
This is useful if you want an array with an explicit type but still
want its size to be inferred.
2024-02-11 18:53:00 +01:00
Nico Weber
10216e1743 AK: Remove a stray static
No behavior change.
2024-02-11 18:53:00 +01:00
Nico Weber
4409b33145 AK: Make IndexSequence use size_t
This makes it possible to use MakeIndexSequqnce in functions like:

    template<typename T, size_t N>
    constexpr auto foo(T (&a)[N])

This means AK/StdLibExtraDetails.h must now include AK/Types.h
for size_t, which means AK/Types.h can no longer include
AK/StdLibExtras.h (which arguably it shouldn't do anyways),
which requires rejiggering some things.

(IMHO Types.h shouldn't use AK::Details metaprogramming at all.
FlatPtr doesn't necessarily have to use Conditional<> and ssize_t could
maybe be in its own header or something. But since it's tangential to
this PR, going with the tried and true "lift things that cause the
cycle up to the top" approach.)
2024-02-11 18:53:00 +01:00
Tim Ledbetter
4a7236cabf Everywhere: Prefer _string when constructing strings from literals 2024-02-08 11:01:10 -05:00
Dan Klishch
88af15d513 AK: Store JsonValue's value in AK::Variant 2024-02-08 08:04:05 -07:00
Andrew Kaster
bc9c710904 LibWeb: Hide WebDriver::match_route debug behind its own flag
When enabling WEBDRIVER_DEBUG globally, this function's debug spam
overpowers the rest of the useful logs.
2024-02-08 15:53:46 +01:00
Dan Klishch
677bcea771 ntpquery: Use AK::convert_between_host_and_network_endian
Instead of polluting global namespace with definitions from
libkern/OSByteOrder.h and machine/endian.h on MacOS, just use AK
functions for conversions.
2024-02-06 04:37:47 -07:00
vincent-rg
a9df60ff1c AK: Update OptionParser::m_arg_index by substracting skipped args
On argument swapping to put positional ones toward the end,
m_arg_index was pointing at "last arg  index" + "skipped args" +
"consumed args" and thus was pointing ahead of the skipped ones.

m_arg_index now points after the current parsed option arguments.
2024-02-06 00:08:30 +01:00
Dan Klishch
3e43d15440 Everywhere: Prefer VERIFY over assert() 2024-02-05 07:03:53 -05:00
Nico Weber
41f57a5477 AK: Remove the SIMD version of rsqrt() too, for good measure
No strong reason to remove this one, other than that it's also unused.
2024-01-30 10:02:33 +01:00
Nico Weber
a1f70b39fa AK: Remove rsqrt()
At least on arm64, this isn't very preciese:
https://github.com/SerenityOS/serenity/issues/22739#issuecomment-1912909835

It is also now unused.
2024-01-30 10:02:33 +01:00
Shannon Booth
c6319d68c3 AK: Introduce EquivalentFunctionType
This allows you to get the type from a function from some given
callable 'T'.

Co-Authored-By: Ali Mohammad Pur <mpfard@serenityos.org>
2024-01-27 21:40:25 -05:00
Ali Mohammad Pur
0e61d039c9 AK: Use IsSame<FlatPtr, T> instead of __LP64__ to guess FlatPtr's type
Instead of playing the guessing game, simply use whatever type FlatPtr
itself resolves to.
2024-01-28 04:30:33 +03:30
Sam Atkins
388856dc7e AK+Userland: Return String from human_readable_size() functions 2024-01-25 09:07:32 +01:00
Sam Atkins
7e8cfb60eb AK+Userland: Return String from human_readable_[digital_]time() 2024-01-25 09:07:32 +01:00
Dan Klishch
870a947040 AK: Remove StringInternals.h
Since we do not expose memory layout anymore in StringBase, there is no
need to keep StringData public.
2024-01-21 16:16:15 -07:00
Dan Klishch
611adf1591 AK: Make the state of StringBase private
Now it actually only exposes methods to allocate uninitialized storage
and to create substring with a shared superstring. All the details of
the memory layout are fully encapsulated.
2024-01-21 16:16:15 -07:00