Commit graph

3576 commits

Author SHA1 Message Date
Andreas Kling
2b8a920a7c AK: Don't blindly use SipHash as default hash function
Although it has some interesting properties, SipHash is brutally slow
compared to our previous hash function. Since its introduction, it has
been highly visible in every profile of doing anything interesting with
LibJS or LibWeb.

By switching back, we gain a 10x speedup for 32-bit hashes, and "only"
a 3x speedup for 64-bit hashes.

This comes out to roughly 1.10x faster HashTable insertion, and roughly
2.25x faster HashTable lookup. Hashing is no longer at the top of
profiles and everything runs measurably faster.

For security-sensitive hash tables with user-controlled inputs, we can
opt into SipHash selectively on a case-by-case basis. The vast majority
of our uses don't fit that description though.
2024-03-25 12:39:23 +01:00
Timothy Flynn
7e38653492 AK: Reject invalid Base64 encoded string lengths 2024-03-25 08:13:27 +01:00
Timothy Flynn
4ecf4c7617 AK: Compute the exact size of decoded Base64 strings 2024-03-25 08:13:27 +01:00
Timothy Flynn
754ff41b9c AK: Remove whitespace skipping feature from AK's Base64 decoder
This was added in commit f2663f477f as a
partial implementation of what is now LibWeb's forgiving Base64 decoder.
All use cases within LibWeb that require whitespace skipping now use
that implementation instead.

Removing this feature from AK allows us to know the exact output size of
a decoded Base64 string. We can still trim whitespace at the start and
end of the input though; for example, this is useful when reading from a
file that may have a newline at the end of the file.
2024-03-25 08:13:27 +01:00
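To illustrate why dropping interior whitespace skipping makes the output size predictable, here is a minimal sketch in standard C++. It is not the AK implementation: `decoded_base64_size` is a hypothetical helper, and it assumes padded input that has already been trimmed.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper, not the AK API. Assumes padded, already-trimmed input.
std::optional<std::size_t> decoded_base64_size(std::string_view encoded)
{
    // Padded Base64 always comes in groups of four characters.
    if (encoded.size() % 4 != 0)
        return std::nullopt;
    if (encoded.empty())
        return std::size_t { 0 };

    // Each trailing '=' stands for one byte missing from the final group.
    std::size_t padding = 0;
    if (encoded[encoded.size() - 1] == '=') {
        ++padding;
        if (encoded[encoded.size() - 2] == '=')
            ++padding;
    }

    // Every group of four characters decodes to three bytes, minus padding.
    return encoded.size() / 4 * 3 - padding;
}
```

With the size known up front, a decoder can allocate its output buffer once instead of growing it while decoding.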
Timothy Flynn
690db10463 AK: Convert Base64 template parameters to regular function parameters
The generated function name is otherwise very long, which makes stack
traces a bit more difficult to sift through.
2024-03-25 08:13:27 +01:00
Timothy Flynn
f292746134 AK: Convert some west-consts to east-const in Base64.cpp
Caught by clang-format-17. Note that clang-format-16 is fine with this
as well (it leaves the const placement alone), it just doesn't perform
the formatting to east-const itself.
2024-03-25 08:13:27 +01:00
Andreas Kling
3bdfca1119 AK: Make FlyString::from_utf8*() avoid allocation if possible
If we already have a FlyString instantiated for the given string,
look that up and return it instead of making a temporary String just to
use as a key into the FlyString table.
2024-03-24 13:28:24 +01:00
Andreas Kling
8d7a1e5654 LibWeb: Skip some redundant UTF-8 validation in CSS tokenizer
If we're just adding code points to a StringBuilder, there's no need to
revalidate the result.
2024-03-24 13:28:24 +01:00
Andreas Kling
a88799c032 AK: Remove excessive hashing caused by FlyString table
Before this change, the global FlyString table looked like this:

    HashMap<StringView, Detail::StringBase>

After this change, we have:

    HashTable<Detail::StringData const*, FlyStringTableHashTraits>

The custom hash traits are used to extract the stored hash from
StringData which avoids having to rehash the StringView repeatedly like
we did before.

This necessitated a handful of smaller changes to make it work.
2024-03-24 13:28:24 +01:00
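The idea of hash traits that reuse a stored hash can be sketched with the standard library. This is an illustration only: the `StringData` struct below is a simplified stand-in rather than AK's type, and `StoredHashTraits`, `SameBytes`, and `make_string_data` are hypothetical names.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <utility>

// Simplified stand-in for a string payload that caches its own hash.
struct StringData {
    std::string bytes;
    std::size_t hash;
};

// Computes the hash exactly once, when the payload is created.
StringData make_string_data(std::string bytes)
{
    std::size_t hash = std::hash<std::string> {}(bytes);
    return StringData { std::move(bytes), hash };
}

// Hash functor that returns the cached value instead of rehashing the bytes.
struct StoredHashTraits {
    std::size_t operator()(StringData const* data) const { return data->hash; }
};

// Equality still compares contents, so equal strings map to one entry.
struct SameBytes {
    bool operator()(StringData const* a, StringData const* b) const { return a->bytes == b->bytes; }
};

using StringTable = std::unordered_set<StringData const*, StoredHashTraits, SameBytes>;
```

Lookups and rehashes on such a table read one cached integer per key rather than walking the string bytes again.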
Andreas Kling
8bfad24708 AK: Move AK::Detail::StringData to its own header file
This will allow us to access it from FlyString.cpp
2024-03-24 13:28:24 +01:00
Dan Klishch
45a0ba2167 AK: Introduce AK::enumerate
Co-Authored-By: Tim Flynn <trflynn89@pm.me>
2024-03-23 09:02:58 -04:00
Stanisław Wiśniewski
994fe0b89f AK: Use else if constexpr in explode_byte() 2024-03-21 14:35:20 -06:00
Timothy Flynn
81ad6de41b AK: Avoid creating an intermediate buffer when decoding a Base64 string
There's no need to copy the result. We can also avoid increasing the
size of the output buffer by 1 for each written byte.

This reduces the runtime of `./bin/base64 -d enwik8.base64 >/dev/null`
from 0.917s to 0.632s.

(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
2024-03-21 15:53:46 +01:00
Timothy Flynn
0fd7ad09a0 AK: Avoid StringBuilder when creating a Base64-encoded string
We don't really need the features provided by StringBuilder here, since
we know the exact size of the output. Avoiding StringBuilder avoids the
recurring capacity/size checks both within StringBuilder itself and its
internal ByteBuffer.

This reduces the runtime of `./bin/base64 enwik8 >/dev/null` from
0.976s to 0.428s.

(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
2024-03-21 15:53:46 +01:00
Timothy Flynn
5f5b8ee9bb AK: Do not perform UTF-8 validation on Base64-encoded strings
We know we are only appending ASCII characters to the StringBuilder, so
do not bother validating the result.

This reduces the runtime of `./bin/base64 enwik8 >/dev/null` from
1.192s to 0.976s.

(enwik8 is a 100MB test file from http://mattmahoney.net/dc/enwik8.zip)
2024-03-21 15:53:46 +01:00
Andrew Kaster
e9b16970fe AK: Add base64url encoding and decoding methods
This encoding scheme comes from section 5 of RFC 4648, as an
alternative to the standard base64 encode/decode methods.

The only difference is that the last two characters are replaced
with '-' and '_', as '+' and '/' are not safe in URLs or filenames.
2024-03-20 12:18:57 -04:00
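A minimal sketch of the alphabet difference described in this commit, written in standard C++ rather than against AK. The names `standard_alphabet`, `url_alphabet`, and `encode_with_alphabet` are hypothetical, and the sketch always emits '=' padding.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// The two alphabets from RFC 4648 differ only in their last two characters.
constexpr std::string_view standard_alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
constexpr std::string_view url_alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Hypothetical helper, not the AK API. Encodes `input` using `alphabet`.
std::string encode_with_alphabet(std::string_view input, std::string_view alphabet)
{
    auto byte_at = [&](std::size_t index) {
        return static_cast<std::uint32_t>(static_cast<unsigned char>(input[index]));
    };

    std::string output;
    output.reserve((input.size() + 2) / 3 * 4);

    for (std::size_t i = 0; i < input.size(); i += 3) {
        std::size_t remaining = input.size() - i;

        // Pack up to three bytes into a 24-bit chunk.
        std::uint32_t chunk = byte_at(i) << 16;
        if (remaining > 1)
            chunk |= byte_at(i + 1) << 8;
        if (remaining > 2)
            chunk |= byte_at(i + 2);

        // Emit four 6-bit indices, padding the tail with '='.
        output += alphabet[(chunk >> 18) & 0x3f];
        output += alphabet[(chunk >> 12) & 0x3f];
        output += remaining > 1 ? alphabet[(chunk >> 6) & 0x3f] : '=';
        output += remaining > 2 ? alphabet[chunk & 0x3f] : '=';
    }
    return output;
}
```

For the two bytes 0xFB 0xFF, the standard alphabet yields `+/8=` while the URL-safe alphabet yields `-_8=`.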
Shannon Booth
e800605ad3 AK+LibURL: Move AK::URL into a new URL library
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.

This change has two main benefits:
 * Moving AK back more towards being an agnostic library that can
   be used between the kernel and userspace. URL has never really fit
   that description - and is not used in the kernel.
 * URL _should_ depend on LibUnicode, as it needs punycode support.
   However, it's not really possible to do this inside of AK as it can't
   depend on any external library. This change brings us a little closer
   to being able to do that, but unfortunately we aren't there quite
   yet, as the code generators depend on LibCore.
2024-03-18 14:06:28 -04:00
Andreas Kling
6724f840cd AK: Early return from empty hash table lookups to avoid hashing
When calling get() or find() on an empty HashTable or HashMap, we can
avoid hashing the sought-after key.
2024-03-16 14:27:59 +01:00
Timothy Flynn
e4213f5767 AK: Generalize Span::contains_slow to use the Traits infrastructure
This allows, for example, checking if a Span<String> contains a value
without having to allocate a String.
2024-03-16 08:42:33 +01:00
Timothy Flynn
faf4ba63c2 AK: Don't use east-constexpr in Span methods 2024-03-16 08:42:33 +01:00
Ali Mohammad Pur
d451f84f31 LibCrypto: Add a minimal DER encoder
Progress towards #23562.
2024-03-16 01:17:02 -06:00
Andreas Kling
d125a76f85 AK: Make FlyString-to-FlyString comparison inline & trivial
This should never boil down to more than a machine word comparison.
2024-03-14 12:42:08 +01:00
Ali Mohammad Pur
8003bde03d AK+LibRegex+LibWasm: Remove the non-const COWVector::operator[]
This was copying the vector behind our backs, let's remove it and make
the copying explicit by putting it behind COWVector::mutable_at().
This is a further 64% performance improvement on Wasm validation.
2024-03-12 17:10:47 +01:00
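The difference between an implicit copy on `operator[]` and an explicit `mutable_at()` can be sketched as follows. This is a simplified copy-on-write vector built on `std::shared_ptr`, not AK's COWVector; `shares_storage_with` is a hypothetical helper added only to make the sharing observable.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Simplified copy-on-write vector. Copies share storage until one of them mutates.
template<typename T>
class COWVector {
public:
    void append(T value)
    {
        detach();
        m_data->push_back(std::move(value));
    }

    // Read access never copies.
    T const& at(std::size_t index) const { return (*m_data)[index]; }

    // Write access is spelled out, so the potential copy is visible at the call site.
    T& mutable_at(std::size_t index)
    {
        detach();
        return (*m_data)[index];
    }

    bool shares_storage_with(COWVector const& other) const { return m_data == other.m_data; }

private:
    // Clone the storage only if someone else still references it.
    void detach()
    {
        if (m_data.use_count() > 1)
            m_data = std::make_shared<std::vector<T>>(*m_data);
    }

    std::shared_ptr<std::vector<T>> m_data = std::make_shared<std::vector<T>>();
};
```

If a non-const `operator[]` called `detach()` as well, every indexed read through a non-const reference would risk a full copy, which is the kind of hidden cost the commit describes.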
Ali Mohammad Pur
cefe177a56 AK+LibRegex: Move COWVector to AK
This is about to gain a new user, so move it to AK.
2024-03-12 17:10:47 +01:00
Timothy Flynn
e3b5e24ce0 AK: Iterate the bytes of a URL query with an unsigned type
Otherwise, we percent-encode negative signed chars incorrectly. For
example, https://www.strava.com/login contains the following hidden
<input> field:

    <input name="utf8" type="hidden" value="✓" />

On submitting the form, we would percent-encode that field as:

    utf8=%-1E%-64%-6D

Which would cause us to receive an HTTP 500 response. We now properly
percent-encode that field as:

    utf8=%E2%9C%93

And can login to Strava :^)
2024-03-10 15:17:31 +01:00
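The signed-char pitfall is easy to reproduce outside AK. On platforms where `char` is signed, the first byte of "✓" (0xE2) reads as -30, which is -0x1E and matches the broken `%-1E` output above. The sketch below uses a hypothetical `percent_encode` helper that, for simplicity, encodes every byte that is not an ASCII letter or digit; the real URL encode sets are more selective.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper, not the AK API. Percent-encodes every byte that is not
// an ASCII letter or digit.
std::string percent_encode(std::string_view input)
{
    static constexpr char hex_digits[] = "0123456789ABCDEF";

    std::string output;
    for (char c : input) {
        // Convert to an unsigned byte first, so values >= 0x80 are never negative.
        auto byte = static_cast<unsigned char>(c);

        bool is_alphanumeric = (byte >= 'A' && byte <= 'Z')
            || (byte >= 'a' && byte <= 'z')
            || (byte >= '0' && byte <= '9');

        if (is_alphanumeric) {
            output += c;
        } else {
            output += '%';
            output += hex_digits[byte >> 4];
            output += hex_digits[byte & 0x0f];
        }
    }
    return output;
}
```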
Nico Weber
58838db445 LibGfx: Add the start of a JBIG2 loader
JBIG2 is infamous for two things:

1. It's used in Xerox scanners, where it falsifies scanned numbers:

https://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_are_switching_written_numbers_when_scanning

2. It was allegedly used in an iOS zero day, in a very cool way:

https://googleprojectzero.blogspot.com/2021/12/a-deep-dive-into-nso-zero-click.html

Needless to say, we need support for it in Serenity. (...because it's
used in PDF files.)

This adds all the scaffolding, but no actual implementation yet.

It's enough for `file` to print the mime type of .jb2 files, but `image`
can't do anything with the files yet.
2024-03-09 16:01:22 +01:00
Timothy Flynn
82ea53cf10 AK: Add a StringView method to count the number of lines in a string
We already have a helper to split a StringView by line while considering
"\n", "\r", and "\r\n". Add an analogous method to just count the number
of lines in the same manner.
2024-03-08 14:43:33 -05:00
Timothy Flynn
07a27b2ec0 AK: Replace the boolean parameter of StringView::lines with a named enum 2024-03-08 14:43:33 -05:00
Matthew Olsson
a511f1ef85 AK: Add HashMap::ensure_capacity 2024-03-06 07:45:56 +01:00
Filiph Siitam Sandström
fd694e8672 AK+Lagom: Make it possible to build for iOS
This commit makes it possible to build AK and most of Lagom for iOS,
based on the work for the Ladybird build demoed on discord:
https://discord.com/channels/830522505605283862/830525031720943627/1211987732646068314
2024-03-03 13:13:42 -07:00
Hendiadyoin1
79fd8eb28d AK/HashMap: Use structured bindings when iterating over itself 2024-03-01 14:05:53 -07:00
Nico Weber
f8b8d1b3be AK: Add is_ascii_uppercase_hex_digit() 2024-03-01 14:17:42 +01:00
Timothy Flynn
d878975f95 AK+LibJS: Remove OFFSET_OF and its users
With the LibJS JIT removed, let's not expose pointers to internal
members.
2024-02-29 09:00:00 +01:00
Andrew Kaster
21ac431fac AK: Allow reading from EOF buffered streams better in read_line()
If the BufferedStream is able to fill its entire circular buffer in
populate_read_buffer() and is later asked to read a line or read until
a delimiter, it could erroneously return EMSGSIZE if the caller's buffer
was smaller than the internal buffer. In this case, all we really care
about is whether the caller's buffer is big enough for however much data
we're going to copy into it. Which needs to take into account the
candidate.
2024-02-26 13:16:27 -07:00
Dan Klishch
ba24e86fdd AK: Introduce IntrusiveBinaryHeap and reimplement BinaryHeap using it
The main difference between them is that IntrusiveBinaryHeap can
optionally maintain an index inside every stored node that allows
arbitrary nodes to be deleted.
2024-02-25 17:24:36 -07:00
Hendiadyoin1
38cb5444d9 AK: Make StringView::for_each_split_view() aware of IterationDecision 2024-02-24 16:43:44 -07:00
Dan Klishch
8ac0e3f0e5 AK+LibJS: Remove null state from DeprecatedFlyString :^) 2024-02-24 15:06:52 -07:00
Dan Klishch
061f902f95 AK+Userland: Introduce ByteString::create_and_overwrite
And replace two users of raw StringImpl with it.
2024-02-24 15:06:52 -07:00
Ali Mohammad Pur
bc301b6f40 AK+LibXML+JSSpecCompiler: Move LineTrackingLexer to AK
This is a simple extension of GenericLexer, and is used in more than
just LibXML, so let's move it into AK.
The move also resolves a FIXME, which is removed in this commit.
2024-02-16 15:26:43 +01:00
Lucas CHOLLET
cbfea68ed8 AK: Add BigEndianInputBitStream::bits_until_next_byte_boundary() 2024-02-12 14:08:56 +01:00
Nico Weber
d84b69ace9 AK: Add to_array()
This is useful if you want an array with an explicit type but still
want its size to be inferred.
2024-02-11 18:53:00 +01:00
Nico Weber
10216e1743 AK: Remove a stray static
No behavior change.
2024-02-11 18:53:00 +01:00
Nico Weber
4409b33145 AK: Make IndexSequence use size_t
This makes it possible to use MakeIndexSequence in functions like:

    template<typename T, size_t N>
    constexpr auto foo(T (&a)[N])

This means AK/StdLibExtraDetails.h must now include AK/Types.h
for size_t, which means AK/Types.h can no longer include
AK/StdLibExtras.h (which arguably it shouldn't do anyways),
which requires rejiggering some things.

(IMHO Types.h shouldn't use AK::Details metaprogramming at all.
FlatPtr doesn't necessarily have to use Conditional<> and ssize_t could
maybe be in its own header or something. But since it's tangential to
this PR, going with the tried and true "lift things that cause the
cycle up to the top" approach.)
2024-02-11 18:53:00 +01:00
Tim Ledbetter
4a7236cabf Everywhere: Prefer _string when constructing strings from literals 2024-02-08 11:01:10 -05:00
Dan Klishch
88af15d513 AK: Store JsonValue's value in AK::Variant 2024-02-08 08:04:05 -07:00
Andrew Kaster
bc9c710904 LibWeb: Hide WebDriver::match_route debug behind its own flag
When enabling WEBDRIVER_DEBUG globally, this function's debug spam
overpowers the rest of the useful logs.
2024-02-08 15:53:46 +01:00
Dan Klishch
677bcea771 ntpquery: Use AK::convert_between_host_and_network_endian
Instead of polluting global namespace with definitions from
libkern/OSByteOrder.h and machine/endian.h on MacOS, just use AK
functions for conversions.
2024-02-06 04:37:47 -07:00
vincent-rg
a9df60ff1c AK: Update OptionParser::m_arg_index by subtracting skipped args
On argument swapping to put positional ones toward the end,
m_arg_index was pointing at "last arg index" + "skipped args" +
"consumed args" and thus was pointing ahead of the skipped ones.

m_arg_index now points after the current parsed option arguments.
2024-02-06 00:08:30 +01:00
Dan Klishch
3e43d15440 Everywhere: Prefer VERIFY over assert() 2024-02-05 07:03:53 -05:00
Nico Weber
41f57a5477 AK: Remove the SIMD version of rsqrt() too, for good measure
No strong reason to remove this one, other than that it's also unused.
2024-01-30 10:02:33 +01:00
Nico Weber
a1f70b39fa AK: Remove rsqrt()
At least on arm64, this isn't very precise:
https://github.com/SerenityOS/serenity/issues/22739#issuecomment-1912909835

It is also now unused.
2024-01-30 10:02:33 +01:00
Shannon Booth
c6319d68c3 AK: Introduce EquivalentFunctionType
This allows you to get the type from a function from some given
callable 'T'.

Co-Authored-By: Ali Mohammad Pur <mpfard@serenityos.org>
2024-01-27 21:40:25 -05:00
Ali Mohammad Pur
0e61d039c9 AK: Use IsSame<FlatPtr, T> instead of __LP64__ to guess FlatPtr's type
Instead of playing the guessing game, simply use whatever type FlatPtr
itself resolves to.
2024-01-28 04:30:33 +03:30
Sam Atkins
388856dc7e AK+Userland: Return String from human_readable_size() functions 2024-01-25 09:07:32 +01:00
Sam Atkins
7e8cfb60eb AK+Userland: Return String from human_readable_[digital_]time() 2024-01-25 09:07:32 +01:00
Dan Klishch
870a947040 AK: Remove StringInternals.h
Since we do not expose memory layout anymore in StringBase, there is no
need to keep StringData public.
2024-01-21 16:16:15 -07:00
Dan Klishch
611adf1591 AK: Make the state of StringBase private
Now it actually only exposes methods to allocate uninitialized storage
and to create a substring with a shared superstring. All the details of
the memory layout are fully encapsulated.
2024-01-21 16:16:15 -07:00
Dan Klishch
fa52f68142 AK: Store data in FlyString as StringBase
Unfortunately, it is not clear to me how to split this commit into
several atomic ones.
2024-01-21 16:16:15 -07:00
Dan Klishch
e7700e16ee AK: Forward substring creation with shared superstring to StringBase 2024-01-21 16:16:15 -07:00
Dan Klishch
5d6cd65e29 AK: Simplify String::repeated by leveraging StringBase helpers 2024-01-21 16:16:15 -07:00
Dan Klishch
7dbe357e9f AK: Simplify String::from_stream by leveraging StringBase helpers 2024-01-21 16:16:15 -07:00
Dan Klishch
7506736869 AK: Stop using ShortString in String::from_code_point
Refactor it to use StringBase::replace_with_new_short_string instead.
2024-01-21 16:16:15 -07:00
Dan Klishch
dcd1fda9c8 AK: Introduce StringBase::replace_with_new_{short_,}string 2024-01-21 16:16:15 -07:00
Dan Klishch
d6290c4684 AK: Move String::hash() and String::String() to StringBase 2024-01-21 16:16:15 -07:00
Dan Klishch
1b09a1851e AK: Move String::~String() and String::destroy_string() to StringBase 2024-01-21 16:16:15 -07:00
Dan Klishch
54d149bc25 AK: Move String::bytes() and String::operator==(String) to StringBase
The idea is to eventually get rid of protected state in StringBase. To
do this, we first need to remove all references to m_data and
m_short_string from String.
2024-01-21 16:16:15 -07:00
Dan Klishch
4364a28d3d AK: Move data fields from AK::String to a newly created AK::StringBase
This starts separating memory management of string data and string
utilities like `String::formatted`. This would also allow reusing the
same storage in `DeprecatedString` in the future.
2024-01-21 16:16:15 -07:00
Dan Klishch
6e2f627cb3 AK: Move StringData from String.cpp to a newly created StringInternals.h
This is done to allow using it in files other than AK/String.cpp.
2024-01-21 16:16:15 -07:00
Dan Klishch
855ea192be AK: Add AK_MAKE_DEFAULT_COPYABLE 2024-01-21 16:16:15 -07:00
Dan Klishch
7f8d69ee2f AK: Remove explicit String::operator!= in favor of defaulted one 2024-01-21 16:16:15 -07:00
Dan Klishch
b5f1a48a7c AK+Everywhere: Remove JsonValue APIs with implicit default values 2024-01-21 15:47:53 -07:00
Dan Klishch
c49819cced AK+GMLCompiler+LibWeb: Remove JsonValue::is_double
This concludes a series of patches which remove the ability to observe
which arithmetic type is used to store a number in JsonValue.
2024-01-21 15:47:53 -07:00
Dan Klishch
faef802229 AK+GMLCompiler: Remove JsonValue::as_double()
Replace its single (non-test) usage with newly created as_number(),
which does not leak information about internal integer storage type.
2024-01-21 15:47:53 -07:00
Dan Klishch
5230d2af91 AK+WebContent: Remove JsonValue::as_{i,u}{32,64}() 2024-01-21 15:47:53 -07:00
Ali Mohammad Pur
4f6c9f410c AK+LibCore: Add BufferedSocket::can_read_up_to_delimiter()
This method (unlike can_read_line) ensures that the delimiter is present
in the buffer, and doesn't return true after eof when the delimiter is
absent.
2024-01-21 21:13:58 +01:00
Ali Mohammad Pur
4d1d88aa16 AK: Make the :hex-dump format specifier print all characters
Previously the final line would be skipped if it was not a multiple of
|width|, this makes the character view show up for that line.
2024-01-21 21:13:58 +01:00
Tim Ledbetter
65827826fe AK: Add CharacterTypes::is_ascii_base36_digit()
This can be used to validate the string passed to
`parse_ascii_base36_digit()`.
2024-01-13 19:01:35 -07:00
Dan Klishch
ccd701809f Everywhere: Add deprecated_ prefix to JsonValue::to_byte_string
`JsonValue::to_byte_string` has peculiar type-erasure semantics which are
not usually intended. Unfortunately, it also has a very stereotypical
name which does not warn about unexpected behavior. So let's prefix it
with `deprecated_` to make new code use `as_string` if it just wants to
get string value or `serialized<StringBuilder>` if it needs to do proper
serialization.
2024-01-12 17:41:34 -07:00
kleines Filmröllchen
eada4f2ee8 AK: Remove ByteString from GenericLexer
A bunch of users used consume_specific with a constant ByteString
literal, which can be replaced by an allocation-free StringView literal.

The generic consume_while overload gains a requires clause so that
consume_specific("abc") causes a more understandable and actionable
error.
2024-01-12 17:03:53 -07:00
Martin Janiczek
5a8781393a AK: Cover TestComplex with more tests
Related:
- video detailing the process of writing these tests: https://www.youtube.com/watch?v=enxglLlALvI
- PR fixing bugs the above effort found: https://github.com/SerenityOS/serenity/pull/22025
2024-01-12 16:42:51 -07:00
Martin Janiczek
d52ffcd830 LibTest: Add more numeric generators
Rename unsigned_int generator to number_u32.
Add generators:
- number_u64
- number_f64
- percentage
2024-01-12 16:42:51 -07:00
Andrew Kaster
09ce32039f AK: Use cast to const void pointer in to_readonly_span helper
This lets developers actually hex-dump print `Span<T const>` using the
helper as intended.
2024-01-06 10:13:14 +01:00
Timothy Flynn
cae184d7cf AK: Improve performance of StringUtils::find_last
The current algorithm is O(N^2) because we forward-search an
ever-increasing substring of the haystack. This implementation reduces
the search time of a 500,000-length string (where the desired needle is
at index 0) from 72 seconds to 2-3 milliseconds.
2024-01-04 11:28:03 -05:00
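One way to avoid repeated forward searches is to walk candidate positions from the end of the haystack and stop at the first match. The sketch below shows that idea in standard C++. It is an assumption about the general approach, not a copy of the new StringUtils code, and this `find_last` is a hypothetical free function.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper, not the AK API. Returns the index of the last
// occurrence of `needle` in `haystack`, scanning from the end.
std::optional<std::size_t> find_last(std::string_view haystack, std::string_view needle)
{
    if (needle.size() > haystack.size())
        return std::nullopt;

    // `i` is one past the candidate start index, so the loop can stop at zero
    // without underflowing the unsigned counter.
    for (std::size_t i = haystack.size() - needle.size() + 1; i > 0; --i) {
        if (haystack.substr(i - 1, needle.size()) == needle)
            return i - 1;
    }
    return std::nullopt;
}
```

Each candidate position is examined at most once, so a needle at index 0 costs a single backward pass over the haystack.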
Timothy Flynn
9cab4958e6 AK: Convert a couple String-related declarations to east-const
Caught by clang-format-17. Note that clang-format-16 is fine with this
as well (it leaves the const placement alone), it just doesn't perform
the formatting to east-const itself.
2024-01-04 11:28:03 -05:00
Timothy Flynn
1b4a23095c AK: Add a Utf16View::starts_with method
Based heavily on Utf8View::starts_with.
2024-01-04 12:43:10 +01:00
Timothy Flynn
c46ba7e68d AK: Allow constructing a UTF-16 view from a UTF-16 string literal
UTF-16 string literals are a language-level feature. It is convenient to
be able to construct a Utf16View from these strings.
2024-01-04 12:43:10 +01:00
Aliaksandr Kalenik
e394971209 AK+LibWeb: Use segmented vector to store commands in RecordingPainter
Using a vector to represent a list of painting commands results in many
reallocations, especially on pages with a lot of content.

This change addresses it by introducing a SegmentedVector, which allows
fast appending by representing a list as a sequence of fixed-size
vectors. Currently, this new data structure supports only the
operations used in RecordingPainter, which are appending and iterating.
2023-12-30 23:02:46 +01:00
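A segmented vector along the lines described here can be sketched in a few lines. This is a simplified illustration with hypothetical member names and an arbitrary default segment size, not the AK class. Like the description above, it supports only appending and iterating, plus an `at()` accessor for the example.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Simplified segmented vector: a list of fixed-capacity segments.
// Appending never moves existing elements, because full segments are left alone.
template<typename T, std::size_t segment_size = 512>
class SegmentedVector {
public:
    void append(T value)
    {
        if (m_segments.empty() || m_segments.back()->size() == segment_size) {
            auto segment = std::make_unique<std::vector<T>>();
            segment->reserve(segment_size);
            m_segments.push_back(std::move(segment));
        }
        m_segments.back()->push_back(std::move(value));
        ++m_size;
    }

    std::size_t size() const { return m_size; }
    std::size_t segment_count() const { return m_segments.size(); }

    T const& at(std::size_t index) const
    {
        return (*m_segments[index / segment_size])[index % segment_size];
    }

    // Visits all elements in insertion order.
    template<typename Callback>
    void for_each(Callback callback) const
    {
        for (auto const& segment : m_segments) {
            for (auto const& value : *segment)
                callback(value);
        }
    }

private:
    std::vector<std::unique_ptr<std::vector<T>>> m_segments;
    std::size_t m_size = 0;
};
```

Only the small outer list of segment pointers is ever reallocated; the stored elements stay where they were first written.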
Andreas Kling
7ad7ae7000 AK: Check URL parser input for invalid (tabs or spaces) in 1 pass
Combine 2 passes into 1 by iterating over the input once and checking
for both '\t' and '\n'.
2023-12-30 13:49:50 +01:00
Andreas Kling
a19d8a4a37 AK: Add ASCII fast path to Utf8CodePointIterator
Much of the UTF-8 data that we'll iterate over will be ASCII only,
and we can get a significant speed-up by simply having a fast path
when the iterator points at a byte that is obviously an ASCII character
(<= 0x7F).
2023-12-30 13:49:50 +01:00
Andreas Kling
75cecd19a5 AK: Skip UTF-8 validation inside URL parser
Since we're already building up a percent-encoded ASCII-only string
in the internal parser buffer, there's no need to do a second UTF-8
validation pass before assigning each part of the parsed URL.

This makes URL parsing significantly faster.
2023-12-30 13:49:50 +01:00
Andreas Kling
f045a877b4 AK: Implement StringBuilder::append_code_point() more efficiently
Instead of doing a wrappy MUST(try_append_code_point()), we now inline
the UTF-8 encoding logic. This allows us to grow the buffer by the
right increment up front, and also removes a bunch of ErrorOr ceremony
that we don't care about.
2023-12-30 13:49:50 +01:00
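Inlined UTF-8 encoding of a single code point looks roughly like the sketch below. It targets `std::string` rather than StringBuilder, uses a hypothetical free function named `append_code_point`, and assumes the caller passes a valid Unicode scalar value (no surrogate or range checks).

```cpp
#include <string>

// Hypothetical helper, not the AK API. Appends the UTF-8 encoding of
// `code_point` to `output`. Assumes a valid Unicode scalar value.
void append_code_point(std::string& output, char32_t code_point)
{
    if (code_point <= 0x7F) {
        // One byte: plain ASCII.
        output += static_cast<char>(code_point);
    } else if (code_point <= 0x7FF) {
        // Two bytes: 110xxxxx 10xxxxxx
        output += static_cast<char>(0xC0 | (code_point >> 6));
        output += static_cast<char>(0x80 | (code_point & 0x3F));
    } else if (code_point <= 0xFFFF) {
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        output += static_cast<char>(0xE0 | (code_point >> 12));
        output += static_cast<char>(0x80 | ((code_point >> 6) & 0x3F));
        output += static_cast<char>(0x80 | (code_point & 0x3F));
    } else {
        // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        output += static_cast<char>(0xF0 | (code_point >> 18));
        output += static_cast<char>(0x80 | ((code_point >> 12) & 0x3F));
        output += static_cast<char>(0x80 | ((code_point >> 6) & 0x3F));
        output += static_cast<char>(0x80 | (code_point & 0x3F));
    }
}
```

Because the branch taken determines the encoded length, an implementation with direct access to its buffer can reserve exactly that many bytes before writing them.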
Andreas Kling
bacbc376a0 AK: Make StringView::contains(StringView) faster for 1-byte needles
If we're looking for a 1-byte string, we can do the much simpler byte
scan by simply forwarding the call to StringView::contains(char).
2023-12-30 13:49:50 +01:00
Andreas Kling
6c51ba27a2 AK: Make URL percent encoding faster by exploiting ASCII knowledge
Once we know that a code point must be a valid ASCII character,
we now cast it to `char` and avoid the expensive generic
StringView::contains(u32 code_point) checks.

This dramatically speeds up URL parsing.
2023-12-30 13:49:50 +01:00
Andreas Kling
3c039903fb LibTextCodec+AK: Don't validate UTF-8 strings twice
UTF8Decoder was already converting invalid data into replacement
characters while converting, so we know for sure we have valid UTF-8
by the time conversion is finished.

This patch adds a new StringBuilder::to_string_without_validation()
and uses it to make UTF8Decoder avoid half the work it was doing.
2023-12-30 13:49:50 +01:00
Andreas Kling
a285e36041 LibJS+AK: Make String.prototype.repeat() way faster
Instead of using a StringBuilder, add a String::repeated(String, N)
overload that takes advantage of knowing it's already all UTF-8.

This makes the following microbenchmark go 4x faster:

    "foo".repeat(100_000_000)

And for single character strings, we can even go 10x faster:

    "x".repeat(100_000_000)
2023-12-30 13:49:50 +01:00
Andrew Kaster
053e4d5e64 AK: Only try to print gettid() in dbgln on Linux and Serenity
On every other Unix, the relationship between thread id and process id
is not nearly as direct.
2023-12-29 09:46:50 +01:00
Timothy Flynn
507a5d8a07 AK: Add an option to zero-fill ByteBuffer data upon growth
This is to avoid UB in cases where we need to be able to read from the
buffer immediately after resizing it.
2023-12-27 19:30:39 +01:00
Shannon Booth
d51f84501a AK: Remove now unused to_{int,uint,float,double} String functions 2023-12-23 20:41:07 +01:00
Shannon Booth
e2e7c4d574 Everywhere: Use to_number<T> instead of to_{int,uint,float,double}
In a bunch of cases, this actually ends up simplifying the code as
to_number will handle something such as:

```
Optional<I> opt;
if constexpr (IsSigned<I>)
    opt = view.to_int<I>();
else
    opt = view.to_uint<I>();
```

For us.

The main goal here however is to have a single generic number conversion
API between all of the String classes.
2023-12-23 20:41:07 +01:00
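The shape of a single generic conversion function can be sketched with `std::from_chars` from the standard library. This is only an analogy for the API described above: the free function `to_number` below is hypothetical, handles integral types only, and rejects surrounding whitespace, which may differ from AK's behavior.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>
#include <type_traits>

// Hypothetical helper, not the AK API. One template covers signed and
// unsigned integers, so callers need no if constexpr (IsSigned) branching.
template<typename T>
std::optional<T> to_number(std::string_view view)
{
    static_assert(std::is_integral_v<T>, "this sketch handles integral types only");

    T value {};
    char const* end = view.data() + view.size();
    auto [parsed_until, error] = std::from_chars(view.data(), end, value);

    // Reject parse errors, out-of-range values, and trailing garbage.
    if (error != std::errc {} || parsed_until != end)
        return std::nullopt;
    return value;
}
```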
Shannon Booth
a4ecc65398 AK: Add DeprecatedFlyString::to_number<T> 2023-12-23 20:41:07 +01:00
Shannon Booth
159eda5c6d AK: Add ByteString::to_number<T>
To mirror the API with StringView and String.
2023-12-23 20:41:07 +01:00
Shannon Booth
cdf84a3e36 AK: Implement StringView::to_number<T> from String::to_number<T>
Do exactly what String does, then use StringView's implementation as
String's new one. This should allow us to call to_number on a
StringView.
2023-12-23 20:41:07 +01:00
Ali Mohammad Pur
64616d3997 AK: Completely disable rich debug formats on Windows
Half the functions used are not readily available on Windows. Instead of
creating more ifdef soup, this commit simply disables the rich debug
stuff on Windows.
2023-12-22 10:59:21 +01:00
Andreas Kling
a264cf79c4 AK: Use fallback builtins for overflow checks in AK::Checked
If we don't have __builtin_add_overflow_p(), we can also try using
__builtin_add_overflow(). This makes debug builds with Clang
significantly faster since they no longer need to use the generic
implementation. Same for multiplication.
2023-12-21 15:31:32 +01:00
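For reference, the fallback builtins named in this commit can be used as in the sketch below. `__builtin_add_overflow` and `__builtin_mul_overflow` are GCC and Clang builtins; the `checked_add` and `checked_mul` wrappers are hypothetical names and do not mirror the AK::Checked interface.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical wrappers around the GCC/Clang overflow builtins.
// Each builtin stores the wrapped result and returns true on overflow.
template<typename T>
std::optional<T> checked_add(T a, T b)
{
    T result {};
    if (__builtin_add_overflow(a, b, &result))
        return std::nullopt;
    return result;
}

template<typename T>
std::optional<T> checked_mul(T a, T b)
{
    T result {};
    if (__builtin_mul_overflow(a, b, &result))
        return std::nullopt;
    return result;
}
```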
Andreas Kling
9f0aa08468 AK: Add ByteString::from_utf8_without_validation()
This will be used by Jakt to create ByteString from string literals
which we can validate at compile time instead of runtime. :^)
2023-12-21 13:49:41 +01:00
Andreas Kling
b27a62488c AK: Add ByteString::must_from_utf8(StringView) for Jakt 2023-12-18 12:41:25 +01:00
Jesús "gsus" Lapastora
7578620f25 AK/StringUtils: Ensure needle positions don't overlap in replace
Previously, `replace` used `find_all` to find all of the positions to
replace. But `find_all` finds all the *overlapping* instances of the
needle, while `replace` assumed that the next position was always at
least `needle.length()` away from the last one. This led to crashes like
https://github.com/SerenityOS/jakt/issues/1159.
2023-12-17 12:00:48 -07:00
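The overlap problem shows up with inputs such as replacing "aa" in "aaa": an overlapping search reports matches at indices 0 and 1, but only the first can be replaced. A minimal non-overlapping replace in standard C++ is sketched below. The `replace_all` function is hypothetical and is not the StringUtils code.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper, not the AK API. Replaces every non-overlapping
// occurrence of `needle`, scanning left to right.
std::string replace_all(std::string_view haystack, std::string_view needle, std::string_view replacement)
{
    if (needle.empty())
        return std::string(haystack);

    std::string output;
    std::size_t position = 0;
    while (true) {
        std::size_t match = haystack.find(needle, position);
        if (match == std::string_view::npos)
            break;

        // Copy the text before the match, then the replacement.
        output.append(haystack.substr(position, match - position));
        output.append(replacement);

        // Resume after the whole needle, so the next match cannot overlap this one.
        position = match + needle.size();
    }
    output.append(haystack.substr(position));
    return output;
}
```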
Ali Mohammad Pur
5e1499d104 Everywhere: Rename {Deprecated => Byte}String
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).

This commit is auto-generated:
  $ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
    Meta Ports Ladybird Tests Kernel)
  $ perl -pie 's/\bDeprecatedString\b/ByteString/g;
    s/deprecated_string/byte_string/g' $xs
  $ clang-format --style=file -i \
    $(git diff --name-only | grep \.cpp\|\.h)
  $ gn format $(git ls-files '*.gn' '*.gni')
2023-12-17 18:25:10 +03:30
Tim Schumacher
e2d4952f0f AK: Add Array::from_repeated_value() 2023-12-14 08:59:23 -07:00
Andrew Kaster
4db5e2ba22 AK: Print timestamp, process name, and pid on all platforms
This requires duplicating some logic from Core::Process::get_name()
into AK, which seems unfortunate. But for now, this greatly improves the
log messages for testing Ladybird on Linux.

The feature is hidden behind a runtime flag with a global setter in the
same way that totally enabling/disabling dbgln is.
2023-12-12 10:11:24 -07:00
Simon Wanner
58f08107b0 AK+LibUnicode: Add Unicode::create_unicode_url
This is a workaround for the fact that AK::URLParser can't call into
LibUnicode directly.
2023-12-10 08:04:58 -05:00
Shannon Booth
73f7f33205 AK: Disallow calling FlyString::from_utf8 on FlyString and String 2023-12-10 09:45:03 +01:00
Shannon Booth
5f2f26451d AK: Disallow String::from_utf8 on FlyString and String 2023-12-10 09:45:03 +01:00
implicitfield
2de582afc4 AK: Make ByteBuffer's trim helper public 2023-12-08 22:05:43 +03:30
Bastiaan van der Plaat
4a7d3115c9 AK: Add String to number floating point support 2023-12-04 19:54:43 +00:00
Tim Schumacher
e9e89d7e4e AK: Optimize BitStream refilling a bit further
This tries to optimize the refill code by making it easier to digest for
the branch predictor. This includes not looping as much across function
calls and marking our EOF case to be unlikely.

Co-Authored-By: Lucas Chollet <lucas.chollet@free.fr>
2023-12-01 12:48:18 +01:00
Tim Schumacher
197331c922 AK: Reject BitStream reads beyond EOF by default
The only exception to this is the lossless WebP decoder, which
legitimately relies on this behavior, even upstream.
2023-12-01 12:48:18 +01:00
Tim Schumacher
cb03d3d78f AK: Allow rejecting BitStream reads beyond EOF 2023-12-01 12:48:18 +01:00
Tim Schumacher
de49413bdf AK: Don't slice off the first byte of a BitStream read 2023-12-01 12:48:18 +01:00
Lucas CHOLLET
aaf54f8cf8 AK: Allow Optional<T&> to be constructed by OptionalNone()
This is an extension of cc0b970d but for the reference-handling
specialization of Optional.

This basically allows us to write code like:
```cpp
Optional<u8&> opt;
opt = OptionalNone{};
```
2023-11-29 02:19:41 +03:30
Shannon Booth
6b32a1f18f AK+LibUnicode: Expose TrailingCodePointTransformation in to_titlecase
Relocating the definition of this enum from LibUnicode to AK.
2023-11-28 17:15:27 -05:00
Timothy Flynn
6aa334767f AK: Ensure assigned-to Strings are dereferenced if needed
If we assign to an existing non-short string, we must dereference its
StringData object to prevent leaking that data.
2023-11-28 16:38:18 +01:00
Michiel Visser
51fe8f820f AK: Fix compile error when using div_mod_internal<513, 256, true> 2023-11-27 09:43:07 +03:30
Dan Klishch
80d1c93edf AK+Applications: Return value from JsonObject::get_double more often
Previously, we were returning an empty optional if key contained a
numerical value which was not stored as double. Stop doing that and
rename the method to signify the change in the behavior.

Apparently, this fixes a bug in an InspectorWidget in Ladybird on
Serenity: it showed 0 for element's boxes with integer sizes.
2023-11-25 11:02:17 +01:00
Andreas Kling
a6106ca221 AK: Use __builtin_offsetof() + -Wno-invalid-offsetof to silence ASAN
ASAN was crying way too much when running the LibJS JIT since the old
OFFSET_OF implementation was too wild for its liking.

By turning off the invalid-offsetof warnings, we can use the offsetof
builtin instead. However, I'm leaving this as a wrapper macro, since
we may still want to do something different for other compilers.
2023-11-24 12:49:15 +01:00
timmot
da3cfd5bbc AK+LibWeb: Make clamp_to_int generic over all integrals 2023-11-24 08:42:18 +01:00
Andrew Kaster
bbdf766fb0 AK: Add helpers to convert arbitrary Spans to {Readonly}Bytes
The streams and other common APIs require byte spans to operate on
arbitrary data. This is less than helpful when wanting to serialize
spans of other data types, such as from an Array or Vector of u32s.
2023-11-24 08:41:38 +01:00
Martin Janiczek
58d0577a02 AK: Fix bugs in Complex += -= + - * / operators
There were two issues:

1) the C+=R and C-=R operators expected arithmetic types to have .real()

2) the R+C, R-C, R*C and R/C operators applied the operation in the wrong
   order (did C+R, C-R, C*R and C/R instead). This wouldn't matter for
   + and * which are commutative, but is incorrect for - and /.
2023-11-23 19:54:39 -05:00
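To make the second issue in 58d0577a02 concrete, here is a minimal sketch of order-sensitive real/complex operators. `SketchComplex` is a hypothetical type written for this example, not AK's Complex, and only subtraction and division are shown since those are the non-commutative cases.

```cpp
#include <cassert>

// Hypothetical minimal complex number; this is not AK::Complex.
struct SketchComplex {
    double re;
    double im;
};

// C - R: subtract the real number from the real part only.
inline SketchComplex operator-(SketchComplex c, double r) { return { c.re - r, c.im }; }

// R - C: r - (a + bi) = (r - a) - bi. This is not the same as C - R.
inline SketchComplex operator-(double r, SketchComplex c) { return { r - c.re, -c.im }; }

// C / R: divide both parts by the real number.
inline SketchComplex operator/(SketchComplex c, double r) { return { c.re / r, c.im / r }; }

// R / C: r / (a + bi) = r(a - bi) / (a^2 + b^2). This is not the same as C / R.
inline SketchComplex operator/(double r, SketchComplex c)
{
    double const denominator = c.re * c.re + c.im * c.im;
    return { r * c.re / denominator, -r * c.im / denominator };
}
```

For example, `1.0 - SketchComplex { 3.0, 4.0 }` gives -2 - 4i, whereas `SketchComplex { 3.0, 4.0 } - 1.0` gives 2 + 4i, so implementing R - C by delegating to C - R yields the wrong result.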
MacDue
da00a5cdb5 AK: Add is_owned() method to MaybeOwned 2023-11-18 19:32:31 +01:00
Timothy Flynn
2c1bbf5a99 AK+LibIDL: Put IDL dbgln statement behind a debug flag
This is a bit spammy now that we are performing some overload resolution
at build time. The fallback to an interface has generally worked fine on
the types it warns about (BufferSource, Module, etc.) so let's not warn
about it for every build.
2023-11-15 23:42:53 +01:00
Dan Klishch
c0ffff7e88 AK: Ban JsonValue from the kernel and remove ifdef guards
JsonValue can store a JsonObject, which uses DeprecatedString (DS) for keys,
so it is not safe to use it in the kernel even with the double/String guards.
2023-11-14 10:06:54 +01:00
Lucas CHOLLET
86ee7d219e LibCompress/LZW: Use its own debug flag
The file still used the `GIF_DEBUG` flag from when it was a part of the
GIF decoder. Let's give `LZWDecoder` its own flag.
2023-11-12 13:56:27 +01:00
Lucas CHOLLET
6f059c9d60 AK: Add the InputBitStream concept
This will allow users to abstract away the endianness of the stream they
are using.
2023-11-12 13:56:27 +01:00
Michiel Visser
be68f747b6 AK: Add shorthands for u384, u768, and u1536 2023-11-11 14:40:10 +03:30
Nico Weber
bda162fc0d AK: Add Span::reverse()
It works like Vector::reverse().
2023-11-09 16:06:25 +01:00
Tim Schumacher
e9dda2a5f8 AK: Provide a default set of Traits for const types 2023-11-09 10:05:51 -05:00
Tim Schumacher
a2f60911fe AK: Rename GenericTraits to DefaultTraits
This feels like a more fitting name for something that provides the
default values for Traits.
2023-11-09 10:05:51 -05:00
Andreas Kling
55e467c359 LibJS/JIT: Add fast path for cached PutById 2023-11-09 16:02:14 +01:00
Timothy Flynn
e576bf975c AK: Define traits for the const-variant of BigEndian and LittleEndian 2023-11-08 22:26:36 +00:00
Timothy Flynn
370ea9441c AK: Define an alias for Utf16View's iterator type
Utf8View and Utf32View do so already. This allows using these views more
readily in generic code.
2023-11-08 12:54:26 -05:00
Lucas CHOLLET
b00476abac AK: Use an enum to specify the open mode instead of a bool
Let's replace this bool with an `enum class` in order to enhance
readability. This is done by repurposing `MappedFile`'s `OpenMode` into
a shared `enum` simply called `Mode`.
2023-11-08 18:19:34 +01:00
Sam Atkins
1519290989 AK: Cast pointer in FixedMemoryStream::read_in_place(count)
I didn't notice this before because I only ever called it with u8. Oops!
2023-11-08 09:34:09 +01:00
Timothy Flynn
2437064820 AK: Define compound subtraction operator for UnixDateTime 2023-11-08 09:28:17 +01:00
Shannon Booth
8c8ea86729 AK: Add FlyString::starts_with_bytes and FlyString::ends_with_bytes
Mirroring the API for String
2023-11-07 11:33:41 +01:00
Andreas Kling
0bbf230e4f AK: Expose the memory offset of Vector's outline buffer pointer 2023-11-07 11:33:04 +01:00
Andreas Kling
bdce36dddb AK: Expose memory offset of Optional's internal fields 2023-11-07 11:33:04 +01:00
Andreas Kling
af5fd99ff4 AK: Add OFFSET_OF macro that works on class member fields 2023-11-07 11:33:04 +01:00
Lucas CHOLLET
75caccafa4 LibGfx: Add a TIFF loader 2023-11-06 12:29:30 -07:00
Tim Ledbetter
2a1fc96650 AK: Avoid unnecessary String allocations for URL username and password
Previously, `URLParser` was constructing a new String for every
character of the URL's username and password. This change improves
performance by eliminating those unnecessary String allocations.

A URL with a 100,000 character password can now be parsed in ~30ms vs
~8 seconds previously on my machine.
2023-11-06 09:19:12 +01:00
Andreas Kling
0902f552a3 AK: Bring some missing DeprecatedString API over to String
Specifically, case sensitivity parameters for starts/ends with,
and the equals_ignoring_ascii_case() helper.
2023-11-04 21:28:30 +01:00