beenull/ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2024-11-22 07:30:19 +00:00

Author	SHA1	Message	Date
Andrew Kaster	45301e8169	Everywhere: Remove AK_DONT_REPLACE_STD macro Let's just always include `<utility>`. Placing our own incompatible with the STL declaration of these functions in AK was always fishy to begin with.	2024-07-30 18:38:02 -06:00
Timothy Flynn	74d644a216	AK: Explicitly check for null data in Utf16View The underlying CPU-specific instructions for operating on UTF-16 strings behave differently for null inputs. Add an explicit check for this state for consistency.	2024-07-21 19:57:07 +02:00
Timothy Flynn	29879a69a4	AK: Construct Strings from StringBuilder without re-allocating the data Currently, invoking StringBuilder::to_string will re-allocate the string data to construct the String. This is wasteful both in terms of memory and speed. The goal here is to simply hand the string buffer over to String, and let String take ownership of that buffer. To do this, StringBuilder must have the same memory layout as Detail::StringData. This layout is just the members of the StringData class followed by the string itself. So when a StringBuilder is created, we reserve sizeof(StringData) bytes at the front of the buffer. StringData can then construct itself into the buffer with placement new. Things to note: * StringData must now be aware of the actual capacity of its buffer, as that can be larger than the string size. * We must take care not to pass ownership of inlined string buffers, as these live on the stack.	2024-07-20 06:45:49 +02:00
Timothy Flynn	71c29504af	AK: Support non-native endianness in Utf16View Utf16View currently assumes host endianness. Add support for specifying either big or little endianness (which we mostly just pipe through to simdutf). This will allow using simdutf facilities with LibTextCodec.	2024-07-18 19:43:57 +02:00
Timothy Flynn	0c14a9417a	AK: Replace converting to and from UTF-16 with simdutf The one behavior difference is that we will now actually fail on invalid code units with Utf16View::to_utf8(AllowInvalidCodeUnits::No). It was arguably a bug that this wasn't already the case.	2024-07-18 14:46:25 +02:00
Andreas Kling	ebe6ec6069	AK: Check for u32 overflow in String::repeated() I don't know why this was checking for size_t overflow, but it was tripping up ASAN malloc() checks by passing a way-too-large size.	2024-05-07 09:15:40 +02:00
Jess	ecb7d4b40f	LibJS: Throw RangeError in `StringPrototype::repeat` if OOM. Currently crashes with an assertion failure in `String::repeated` if malloc can't serve a `count * input_size` sized request, so add `String::repeated_with_error` to propagate the error.	2024-04-20 19:23:46 -04:00
Timothy Flynn	de80f544d8	AK: Disallow calling String methods that return a view on rvalues This prevents, for example: StringView view = "foo"_string.bytes_as_string_view(); This prevents a class of potential UAF.	2024-04-04 11:23:21 +02:00
Dan Klishch	870a947040	AK: Remove StringInternals.h Since we do not expose memory layout anymore in StringBase, there is no need to keep StringData public.	2024-01-21 16:16:15 -07:00
Dan Klishch	fa52f68142	AK: Store data in FlyString as StringBase Unfortunately, it is not clear to me how to split this commit into several atomic ones.	2024-01-21 16:16:15 -07:00
Dan Klishch	e7700e16ee	AK: Forward substring creation with shared superstring to StringBase	2024-01-21 16:16:15 -07:00
Dan Klishch	5d6cd65e29	AK: Simplify String::repeated by leveraging StringBase helpers	2024-01-21 16:16:15 -07:00
Dan Klishch	7dbe357e9f	AK: Simplify String::from_stream by leveraging StringBase helpers	2024-01-21 16:16:15 -07:00
Dan Klishch	dcd1fda9c8	AK: Introduce StringBase::replace_with_new_{short_,}string	2024-01-21 16:16:15 -07:00
Dan Klishch	d6290c4684	AK: Move String::hash() and String::String() to StringBase	2024-01-21 16:16:15 -07:00
Dan Klishch	1b09a1851e	AK: Move String::~String() and String::destroy_string() to StringBase	2024-01-21 16:16:15 -07:00
Dan Klishch	54d149bc25	AK: Move String::bytes() and String::operator==(String) to StringBase The idea is to eventually get rid of protected state in StringBase. To do this, we first need to remove all references to m_data and m_short_string from String.	2024-01-21 16:16:15 -07:00
Dan Klishch	4364a28d3d	AK: Move data fields from AK::String to a newly created AK::StringBase This starts separating memory management of string data and string utilities like `String::formatted`. This would also allow to reuse the same storage in `DeprecatedString` in the future.	2024-01-21 16:16:15 -07:00
Dan Klishch	6e2f627cb3	AK: Move StringData from String.cpp to a newly created StringInternals.h This is done to allow using it in files other than AK/String.cpp.	2024-01-21 16:16:15 -07:00
Andreas Kling	3c039903fb	LibTextCodec+AK: Don't validate UTF-8 strings twice UTF8Decoder was already converting invalid data into replacement characters while converting, so we know for sure we have valid UTF-8 by the time conversion is finished. This patch adds a new StringBuilder::to_string_without_validation() and uses it to make UTF8Decoder avoid half the work it was doing.	2023-12-30 13:49:50 +01:00
Andreas Kling	a285e36041	LibJS+AK: Make String.prototype.repeat() way faster Instead of using a StringBuilder, add a String::repeated(String, N) overload that takes advantage of knowing it's already all UTF-8. This makes the following microbenchmark go 4x faster: "foo".repeat(100_000_000) And for single character strings, we can even go 10x faster: "x".repeat(100_000_000)	2023-12-30 13:49:50 +01:00
Ali Mohammad Pur	5e1499d104	Everywhere: Rename {Deprecated => Byte}String This commit un-deprecates DeprecatedString, and repurposes it as a byte string. As the null state has already been removed, there are no other particularly hairy blockers in repurposing this type as a byte string (what it _really_ is). This commit is auto-generated: $ xs=$(ack -l \bDeprecatedString\b\\|deprecated_string AK Userland \ Meta Ports Ladybird Tests Kernel) $ perl -pie 's/\bDeprecatedString\b/ByteString/g; s/deprecated_string/byte_string/g' $xs $ clang-format --style=file -i \ $(git diff --name-only \| grep \.cpp\\|\.h) $ gn format $(git ls-files '.gn' '.gni')	2023-12-17 18:25:10 +03:30
Timothy Flynn	6aa334767f	AK: Ensure assigned-to Strings are dereferenced if needed If we assign to an existing non-short string, we must dereference its StringData object to prevent leaking that data.	2023-11-28 16:38:18 +01:00
Andreas Kling	0902f552a3	AK: Bring some missing DeprecatedString API over to String Specifically, case sensitivity parameters for starts/ends with, and the equals_ignoring_ascii_case() helper.	2023-11-04 21:28:30 +01:00
Andreas Kling	1e820385d9	AK: Add case-insensitive hashing for the new String classes Bringing over this functionality from DeprecatedString.	2023-09-06 11:29:03 -04:00
aryanbaburajan	a94c0eea94	AK: Add trim_ascii_whitespace method to String	2023-08-06 22:21:10 +02:00
Tim Schumacher	a3f73e7d85	AK: Rename Stream::read_entire_buffer to Stream::read_until_filled No functional changes.	2023-03-13 15:16:20 +00:00
Andreas Kling	d517e7fb3a	AK: Make FlyString::hash() use the cached hash in StringData if possible This avoids rehashing the string every time.	2023-03-09 21:54:59 +01:00
Timothy Flynn	515fca4f7a	AK: Make String::contains(code_point) handle non-ASCII We currently only accept a char, instead of a full code point.	2023-03-08 14:16:47 +00:00
Timothy Flynn	f882581e91	AK: Make String::{starts,ends}_with(code_point) handle non-ASCII We currently pass the code point to StringView::{starts,ends}_with, which actually accepts a single char, thus cannot handle non-ASCII code points.	2023-03-08 14:16:47 +00:00
Timothy Flynn	da0d000909	AK: Ensure short String instances are valid UTF-8 We are currently only validating long strings.	2023-03-03 11:46:42 -05:00
Linus Groh	45dc3d8a3e	AK: Add String::ends_with{,_bytes}()	2023-03-03 11:02:21 +00:00
Ali Mohammad Pur	79e4027480	AK: Add two starts_with{bytes,}() APIs to String	2023-02-28 15:52:24 +03:30
Tim Schumacher	9096b4d893	AK: Ensure that we fill the whole String when reading from a Stream	2023-02-21 22:28:15 -07:00
Andrew Kaster	0ea697ace5	AK: Add String::from_stream method The caller is responsible for determining how long the string is that they want to read.	2023-02-21 10:57:44 +01:00
Andreas Kling	e08c55dd8d	AK: Make String const-correct internally	2023-02-21 00:54:04 +01:00
Andreas Kling	d0697d350d	AK: Fix 64-bit alignment issue in shared-superstring substrings Thanks to Timothy Flynn for the test! Fixes #17141	2023-02-18 09:12:46 -05:00
Timothy Flynn	c59268d15b	AK: Add String::trim	2023-01-28 00:13:46 +00:00
Timothy Flynn	cccaa94767	AK: Add String::join	2023-01-28 00:13:46 +00:00
Timothy Flynn	c35b1371a3	AK: Add an overload of String::find_byte_offset for StringView	2023-01-27 18:00:17 +00:00
Timothy Flynn	76fd5f2756	AK: Add convenience substring wrappers to String to exclude a length These overloads exist on other string classes and are used throughout the code base.	2023-01-24 16:23:50 -05:00
Timothy Flynn	427b82065c	AK: Add a method to create a String with a repeated code point	2023-01-24 16:23:50 -05:00
Timothy Flynn	d50724956e	AK: Add a method to find the byte offset of a code point	2023-01-24 16:23:50 -05:00
Timothy Flynn	ef275e25b8	AK: Reduce String's allocated data by one byte This was copied from allocation_size_for_stringimpl, which had to ensure the string is null-terminated. String makes no such guarantee.	2023-01-22 20:27:52 +00:00
Timothy Flynn	8aca8e82cb	AK: Change String's default constructor to be constant This allows creating expressions such as: constexpr Array<String, 10> {};	2023-01-22 01:03:13 +00:00
martinfalisse	aec2dadfdd	AK: Add `split()` for `String`	2023-01-21 14:35:00 +01:00
Timothy Flynn	d48266a420	AK: Support creating known short string literals at compile time In cases where we know a string literal will fit in the short string storage, we can do so at compile time without needing to handle error propagation. If the provided string literal is too long, a compilation error will be emitted due to the failed VERIFY statement being a non-constant expression.	2023-01-20 14:24:12 -05:00
Timothy Flynn	cf0899f440	AK: Add String::contains	2023-01-15 01:00:20 +00:00
Timothy Flynn	9db9b2f9be	AK: Add a somewhat naive implementation of String::reverse This will reverse the String's code points (i.e. not just its bytes), but is not aware of grapheme clusters.	2023-01-15 01:00:20 +00:00
Timothy Flynn	1d4f287582	AK: Implement FlyString for the new String class This implements a FlyString that will de-duplicate String instances. The FlyString will store the raw encoded data of the String instance: If the String is a short string, FlyString holds the String::ShortString bytes; otherwise FlyString holds a pointer to the Detail::StringData. FlyString itself does not know about String's storage or how to refcount its Detail::StringData. It defers to String to implement these details.	2023-01-12 11:23:58 +01:00
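
Several of the commits above (ebe6ec6069, ecb7d4b40f) concern overflow and out-of-memory handling in `String::repeated`. The following is a minimal sketch of that kind of guard, not the AK implementation: `repeated_checked` is a hypothetical name, `std::string` stands in for `AK::String`, and returning `std::nullopt` is an assumed substitute for AK's error propagation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-in for String::repeated(); not the AK code.
// Returns std::nullopt when input.size() * count would not fit in a u32,
// instead of handing an oversized request to the allocator.
std::optional<std::string> repeated_checked(std::string_view input, std::size_t count)
{
    constexpr std::uint64_t max_u32 = std::numeric_limits<std::uint32_t>::max();

    if (input.empty() || count == 0)
        return std::string {};

    // Compare by division so the check itself cannot wrap around.
    if (input.size() > max_u32 || count > max_u32 / input.size())
        return std::nullopt;

    std::size_t const total = input.size() * count;
    std::string out;
    out.reserve(total);
    for (std::size_t i = 0; i < count; ++i)
        out.append(input);

    assert(out.size() == total);
    return out;
}
```

A caller would check the returned optional before using the value, much as the LibJS commit describes turning the failure into a RangeError rather than crashing.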

187 commits