Commit graph

176 commits

Author SHA1 Message Date
Ali Mohammad Pur
5e1499d104 Everywhere: Rename {Deprecated => Byte}String
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).

This commit is auto-generated:
  $ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
    Meta Ports Ladybird Tests Kernel)
  $ perl -pie 's/\bDeprecatedString\b/ByteString/g;
    s/deprecated_string/byte_string/g' $xs
  $ clang-format --style=file -i \
    $(git diff --name-only | grep \.cpp\|\.h)
  $ gn format $(git ls-files '*.gn' '*.gni')
2023-12-17 18:25:10 +03:30
Shannon Booth
5f2f26451d AK: Disallow String::from_utf8 on FlyString and String 2023-12-10 09:45:03 +01:00
Bastiaan van der Plaat
4a7d3115c9 AK: Add String to number floating point support 2023-12-04 19:54:43 +00:00
Shannon Booth
6b32a1f18f AK+LibUnicode: Expose TrailingCodePointTransformation in to_titlecase
Relocating the definition of this enum from LibUnicode to AK.
2023-11-28 17:15:27 -05:00
Tim Schumacher
a2f60911fe AK: Rename GenericTraits to DefaultTraits
This feels like a more fitting name for something that provides the
default values for Traits.
2023-11-09 10:05:51 -05:00
Andreas Kling
0902f552a3 AK: Bring some missing DeprecatedString API over to String
Specifically, case sensitivity parameters for starts/ends with,
and the equals_ignoring_ascii_case() helper.
2023-11-04 21:28:30 +01:00
Andreas Kling
1e820385d9 AK: Add case-insensitive hashing for the new String classes
Bringing over this functionality from DeprecatedString.
2023-09-06 11:29:03 -04:00
Lucas CHOLLET
fde26c53f0 AK: Remove the API to explicitly construct short strings
Now that ""_string is infallible, the only benefit of explicitly
constructing a short string is the ability to do it at compile-time. But
we never do that, so let's simplify the API and remove this
implementation detail from it.
2023-08-08 07:37:21 +02:00
Andreas Kling
34344120f2 AK: Make "foo"_string infallible
Stop worrying about tiny OOMs.

Work towards #20405.
2023-08-07 16:03:27 +02:00
aryanbaburajan
a94c0eea94 AK: Add trim_ascii_whitespace method to String 2023-08-06 22:21:10 +02:00
Andrew Kaster
3533d3e452 AK: Enable consteval workaround for Android NDK
Android isn't shipping clang-15 yet in any NDK, so use the existing
workaround on that platform.
2023-07-19 04:22:28 -06:00
Andrew Kaster
bfd6deed1e AK+Meta: Disable consteval completely when building for oss-fuzz
This was missed in 02b74e5a70

We need to disable consteval in AK::String as well as AK::StringView,
and we need to disable it when building both the tools build and the
fuzzer build.
2023-06-29 15:55:54 -06:00
Hendiadyoin1
ca0106ba1d AK: Forbid from_utf8 and from_deprecated_{...} with unintended types
Calling `from_utf8` with a DeprecatedString will hide the fact that we
have a DeprecatedString, while using `from_deprecated_string` with a
StringView will silently and needlessly allocate a DeprecatedString,
so let's forbid that.
2023-06-13 01:49:02 +02:00
Timothy Flynn
d6b786b3fe AK: Use consteval String factories on macOS
Xcode 14.3 ships with clang 15, which supports our usage of consteval to
validate short strings at compile time.
2023-05-08 20:54:31 -06:00
thankyouverycool
9a03e4dd73 AK: Add count() helper to String 2023-04-30 05:48:14 +02:00
Andreas Kling
d517e7fb3a AK: Make FlyString::hash() use the cached hash in StringData if possible
This avoids rehashing the string every time.
2023-03-09 21:54:59 +01:00
Timothy Flynn
1393ed2000 AK+LibUnicode: Implement String::equals_ignoring_case without allocating
We currently fully casefold the left- and right-hand sides to compare
two strings with case-insensitivity. Now, we casefold one code point at
a time, storing the result in a view for comparison, until we exhaust
both strings.
2023-03-08 18:57:53 +00:00
Timothy Flynn
515fca4f7a AK: Make String::contains(code_point) handle non-ASCII
We currently only accept a char, instead of a full code point.
2023-03-08 14:16:47 +00:00
Timothy Flynn
f882581e91 AK: Make String::{starts,ends}_with(code_point) handle non-ASCII
We currently pass the code point to StringView::{starts,ends}_with,
which actually accepts a single char, thus cannot handle non-ASCII
code points.
2023-03-08 14:16:47 +00:00
Timothy Flynn
da0d000909 AK: Ensure short String instances are valid UTF-8
We are currently only validating long strings.
2023-03-03 11:46:42 -05:00
Linus Groh
45dc3d8a3e AK: Add String::ends_with{,_bytes}() 2023-03-03 11:02:21 +00:00
Ali Mohammad Pur
79e4027480 AK: Add two starts_with{bytes,}() APIs to String 2023-02-28 15:52:24 +03:30
Timothy Flynn
5eec76b441 AK: Use the same consteval condition on _short_string as its factory
This fixes the build with Apple Clang.
2023-02-25 22:25:05 +01:00
Linus Groh
85414d9338 AK: Add operator""_{short_,}string to create a String from a literal
We briefly discussed this when adding the new String type but couldn't
settle on a name. However, having to use String::from_utf8() on every
literal string is a bit unwieldy, so let's have these options available!

Naming-wise '_string' is not as short as 'sv' but should be relatively
clear; it also matches '_bigint' and '_ubigint' in length.
'_short_string' may be longer than the actual string itself, but it's
still an improvement over the static function :^)

Since our C++ source files are UTF-8 encoded anyway, it should be
impossible to create a string literal with invalid UTF-8, so including
that in the name is not as important as in the function that can receive
arbitrary data.
2023-02-25 20:51:49 +01:00
Andrew Kaster
0ea697ace5 AK: Add String::from_stream method
The caller is responsible for determining how long the string is that
they want to read.
2023-02-21 10:57:44 +01:00
Andreas Kling
e08c55dd8d AK: Make String const-correct internally 2023-02-21 00:54:04 +01:00
nipos
c31b547fae AK: Use constexpr instead of consteval on OpenBSD 2023-02-04 16:11:54 -07:00
Timothy Flynn
c59268d15b AK: Add String::trim 2023-01-28 00:13:46 +00:00
Timothy Flynn
cccaa94767 AK: Add String::join 2023-01-28 00:13:46 +00:00
Timothy Flynn
c35b1371a3 AK: Add an overload of String::find_byte_offset for StringView 2023-01-27 18:00:17 +00:00
Timothy Flynn
76fd5f2756 AK: Add convenience substring wrappers to String to exclude a length
These overloads exist on other string classes and are used throughout
the code base.
2023-01-24 16:23:50 -05:00
Timothy Flynn
427b82065c AK: Add a method to create a String with a repeated code point 2023-01-24 16:23:50 -05:00
Timothy Flynn
d50724956e AK: Add a method to find the byte offset of a code point 2023-01-24 16:23:50 -05:00
Timothy Flynn
5e44b93af2 AK: Remove [[nodiscard]] attribute from String methods returning ErrorOr 2023-01-24 16:23:50 -05:00
Timothy Flynn
12c8bc3e85 AK: Add a String factory to create a string from a single code point 2023-01-22 01:03:13 +00:00
Timothy Flynn
8aca8e82cb AK: Change String's default constructor to be constant
This allows creating expressions such as:

    constexpr Array<String, 10> {};
2023-01-22 01:03:13 +00:00
martinfalisse
aec2dadfdd AK: Add split() for String 2023-01-21 14:35:00 +01:00
Timothy Flynn
c8e25a71e0 AK: Disable use of consteval in String::from_utf8_short_string for Apple
This causes an ICE on older versions of clang, and Apple's clang is
currently based on such a version.
2023-01-20 20:33:04 +00:00
Timothy Flynn
d48266a420 AK: Support creating known short string literals at compile time
In cases where we know a string literal will fit in the short string
storage, we can do so at compile time without needing to handle error
propagation. If the provided string literal is too long, a compilation
error will be emitted due to the failed VERIFY statement being a non-
constant expression.
2023-01-20 14:24:12 -05:00
Timothy Flynn
537fcaf59e AK+LibUnicode: Provide Unicode-aware caseless String matching
The Unicode spec defines much more complicated caseless matching
algorithms in its Collation spec. This implements the "basic" case
folding comparison.
2023-01-18 14:43:40 +00:00
Timothy Flynn
d6ddca0c0f AK+LibUnicode: Provide Unicode-aware String titlecase transformation 2023-01-16 18:33:44 -05:00
Timothy Flynn
63c814fa2f AK: Add String::to_number 2023-01-15 01:00:20 +00:00
Timothy Flynn
cf0899f440 AK: Add String::contains 2023-01-15 01:00:20 +00:00
Timothy Flynn
bd9b65e82f AK: Add String::is_one_of for variadic string comparison 2023-01-15 01:00:20 +00:00
Timothy Flynn
9db9b2f9be AK: Add a somewhat naive implementation of String::reverse
This will reverse the String's code points (i.e. not just its bytes),
but is not aware of grapheme clusters.
2023-01-15 01:00:20 +00:00
MacDue
9a120d7243 AK: Add support for "debug only" formatters
These are formatters that can only be used with debug print
functions, such as dbgln(). Currently this is limited to
Formatter<ErrorOr<T>>. With this you can still debug log ErrorOr
values (good for debugging), but trying to use them in any
String::formatted() call will fail (which prevents .to_string()
errors with the new failable strings being ignored).

You make a formatter debug only by adding a constexpr method like:
static constexpr bool is_debug_only() { return true; }
2023-01-13 21:09:26 +00:00
Timothy Flynn
1d4f287582 AK: Implement FlyString for the new String class
This implements a FlyString that will de-duplicate String instances. The
FlyString will store the raw encoded data of the String instance: If the
String is a short string, FlyString holds the String::ShortString bytes;
otherwise FlyString holds a pointer to the Detail::StringData.

FlyString itself does not know about String's storage or how to refcount
its Detail::StringData. It defers to String to implement these details.
2023-01-12 11:23:58 +01:00
Timothy Flynn
6fcc1c7426 AK+LibUnicode: Provide Unicode-aware String case transformations
Since AK can't refer to LibUnicode directly, the strategy here is that
if you need case transformations, you can link LibUnicode and receive
them. If you try to use either of these methods without linking it, then
you'll of course get a linker error (note we don't do any fallbacks to
e.g. ASCII case transformations). If you don't need these methods, you
don't have to link LibUnicode.
2023-01-09 19:23:46 -07:00
kleines Filmröllchen
ca80353efe AK: Add comparison operator
s p a c e s h i p  o p e r a t o r

Comparing UTF-8 can be done by simple byte lexicographic comparison per
definition, so we just piggy-back on StringView's high-performance
comparator.
2022-12-11 16:05:23 +00:00
Moustafa Raafat
ae2abcebbb Everywhere: Use C++ concepts instead of requires clauses 2022-12-09 11:25:30 +00:00