Commit graph

140 commits

Author SHA1 Message Date
Timothy Flynn
3dccaa39d8 AK: Define a traits helper for case-insensitive StringView hashing
Currently, we define a CaseInsensitiveStringTraits structure for String.
Using this structure for StringView involves allocating a String from
that view, and a second string to convert that intermediate string to
lowercase.

This defines CaseInsensitiveStringViewTraits (and the underlying helper
case_insensitive_string_hash) to avoid allocations.
2022-01-11 00:36:45 +01:00
Hendiadyoin1
6cb42d8a40 AK: Verify that we are not overreaching in StringView's substring_view() 2021-11-16 00:49:48 +00:00
Andrew Kaster
64edf17eb2 AK: Mark StringView::find_any_of() as const 2021-11-14 22:52:35 +01:00
Andrew Kaster
22feb9d47b AK: Resolve clang-tidy readability-bool-conversion warnings
... In files included by Kernel/Process.cpp and Kernel/Thread.cpp
2021-11-14 22:52:35 +01:00
Andreas Kling
8b1108e485 Everywhere: Pass AK::StringView by value 2021-11-11 01:27:46 +01:00
Andreas Kling
5f7d008791 AK+Everywhere: Stop including Vector.h from StringView.h
Preparation for using Error.h from Vector.h. This required moving some
things out of line.
2021-11-10 21:58:58 +01:00
Idan Horowitz
6704961c82 AK: Replace the mutable String::replace API with an immutable version
This removes the awkward String::replace API which was the only String
API which mutated the String and replaces it with a new immutable
version that returns a new String with the replacements applied. This
also fixes a couple of UAFs that were caused by the use of this API.

As an optimization an equivalent StringView::replace API was also added
to remove an unnecessary String allocations in the format of:
`String { view }.replace(...);`
2021-09-11 20:36:43 +03:00
Idan Horowitz
6d2b003b6e AK: Make String::count not use strstr and take a StringView
This was needlessly copying StringView arguments, and was also using
strstr internally, which meant it was doing a bunch of unnecessary
strlen calls on it. This also moves the implementation to StringUtils
to allow API consistency between String and StringView.
2021-09-11 20:36:43 +03:00
Ben Wiederhake
9413dddb8b AK: Forbid creating StringView from temporary FlyString 2021-09-11 13:22:51 +03:00
Ben Wiederhake
f6d0955a46 AK: Forbid creating StringView from temporary ByteBuffer 2021-09-11 13:22:51 +03:00
Idan Horowitz
e8f6840471 AK+LibRegex: Disable construction of views from temporary Strings 2021-09-04 21:01:15 +02:00
Timothy Flynn
262e412634 AK: Implement method to convert a String/StringView to title case
This implementation preserves consecutive spaces in the orginal string.
2021-08-26 22:04:09 +01:00
Brian Gianforcaro
e9d8f158a1 AK+Kernel: StringView hash map Traits should not set peek type to String
This typo / bug in the Traits<T> implementation for StringView caused
AK::HashMap methods to return a `String` when looking up values out of
a hash map of type HashTable<StringView,StringView>.

This change fixes the typo, and fixes the only consumer, the kernel
Commandline class.
2021-08-18 10:21:19 +02:00
Timothy Flynn
011514a384 AK: Fix declaration of {String,StringView}::is_one_of
The declarations need to consume the variadic parameters as "Ts&&..."
for the parameters to be forwarding references.
2021-08-02 21:02:09 +04:30
Max Wipfli
9cc35d1ba3 AK: Implement String::find_any_of() and StringView::find_any_of()
This implements StringUtils::find_any_of() and uses it in
String::find_any_of() and StringView::find_any_of(). All uses of
find_{first,last}_of have been replaced with find_any_of(), find() or
find_last(). find_{first,last}_of have subsequently been removed.
2021-07-02 21:54:21 +02:00
Max Wipfli
d7a104c27c AK: Implement StringView::find_all()
This implements the StringView::find_all() method by re-implemeting the
current method existing for String in StringUtils, and using that
implementation for both String and StringView.

The rewrite uses memmem() instead of strstr(), so the String::find_all()
argument type has been changed from String to StringView, as the null
byte is no longer required.
2021-07-02 21:54:21 +02:00
Max Wipfli
3bdaed501e AK+Everywhere: Remove StringView::find_{first,last}_of(char) methods
This removes StringView::find_first_of(char) and find_last_of(char) and
replaces all its usages with find and find_last respectively. This is
because those two methods are functionally equivalent.
find_{first,last}_of should only be used if searching for multiple
different characters, which is never the case with the char argument.

This also adds the [[nodiscard]] to the remaining find_{first,last}_of
methods.
2021-07-02 21:54:21 +02:00
Max Wipfli
56253bf389 AK: Reimplement StringView::find methods in StringUtils
This patch reimplements the StringView::find methods in StringUtils, so
they can also be used by String. The methods now also take an optional
start parameter, which moves their API in line with String's respective
methods.

This also implements a StringView::find_ast(char) method, which is
currently functionally equivalent to find_last_of(char). This is because
find_last_of(char) will be removed in a further commit.
2021-07-02 21:54:21 +02:00
Max Wipfli
3ea65200d8 AK: Implement StringView::to_{lower,upper}case_string
This patch refactors StringImpl::to_{lower,upper}case to use the new
static methods StringImpl::create_{lower,upper}cased if they have to use
to create a new StringImpl. This allows implementing StringView's
to_{lower,upper}case_string using the same methods.

It also replaces the usage of hand-written to_ascii_lowercase() and
similar methods with those from CharacterTypes.h.
2021-07-02 21:54:21 +02:00
Ali Mohammad Pur
37b0f55104 AK: Make the constexpr StringView methods actually constexpr
Also add some tests to ensure that they _remain_ constexpr.
In general, any runtime assertions, weirdo C casts, pointer aliasing,
and such shenanigans should be gated behind the (helpfully newly added)
AK::is_constant_evaluated() function when the intention is to write
constexpr-capable code.
a.k.a. deliver promises of constexpr-ness :P
2021-06-27 20:54:59 +01:00
Ali Mohammad Pur
824a40e95b AK: Inline *String::is_one_of<Ts...>()
Previously this was generating a crazy number of symbols, and it was
also pretty-damn-slow as it was defined recursively, which made the
compiler incapable of inlining it (due to the many many layers of
recursion before it terminated).
This commit replaces the recursion with a pack expansion and marks it
always-inline.
2021-06-04 12:57:14 +02:00
Max Wipfli
0e4f7aa8e8 AK: Add trim() method to String, StringView and StringUtils
The methods added make it possible to use the trim mechanism with
specified characters, unlike trim_whitespace(), which uses predefined
characters.
2021-06-01 09:28:05 +02:00
Andreas Kling
3e603b2f32 AK: Make StringView::hash() constexpr
This required moving string_hash() to its own header so that everyone
can see it.
2021-05-14 15:24:32 +02:00
Lenny Maiorani
254e010c75 AK/GenericLexer: constexpr where possible
Problem:
- Much of the `GenericLexer` can be `constexpr`, but is not.

Solution:
- Make it `constexpr` and de-duplicate code.
- Extend some of `StringView` with `constexpr` to support.
- Add tests to ensure `constexpr` behavior.

Note:
- Construction of `StringView` from pointer and length is not
  `constexpr`-compatible at the moment because the VERIFY cannot be,
  yet.
2021-04-22 20:27:21 +02:00
Brian Gianforcaro
1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

 ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00
Andreas Kling
873da38d0e AK: Remove String-from-StringView optimization
We had an unusual optimization in AK::StringView where constructing
a StringView from a String would cause it to remember the internal
StringImpl pointer of the String.

This was used to make constructing a String from a StringView fast
and copy-free.

I tried removing this optimization and indeed we started seeing a
ton of allocation traffic. However, all of it was due to a silly
pattern where functions would take a StringView and then go on
to create a String from it.

I've gone through most of the code and updated those functions to
simply take a String directly instead, which now makes this
optimization unnecessary, and indeed a source of bloat instead.

So, let's get rid of it and make StringView a little smaller. :^)
2021-04-17 01:27:31 +02:00
Timothy Flynn
2370efbea6 AK: Add a predicate variant of StringView::split_view 2021-04-12 22:37:00 +02:00
Brian Gianforcaro
75e7c780e7 AK: Annotate StringView functions as [[nodiscard]] 2021-04-11 12:50:33 +02:00
Andreas Kling
42133a196a AK: Don't compare past '\0' in StringView::operator==(const char*)
We kept scanning the needle string even after hitting a null terminator
and that's clearly not right.

Found by oss-fuzz:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=31338
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=31351
2021-02-24 22:13:04 +01:00
Brian Gianforcaro
31e1b08e15 AK: Add support for AK::StringView literals with operator""sv
A new operator, operator""sv was added as of C++17 to support
string_view literals. This allows string_views to be constructed
from string literals and with no runtime cost to find the string
length.

See: https://en.cppreference.com/w/cpp/string/basic_string_view/operator%22%22sv

This change implements that functionality in AK::StringView.
We do have to suppress some warnings about implementing reserved
operators as we are essentially implementing STL functions in AK
as we have no STL :).
2021-02-24 14:38:31 +01:00
Andreas Kling
5d180d1f99 Everywhere: Rename ASSERT => VERIFY
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)

Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.

We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
2021-02-23 20:56:54 +01:00
Andreas Kling
31ac93d051 AK: Optimize StringView::operator==(const char*) a little bit
Don't compute the strlen() of the string we're comparing against first.
This can save a lot of time if we're comparing against something that
already fails to match in the first few characters.
2021-02-23 17:41:18 +01:00
AnotherTest
39442e6d4f AK: Add String{View,}::find(StringView)
I personally mistook `find_first_of(StringView)` to be analogous to this
so let's add a `find()` method that actually searches the string.
2021-01-12 23:36:20 +01:00
Lenny Maiorani
e6f907a155 AK: Simplify constructors and conversions from nullptr_t
Problem:
- Many constructors are defined as `{}` rather than using the ` =
  default` compiler-provided constructor.
- Some types provide an implicit conversion operator from `nullptr_t`
  instead of requiring the caller to default construct. This violates
  the C++ Core Guidelines suggestion to declare single-argument
  constructors explicit
  (https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c46-by-default-declare-single-argument-constructors-explicit).

Solution:
- Change default constructors to use the compiler-provided default
  constructor.
- Remove implicit conversion operators from `nullptr_t` and change
  usage to enforce type consistency without conversion.
2021-01-12 09:11:45 +01:00
AnotherTest
f3ecea1fb3 AK: Add String{,View}::is_whitespace()
+Tests!
2021-01-03 10:47:29 +01:00
Sahan Fernando
37df4bbd90 AK: Generalize AK::String::to_int() for more types 2020-12-21 00:15:44 +01:00
Nico Weber
e673abb93f AK: Remove duplicate begin()/end() methods
begin()/end() returning a ConstItertor already exist further up
in this file. Nothing uses these redundant versions, and they are not
callable.
2020-11-07 18:28:35 +01:00
AnotherTest
0801b1fada AK: Make String::matches() capable of reporting match positions too
Also, rewrite StringUtils::match(), because the old implementation was
fairly broken, e.g. "acdcxb" would *not* match "a*?b".
2020-10-29 11:53:01 +01:00
AnotherTest
f0e59f2dac AK: Add a `is_one_of()' to StringView
This copies the similar API from String.
2020-10-29 11:53:01 +01:00
Tom
25e7225782 AK: Enhance String::contains to allow case-insensitive searches 2020-10-22 15:23:45 +02:00
Matthew Olsson
67f2301150 AK: Make StringView hashable 2020-10-08 23:27:16 +02:00
AnotherTest
5fbec2b003 AK: Move trim_whitespace() into StringUtils and add it to StringView
No behaviour change; also patches use of `String::TrimMode` in LibJS.
2020-09-27 21:14:18 +02:00
asynts
d831b5738d AK: Add StringView::substring_view(size_t) overload. 2020-09-21 20:17:36 +02:00
asynts
1b3ecb01a5 AK: Add generic SimpleIterator class to replace VectorIterator. 2020-09-08 14:01:21 +02:00
asynts
ebce4ead40 AK: Add StringView(ReadonlyBytes) constructor. 2020-08-20 16:28:31 +02:00
Nico Weber
6613a4cb8c disasm: Insert symbol names in disassembly stream
The symbol name insertion scheme is different from objdump -d's.
Compare the output on Build/Userland/id:

* disasm:

        ...
        _start (08048305-0804836b):
        08048305  push ebp
        ...
        08048366  call 0x0000df56

        0804836b  o16 nop
        0804836d  o16 nop
        0804836f  nop

        (deregister_tm_clones (08048370-08048370))

        08048370  mov eax, 0x080643e0
        ...
        _ZN2AK8Utf8ViewC1ERKNS_6StringE (0805d9b2-0805d9b7):
        _ZN2AK8Utf8ViewC2ERKNS_6StringE (0805d9b2-0805d9b7):
        0805d9b2  jmp 0x00014ff2

        0805d9b7  nop

* objdump -d:

        08048305 <_start>:
         8048305:	55                   	push   %ebp
        ...
         8048366:	e8 9b dc 00 00       	call   8056006 <exit>
         804836b:	66 90                	xchg   %ax,%ax
         804836d:	66 90                	xchg   %ax,%ax
         804836f:	90                   	nop

        08048370 <deregister_tm_clones>:
         8048370:	b8 e0 43 06 08       	mov    $0x80643e0,%eax
        ...
        0805d9b2 <_ZN2AK8Utf8ViewC1ERKNS_6StringE>:
         805d9b2:	e9 eb f6 ff ff       	jmp    805d0a2 <_ZN2AK10StringViewC1ERKNS_6StringE>
         805d9b7:	90                   	nop

Differences:

1. disasm can show multiple symbols that cover the same instructions.
   I've only seen this happen for C1/C2 (and D1/D2) ctor/dtor pairs,
   but it could conceivably happen with ICF as well.

2. disasm separates instructions that do not belong to a symbol with
   a newline, so that nop padding isn't shown as part of a function
   when it technically isn't.

3. disasm shows symbols that are skipped (due to having size 0)
   in parenthesis, separated from preceding and following instructions.
2020-08-10 11:48:10 +02:00
AnotherTest
fd64be02e0 AK: Add StringView::contains(StringView) 2020-08-01 08:39:26 +02:00
asynts
a922abd9d7 AK: Add span() / bytes() methods to container types. 2020-07-27 19:58:09 +02:00
Luke
a5ecb9bd6b AK: Add case insensitive version of starts_with 2020-07-21 01:08:32 +02:00
Andreas Kling
fdfda6dec2 AK: Make string-to-number conversion helpers return Optional
Get rid of the weird old signature:

- int StringType::to_int(bool& ok) const

And replace it with sensible new signature:

- Optional<int> StringType::to_int() const
2020-06-12 21:28:55 +02:00
Andreas Kling
83ee76a53e AK: Add StringView::{begin,end} so we can range-for over StringViews 2020-06-07 21:05:05 +02:00
Andreas Kling
4e80f22cc0 AK: Make some StringView constructors constexpr 2020-05-30 17:47:50 +02:00
AnotherTest
f9bf2c7e78 AK: Add StringView::split_view() taking a StringView
Since the task of splitting a string via another is pretty common, we
might as well have this overload of split_view() as well.
2020-05-28 11:01:08 +02:00
Brian Gianforcaro
129462cca7 AK: Unify FlyString/StringView::ends_with implementation on StringUtils::ends_with
This creates a unified implementation of ends_with with case sensitivity
across String/StringView/FlyString.
2020-05-26 13:17:19 +02:00
Linus Groh
1277290a4a AK: Add StringView::equals_ignoring_case()
StringUtils::equals_ignoring_case() already operates on a StringView&,
so StringView should have the method directly without having to go
through a temporary String (which also has the method).
2020-05-13 19:25:49 +02:00
Emanuel Sprung
8fe821fae2 AK: Add to_string() method to StringView
This allows easy creation of a new string from an existing StringView.
Can be used e.g. for output with printf(..., view.to_string().characters())
instead of writing printf(..., String{view}.characters()).
2020-05-06 19:07:22 +02:00
Andrew Kaster
3eb3c2477f AK: Add StringView::find_first/last_of
These methods search from the beginning or end of a string for the
first character in the input StringView and returns the position in
the string of the first match. Note that this is not a substring match.
Each comes with single char overloads for efficiency.
2020-05-04 09:39:05 +02:00
Andreas Kling
888e35f0fe AK: Add ALWAYS_INLINE, NEVER_INLINE and FLATTEN macros
It's tedious to write (and look at) [[gnu::always_inline]] etc. :^)
2020-04-30 11:43:25 +02:00
Sergey Bugaev
279cf9294a AK: Always inline trivial StringView constructors 2020-04-30 11:30:27 +02:00
Sergey Bugaev
135d29b498 AK: Assert that we don't create StringViews of negative length
Due to us using size_t for the length, the actual value will always be positive.
If, for example, we calculate the length as "0 - 1", we'll get SIZE_T_MAX. What
we can do is check that adding the characters pointer and the length together
doesn't overflow.
2020-04-30 11:30:27 +02:00
Stephan Unverwerth
b77ceecb02 AK: Add StringView::contains(char) 2020-04-17 15:22:31 +02:00
Andreas Kling
26bc3d4ea0 AK: Add FlyString::equals_ignoring_case(StringView)
And share the code with String by moving the logic to StringUtils. :^)
2020-03-22 13:07:45 +01:00
Andreas Kling
4f72f6b886 AK: Add FlyString, a simple flyweight string class
FlyString is a flyweight string class that wraps a RefPtr<StringImpl>
known to be unique among the set of FlyStrings. The class is very
unoptimized at the moment.

When to use FlyString:

- When you want O(1) string comparison
- When you want to deduplicate a lot of identical strings

When not to use FlyString:

- For strings that don't need either of the above features
- For strings that are likely to be unique
2020-03-22 13:03:43 +01:00
howar6hill
e07f50c398 AK: Add begin() and end() to String and StringView
Now it's possible to use range-based for loops with String and StringView.
2020-03-10 10:25:21 +01:00
Andreas Kling
900f51ccd0 AK: Move memory stuff (fast memcpy, etc) to a separate header
Move the "fast memcpy" stuff out of StdLibExtras.h and into Memory.h.
This will break a ton of things that were relying on StdLibExtras.h
to include a bunch of other headers. Fix will follow immediately after.

This makes it possible to include StdLibExtras.h from Types.h, which is
the main point of this exercise.
2020-03-08 13:06:51 +01:00
howar6hill
d75fa80a7b
AK: Move to_int(), to_uint() implementations to StringUtils (#1338)
Provide wrappers in String and StringView. Add some tests for the
implementations.
2020-03-02 14:19:33 +01:00
howar6hill
055344f346 AK: Move the wildcard-matching implementation to StringUtils
Provide wrappers in the String and StringView classes, and add some tests.
2020-03-02 10:38:08 +01:00
Shannon Booth
854f0b9e1a AK: Add StringView::starts_with(char) & StringView::ends_with(char)
This is simply meant to be a more efficient implementation in the
case that we only need to check a single character.
2020-02-22 21:36:54 +01:00
Andreas Kling
3bbf4610d2 AK: Add a forward declaration header
You can now #include <AK/Forward.h> to get most of the AK types as
forward declarations.

Header dependency explosion is one of the main contributors to compile
times at the moment, so this is a step towards smaller include graphs.
2020-02-14 23:31:18 +01:00
Andreas Kling
268000e166 AK: Always inline StringView(const char*)
Also use strlen() instead of manually walking the string. This allows
GCC to optimize away the strlen() entirely for string literals. :^)
2020-02-01 13:54:13 +01:00
Andreas Kling
94ca55cefd Meta: Add license header to source files
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.

For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.

Going forward, all new source files should include a license header.
2020-01-18 09:45:54 +01:00
Shannon Booth
24bc674d94 AK: Add StringView::ends_with function 2019-12-29 22:04:22 +01:00
Andreas Kling
6f4c380d95 AK: Use size_t for the length of strings
Using int was a mistake. This patch changes String, StringImpl,
StringView and StringBuilder to use size_t instead of int for lengths.
Obviously a lot of code needs to change as a result of this.
2019-12-09 17:51:21 +01:00
Tommy Nguyen
2eb5793d55 LibMarkdown: Handle CRLF line endings
Previously, MDDocument only split on Unix-style line endings. This adds
a new function to StringView which handles LF, CR and CRLF.
2019-12-02 13:52:42 +01:00
Sergey Bugaev
127d168def AK: Add a keep_empty argument to String[View]::substring{_view} 2019-09-28 18:29:42 +02:00
MinusGix
05f641a5a9 StringView: Add starts_with method 2019-09-13 09:22:30 +02:00
Andreas Kling
d38bd3935b AK: Add StringView::hash()
This grabs the hash from the underlying StringImpl if there is one,
otherwise it's computed on the fly.
2019-08-25 06:45:31 +02:00
Andreas Kling
2349dc1a21 StringView: Add StringView::operator==(StringView)
Previously we'd implicitly convert the second StringView to a String
when comparing two StringViews, which is obviously not what we wanted.
2019-08-15 14:09:27 +02:00
Andreas Kling
cce2ea9bb0 AK: Add StringView::to_int()
This is a shameless copy-paste of String::to_int(). We should find some
way to share this code between String and StringView instead of having
two duplicate copies like this.
2019-08-04 11:44:20 +02:00
Andreas Kling
0e75aba7c3 StringView: Rename characters() to characters_without_null_termination().
This should make you think twice before trying to use the const char* from
a StringView as if it's a null-terminated string.
2019-07-08 15:38:44 +02:00
Andreas Kling
50677a58d4 StringView: Make it easy to construct from a ByteBuffer. 2019-06-29 12:07:46 +02:00
Sergey Bugaev
1a697f70db AK: Add more StringView utilities for making substrings.
These two allow making a new substring view starting from,
or starting after, an existing substring view.

Also make use of one of them in the kernel.
2019-06-14 06:24:02 +02:00
Andreas Kling
cdb44be703 StringView: Store a StringImpl* rather than a String*. 2019-06-08 23:55:13 +02:00
Andreas Kling
6a51093ab1 AK: Add String::operator==(const char*).
Without this function, comparing a String to a const char* will instantiate
a temporary String which is obviously not great.

Also add some missing null checks to StringView::operator==(const char*).
2019-06-08 18:32:09 +02:00
Andreas Kling
de9edb0169 StringView: operator==(const char*) needs to stop when the view ends.
We were comparing past the end of the view, which was clearly not correct.
2019-06-07 19:22:58 +02:00
Robin Burchell
f9ba7adae2 StringView: Make construction of String from a StringView containing a String cheaper
... at the cost of an additional pointer per view.
2019-06-03 20:27:05 +02:00
Robin Burchell
b55b6cd7fc AK: Add implicit String -> StringView conversion
And tidy up existing view() users.
2019-06-02 12:55:51 +02:00
Robin Burchell
0dc9af5f7e Add clang-format file
Also run it across the whole tree to get everything using the One True Style.
We don't yet run this in an automated fashion as it's a little slow, but
there is a snippet to do so in makeall.sh.
2019-05-28 17:31:20 +02:00
Andreas Kling
33920df299 AK: Try to use StringViews more for substrings and splitting. 2019-04-16 02:39:16 +02:00
Andreas Kling
461aa550eb AK: Add a StringView class. 2019-04-15 14:56:37 +02:00