Commit graph

26 commits

Author SHA1 Message Date
Ali Mohammad Pur
02b50d463b AK: Cache all the line positions in LineTrackingLexer
Also updates a LibWeb text test that used to report the wrong line
number.
2024-10-10 23:53:48 +01:00
Tim Ledbetter
5ca2f4dfd7 Everywhere: Remove all KERNEL #defines 2024-06-18 09:36:25 +02:00
Ali Mohammad Pur
bc301b6f40 AK+LibXML+JSSpecCompiler: Move LineTrackingLexer to AK
This is a simple extension of GenericLexer, and is used in more than
just LibXML, so let's move it into AK.
The move also resolves a FIXME, which is removed in this commit.
2024-02-16 15:26:43 +01:00
Ali Mohammad Pur
5e1499d104 Everywhere: Rename {Deprecated => Byte}String
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).

This commit is auto-generated:
  $ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
    Meta Ports Ladybird Tests Kernel)
  $ perl -pie 's/\bDeprecatedString\b/ByteString/g;
    s/deprecated_string/byte_string/g' $xs
  $ clang-format --style=file -i \
    $(git diff --name-only | grep \.cpp\|\.h)
  $ gn format $(git ls-files '*.gn' '*.gni')
2023-12-17 18:25:10 +03:30
Dan Klishch
b65d281bbb AK: Add GenericLexer::{consume_decimal_integer,peek_string} 2023-11-04 18:06:30 +01:00
Ali Mohammad Pur
aeee98b3a1 AK+Everywhere: Remove the null state of DeprecatedString
This commit removes DeprecatedString's "null" state, and replaces all
its users with one of the following:
- A normal, empty DeprecatedString
- Optional<DeprecatedString>

Note that null states of DeprecatedFlyString/StringView/etc are *not*
affected by this commit. However, DeprecatedString::empty() is now
considered equal to a null StringView.
2023-10-13 18:33:21 +03:30
Linus Groh
57dc179b1f Everywhere: Rename to_{string => deprecated_string}() where applicable
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.

One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
2022-12-06 08:54:33 +01:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
Idan Horowitz
086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Idan Horowitz
b22cb40565 AK: Exclude GenericLexer String APIs from the Kernel
These APIs are only used by userland, and String is OOM-infallible,
so let's just ifdef it out of the Kernel.
2022-02-16 22:21:37 +01:00
Idan Horowitz
67ce9e28a5 AK: Standardize the behaviour of GenericLexer::consume_until overloads
Before this commit all consume_until overloads aside from the Predicate
one would consume (and ignore) the stop char/string, while the
Predicate overload would not, in order to keep behaviour consistent,
the other overloads no longer consume the stop char/string as well.
2022-01-25 13:41:09 +03:30
Idan Horowitz
d49d2c7ec4 AK: Add a consume_until(StringView) overload to GenericLexer
This allows us to skip a strlen call.
2022-01-25 13:41:09 +03:30
Timothy Flynn
fd8ccedf2b AK: Add GenericLexer API to consume an escaped Unicode code point
This parsing is already duplicated between LibJS and LibRegex, and will
shortly be needed in more places in those libraries. Move it to AK to
prevent further duplication.

This API will consume escaped Unicode code points of the form:
    \\u{code point}
    \\unnnn (where each n is a hexadecimal digit)
    \\unnnn\\unnnn (where the two escaped values are a surrogate pair)
2021-08-19 23:49:25 +02:00
Lenny Maiorani
254e010c75 AK/GenericLexer: constexpr where possible
Problem:
- Much of the `GenericLexer` can be `constexpr`, but is not.

Solution:
- Make it `constexpr` and de-duplicate code.
- Extend some of `StringView` with `constexpr` to support.
- Add tests to ensure `constexpr` behavior.

Note:
- Construction of `StringView` from pointer and length is not
  `constexpr`-compatible at the moment because the VERIFY cannot be,
  yet.
2021-04-22 20:27:21 +02:00
Brian Gianforcaro
1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

 ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00
Andreas Kling
5d180d1f99 Everywhere: Rename ASSERT => VERIFY
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)

Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.

We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
2021-02-23 20:56:54 +01:00
Linus Groh
1daa5158eb AK: Add GenericLexer::retreat()
This allows going back one character at a time, and then re-consume
previously consumed chars.
The code I need this for looks something like this:

    ASSERT(lexer.consume_specific('\\'));
    if (lexer.next_is("foo"))
        ...
    lexer.retreat();
    lexer.consume_escaped_character();  // This expects lexer.peek() == '\\'
2020-10-29 11:52:31 +01:00
AnotherTest
27040e65eb AK: Add `GenericLexer::consume_escaped_character()'
...and use it in `consume_and_unescape_string()'.
2020-10-22 23:49:51 +02:00
asynts
6351a56d27 AK+Format: Do some housekeeping in the format implementation. 2020-10-02 20:48:19 +02:00
Benoît Lormeau
f0f6b09acb AK: Remove the ctype adapters and use the actual ctype functions instead
This finally takes care of the kind-of excessive boilerplate code that were the
ctype adapters. On the other hand, I had to link `LibC/ctype.cpp` to the Kernel
(for `AK/JsonParser.cpp` and `AK/Format.cpp`). The previous commit actually makes
sense now: the `string.h` includes in `ctype.{h,cpp}` would require to link more LibC
stuff to the Kernel when it only needs the `_ctype_` array of `ctype.cpp`, and there
wasn't any string stuff used in ctype.
Instead of all this I could have put static derivatives of `is_any_of()` in the
concerned AK files, however that would have meant more boilerplate and workarounds;
so I went for the Kernel approach.
2020-09-27 21:15:25 +02:00
Benoit Lormeau
e4da2875c5 AK: Use templates instead of Function for Conditions in the GenericLexer
Since commit 1ec59f28ce turns the ctype macros
into functions we can now feed them directly to a GenericLexer! This will lead to
removing the ctype adapters that were kind-of excessive boilerplate, but needed as
the Kernel doesn't compile with the LibC.
2020-09-27 21:15:25 +02:00
Benoit Lormeau
8f34b493e4 AK: Enhance GenericLexer's string consumption
The `consume_quoted_string()` can now take an escape character. This allows it
(for example) to capture a string's enclosing quotes. The escape character is
optional by default.

You can also consume and unescape a quoted string with the eponymous method
`consume_and_unescape_string()`. It takes an escape character as parameter
(backslash by default). It builds a String in which common escape sequences
get... unescaped :^) (e.g. \n, \r, \t...).
2020-09-26 17:17:53 +02:00
Benoit Lormeau
1ab6dd67e9 AK: Alphabetically sort the ctype adapters 2020-09-26 17:17:53 +02:00
Ben Wiederhake
8940bc3503 Meta+AK: Make clang-format-10 clean 2020-09-25 21:18:17 +02:00
Nico Weber
064159d215 LibWeb: Use GenericLexer in WrapperGenerator 2020-08-21 16:01:48 +02:00
Benoît Lormeau
7b356c33cb
AK: Add a GenericLexer and extend the JsonParser with it (#2696) 2020-08-09 11:34:26 +02:00