This allows us to pass the new String type to functions that take a
StringView directly, having to call bytes_as_string_view() every time
gets old quickly.
This makes more sense as an Object method rather than living within the
VM class for no good reason. Most of the other 7.3.xx AOs already work
the same way.
Also add spec comments while we're here.
DeprecatedString (formerly String) has been with us since the start,
and it has served us well. However, it has a number of shortcomings
that I'd like to address.
Some of these issues are hard if not impossible to solve incrementally
inside of DeprecatedString, so instead of doing that, let's build a new
String class and then incrementally move over to it instead.
Problems in DeprecatedString:
- It assumes string allocation never fails. This makes it impossible
to use in allocation-sensitive contexts, and is the reason we had to
ban DeprecatedString from the kernel entirely.
- The awkward null state. DeprecatedString can be null. It's different
from the empty state, although null strings are considered empty.
All code is immediately nicer when using Optional<DeprecatedString>
but DeprecatedString came before Optional, which is how we ended up
like this.
- The encoding of the underlying data is ambiguous. For the most part,
we use it as if it's always UTF-8, but there have been cases where
we pass around strings in other encodings (e.g ISO8859-1)
- operator[] and length() are used to iterate over DeprecatedString one
byte at a time. This is done all over the codebase, and will *not*
give the right results unless the string is all ASCII.
How we solve these issues in the new String:
- Functions that may allocate now return ErrorOr<String> so that ENOMEM
errors can be passed to the caller.
- String has no null state. Use Optional<String> when needed.
- String is always UTF-8. This is validated when constructing a String.
We may need to add a bypass for this in the future, for cases where
you have a known-good string, but for now: validate all the things!
- There is no operator[] or length(). You can get the underlying data
with bytes(), but for iterating over code points, you should be using
an UTF-8 iterator.
Furthermore, it has two nifty new features:
- String implements a small string optimization (SSO) for strings that
can fit entirely within a pointer. This means up to 3 bytes on 32-bit
platforms, and 7 bytes on 64-bit platforms. Such small strings will
not be heap-allocated.
- String can create substrings without making a deep copy of the
substring. Instead, the superstring gets +1 refcount from the
substring, and it acts like a view into the superstring. To make
substrings like this, use the substring_with_shared_superstring() API.
One caveat:
- String does not guarantee that the underlying data is null-terminated
like DeprecatedString does today. While this was nifty in a handful of
places where we were calling C functions, it did stand in the way of
shared-superstring substrings.
This is still not perfect, as we now actually crash in the
`try-finally-continue` tests, while we now succeed all
`try-catch-finally-*` tests.
Note that we do not yet go through the finally block when exiting the
unwind context through a break or continue.
We are already doing this in a good manner via the generated code,
doing so in the execution loop as well will cause us to pop contexts
multiple times, which is not very good.
Before we were doing so while exiting the catch-block, but not when
exiting the try-block.
This now centralizes the responsibility to exit the unwind context to
the finalizer, ignoring return/break/continue.
This makes it easier to handle the return case in a future commit.
Previously we allowed the end_offset to be larger than the chunk itself,
which made it so that certain input sizes would make the logic attempt
to delete a nonexistent object.
Fixes#16308.
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.
One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
Currently placing floating blocks in anonymous nodes makes
https://stackoverflow.com/ hang so let's leave it to try
to place only absolute blocks in anonymous nodes for now.
Also it breaks flex formatting when element with floating is
flex child.
The runtime environment of the WASM REPL does not have time zone
information; the file system is virtual (does not have /etc/localtime),
and the TZ environment variable is not set. This causes LibTimeZone to
always fall back to UTC.
Instead, we can get the time zone from the user's browser before we
enter this limited environment. The REPL website will pass the time zone
into the WASM REPL.
Without this, we were in a weird state where LibTimeZone believed it had
TZDB data, but that data wasn't actually available to it. This caused
functions like JS::get_named_time_zone_offset_nanoseconds() to trip an
assertion when entering "new Date();" into the REPL.
This doesn't have any immediate uses, but this adapts the code a bit
more to `Core::Stream` conventions (as most functions there use
NonnullOwnPtr to handle streams) and it makes it a bit clearer that this
pointer isn't actually supposed to be null. In fact, MP3LoaderPlugin
and FlacLoaderPlugin apparently forgot to check for that completely
before starting to decode data.
This now prepares all the needed (fallible) components before actually
constructing a LoaderPlugin object, so we are no longer filling them in
at an arbitrary later point in time.
`Bytes` is very slim, so the memory and/or performance gains from
passing it by reference isn't that big, and it passing it by value is
more compatible with xvalues, which is handy for things like
`::try_create(buffer.bytes())`.
This fixes a rendering issue where box-shadows would not appear or
render completely broken if the blur radius was larger than the
border radius (border-radius < 2 * blur-radius to be exact).
Here I try to address bug where content of table overflows
it's width (hacker news is an example of such site) by
reimplementing some parts of table formatting context.
Now TFC implements first steps of:
https://www.w3.org/TR/css-tables-3/#table-layout-algorithm
but column width and row height distribution steps are
still very incomplete.
This patch adds a way to ask the allocator to skip its internal
scrubbing memset operation. Before this change, kcalloc() would scrub
twice: once internally in kmalloc() and then again in kcalloc().
The same mechanism already existed in LibC malloc, and this patch
brings it over to the kernel heap allocator as well.
This solves one FIXME in kcalloc(). :^)
Changing the naming conventions one-by-one was tedious and error-prone.
A settings file is likely to be more forward compatible than a
screenshot. The settings file was made by repeating the manual steps
provided in the documentation, and exporting the file in CLion.
The two major changes noticeable on the SerenityOS codebase are:
- Much improved support for const placement, clang-format-14 ignored
our east-const configuration in various places
- Different formatting for requires clauses, now breaking them onto
their own line, which helps with readability a bit
Current versions of CLion also ship LLVM 15, so the built-in formatting
now matches CI formatting again :^)