The [UTF-8](https://datatracker.ietf.org/doc/html/rfc3629#page-5)
standard says to reject strings with upper or lower surrogates. However,
in many standards, ECMAScript included, unpaired surrogates (and
therefore UTF-8 surrogates) are allowed in strings. So, this commit
extends the UTF-8 validation API with `AllowSurrogates`, which will
reject upper and lower surrogate characters.
Format string checking was disabled in Clang-based builds due to a
compiler bug: https://github.com/llvm/llvm-project/issues/51182. Now
that the requirement has been raised to Clang 17, that is no longer
necessary.
This has been tested to work correctly with Apple Clang 15.0.0 (which is
the *least modern* supported compiler), as well as CLion 2024.1's
bundled Clangd.
Instead of being opt-out with NOESCAPE, it is now opt-in with ESCAPING.
Opt-out is ideal, but unfortunately this was extremely noisy when
compiling the entire codebase. Escaping functions are rarer than non-
escaping ones, so let's just go with that for now.
This also allows us to gradually add heuristics for detecting missing
ESCAPING annotations and emitting them as errors. It also nicely matches
the spelling that Swift uses (@escaping), which is where this idea
originally came from.
With this change, ".*make.*" function family now does error checking
earlier, which improves experience while using clangd. Note that the
change also make them instantiate classes a bit more eagerly, so in
LibVideo/PlaybackManager, we have to first define SeekingStateHandler
and only then make() it.
Co-Authored-By: stelar7 <dudedbz@gmail.com>
It wasn't safe to use addition_would_overflow(a, -b) to check if
subtraction (a - b) would overflow, since it doesn't cover this case.
I don't know why we didn't have subtraction_would_overflow(), so this
patch adds it. :^)
AK/Platform.h did not include any other header file, but expected
various macros to be defined. While many of the macros checked here are
predefined by the compiler (i.e. GCC's TARGET_OS_CPP_BUILTINS), some
may be defined by the system headers instead. In particular, so is
__GLIBC__ on glibc-based systems.
We have to include some system header for getting __GLIBC__ (or not).
It could be possible to include something relatively small and
innocuous, like <string.h> for example, but that would still clutter
the name space and make other code that would use <string.h>
functionality, but forget to include it, build on accident; we wouldn't
want that. At the end of the day, the header that actually defines
__GLIBC__ (or not) is <features.h>. It's typically included from other
glibc headers, and not by user code directly, which makes it unlikely
to mask other code accidentlly forgetting to include it, since it
wouldn't include it in the first place.
<features.h> is not defined by POSIX and could be missing on other
systems (but it seems to be present at least when using either glibc or
musl), so guard its inclusion with __has_include().
Specifically, this fixes AK/StackInfo.cpp not picking up the glibc code
path in the cross aarch64-gnu (GNU/Hurd on 64-bit ARM) Lagom build.
`ceil_div(-1, 2)` used to return -1.
Now it returns 0, which is the correct ceil(-0.5).
(C++'s division semantics have floor semantics for numbers > 0,
but ceil semantics for numbers < 0.)
This will be important for the JPEG2000 decoder eventually.
The following command was used to clang-format these files:
clang-format-18 -i $(find . \
-not \( -path "./\.*" -prune \) \
-not \( -path "./Base/*" -prune \) \
-not \( -path "./Build/*" -prune \) \
-not \( -path "./Toolchain/*" -prune \) \
-not \( -path "./Ports/*" -prune \) \
-type f -name "*.cpp" -o -name "*.mm" -o -name "*.h")
There are a couple of weird cases where clang-format now thinks that a
pointer access in an initializer list, e.g. `m_member(ptr->foo)`, is a
lambda return statement, and it puts spaces around the `->`.
This allows us to easily use an appropriate integer type when performing
float bitfield operations.
This change also adds a comment about the technically-incorrect 80-bit
extended float mantissa field.
gcc can't seem to figure out that the address of a member variable of
AK::Atomic<u32> in AtomicRefCounted cannot be null when fetch_sub-ing.
Add a bogus condition to convince the compiler that it can't be null.
These changes allow lines of arbitrary length to be read with
BufferedStream. When the user supplied buffer is smaller than
the line, it will be resized to fit the line. When the internal
buffer in BufferedStream is smaller than the line, it will be
read into the user supplied buffer chunk by chunk with the
buffer growing accordingly.
Other behaviors match the behavior of the existing read_line method.
currently crashes with an assertion failure in `String::repeated` if
malloc can't serve a `count * input_size` sized request, so add
`String::repeated_with_error` to propagate the error.
These changes are compatible with clang-format 16 and will be mandatory
when we eventually bump clang-format version. So, since there are no
real downsides, let's commit them now.
Sticking this to the function source has multiple benefits:
- We instrument more code, by not excluding entire files.
- NO_SANITIZE_COVERAGE can be used in Header files.
- Keeping the info with the source code, means if a function or
file is moved around, the NO_SANITIZE_COVERAGE moves with it.
GCC sometimes complains about the The `no_sanitize("address")` syntax,
and clang sometimes complains abouth the `no_sanitize_address` syntax.
Both claim to support both, so that's neat!
On macOS, it's not trivial to get a Mach task port for your children.
This implementation registers the chrome process as a well-known
service with launchd based on its pid, and lets each child process
send over a reference to its mach_task_self() back to the chrome.
We'll need this Mach task port right to get process statistics.
For example, consider the following code snippet:
Vector<Function<void()>> m_callbacks;
void add_callback(Function<void()> callback)
{
m_callbacks.append(move(callback));
}
// Somewhere else...
void do_something()
{
int a = 10;
add_callback([&a] {
dbgln("a is {}", a);
});
} // Oops, "a" is now destroyed, but the callback in m_callbacks
// has a reference to it!
We now statically detect the capture of "a" in the lambda above and flag
it as incorrect. Note that capturing the value implicitly with a capture
list of `[&]` would also be detected.
Of course, many functions that accept Function<...> don't store them
anywhere, instead immediately invoking them inside of the function. To
avoid a warning in this case, the parameter can be annotated with
NOESCAPE to indicate that capturing stack variables is fine:
void do_something_now(NOESCAPE Function<...> callback)
{
callback(...)
}
Lastly, there are situations where the callback does generally escape,
but where the caller knows that it won't escape long enough to cause any
issues. For example, consider this fake example from LibWeb:
void do_something()
{
bool is_done = false;
HTML::queue_global_task([&] {
do_some_work();
is_done = true;
});
HTML::main_thread_event_loop().spin_until([&] {
return is_done;
});
}
In this case, we know that the lambda passed to queue_global_task will
be executed before the function returns, and will not persist
afterwards. To avoid this warning, annotate the type of the capture
with IGNORE_USE_IN_ESCAPING_LAMBDA:
void do_something()
{
IGNORE_USE_IN_ESCAPING_LAMBDA bool is_done = false;
// ...
}
All string types currently have to invoke this function as:
stream.write_until_depleted("foo"sv.bytes());
This isn't very ergonomic, but more importantly, this overload will
allow String/ByteString instances to be written in this manner once
e.g. `ByteString::view() &&` is deleted.
Rather than making a copy of the held string, this returns a reference
so that expressions like the following:
do_something(json.as_string().view());
are not disallowed once `ByteString::view() &&` is deleted.