By appending individual bytes as code points, we were "breaking apart"
multi-byte UTF-8 code points. This now behaves the same way as the
invert_case() helper in StringUtils.
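The byte-wise approach can be sketched in standard C++ (not AK). `invert_ascii_case` is a hypothetical name, and the sketch assumes that only ASCII letters change case while every other byte is copied through untouched:

```cpp
#include <string>
#include <string_view>

// Hypothetical helper, not the AK implementation: flips the case of ASCII
// letters and copies every other byte through unchanged, so the bytes of a
// multi-byte UTF-8 sequence are never re-encoded as separate code points.
std::string invert_ascii_case(std::string_view input)
{
    std::string result;
    result.reserve(input.size());
    for (char ch : input) {
        if (ch >= 'a' && ch <= 'z')
            result.push_back(static_cast<char>(ch - 'a' + 'A'));
        else if (ch >= 'A' && ch <= 'Z')
            result.push_back(static_cast<char>(ch - 'A' + 'a'));
        else
            result.push_back(ch);
    }
    return result;
}
```

With this shape, a two-byte sequence such as 0xC3 0xA9 comes out as the same two bytes instead of being expanded into two separately encoded code points.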
I'm not sure there's a material improvement from this patch. However,
I've been reading the error handling code from multiple projects and
was excited to see Serenity being able to handle assignment
(`auto x = TRY(make_x())`) the same way as actions (`TRY(do_x())`).
I think it's worth documenting that this is only possible due to
non-standard extensions.
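A minimal sketch of the mechanism, assuming GCC or Clang: the macro below relies on GNU statement expressions, which are not standard C++. `Result`, `parse_digit` and `sum_digits` are hypothetical stand-ins, not Serenity's ErrorOr or any real caller:

```cpp
#include <optional>
#include <string>
#include <utility>

// Simplified stand-in for an ErrorOr-like type; not Serenity's ErrorOr.
template<typename T>
struct Result {
    std::optional<T> value;
    std::string error;

    bool is_error() const { return !value.has_value(); }
    T release_value() { return std::move(*value); }
};

// "({ ... })" is a GNU statement expression: the block may return early from
// the enclosing function, and otherwise evaluates to its last expression.
#define TRY(expression)                                       \
    ({                                                        \
        auto _temporary_result = (expression);                \
        if (_temporary_result.is_error())                     \
            return { std::nullopt, _temporary_result.error }; \
        _temporary_result.release_value();                    \
    })

Result<int> parse_digit(char ch)
{
    if (ch < '0' || ch > '9')
        return { std::nullopt, "not a digit" };
    return { ch - '0', {} };
}

Result<int> sum_digits(char a, char b)
{
    auto x = TRY(parse_digit(a)); // assignment form
    auto y = TRY(parse_digit(b));
    return { x + y, {} };
}
```

Without statement expressions, the early return and the resulting value could not both come out of a single expression-like macro, which is why the assignment form depends on the extension.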
This lets us remove a glob pattern from LibC, the DynamicLoader, and,
later, Lagom. The Kernel already has its own separate list of AK files
that it wants, which is only a subset of all AK files.
This prevents an ICE with GCC trying to declare e.g. Variant<String&>.
Using a concept is a bit overkill here, but clang otherwise trips over
the friendship declaration to other Variant types:
    template<typename... NewTs>
    friend struct Variant;
Without using a concept, clang believes this is re-declaring the Variant
type with differing requirements ("error: requires clause differs in
template redeclaration").
Even though this almost certainly wouldn't run properly even if we had
a working kernel for AARCH64, this at least lets us build all the
userland binaries.
WebDriver aims to implement the WebDriver specification found at
https://w3c.github.io/webdriver/webdriver-spec.html . It's an HTTP
server that can create Browser sessions and control them.
Co-authored-by: Florent Castelli <florent.castelli@gmail.com>
If the entire string you want to right-trim consists of characters you
want to remove, we previously would incorrectly leave the first
character there.
For example: `trim("aaaaa", "a")` would return "a" instead of "".
We can't use `i >= 0` in the loop since that would fail to detect
underflow, so instead we keep `i` in the range `size .. 1` and then
subtract 1 from it when reading the character.
Added some trim() tests while I was at it. (And to confirm that this was
the issue.)
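The loop shape described above can be sketched with std::string_view. `trim_right` is a hypothetical stand-in for the right-trimming half of trim(), not the AK code:

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for the right-trim part of trim(); not the AK code.
std::string_view trim_right(std::string_view string, std::string_view characters)
{
    std::size_t new_length = string.size();
    // i runs from size down to 1, so the unsigned index never has to go
    // below zero; the character inspected is the one at index i - 1.
    for (std::size_t i = string.size(); i > 0; --i) {
        if (characters.find(string[i - 1]) == std::string_view::npos)
            break;
        new_length = i - 1;
    }
    return string.substr(0, new_length);
}
```

Because the loop also visits i == 1, the first character is examined like every other one, so a string made up entirely of trimmable characters becomes empty.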
Instead of doing anything reasonable, Utf8CodePointIterator returned
invalid code points, for example U+123456. However, many callers of this
iterator assume that a code point is always at most 0x10FFFF.
In fact, this is one of two reasons for the following OSS Fuzz issue:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=49184
This is probably a very old bug.
In the particular case of URLParser, AK::is_url_code_point got confused:
    return /* ... */ || code_point >= 0xA0;
If code_point is a "code point" beyond 0x10FFFF, this violates the
condition given in the preceding comment, but satisfies the given
condition, which eventually causes URLParser to crash.
This commit fixes *only* the erroneous UTF-8 decoding, and does not
fully resolve OSS-Fuzz#49184.
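As a rough illustration of the range check, here is a simplified decoder in standard C++. `decode_first_code_point` is a hypothetical function, not Utf8CodePointIterator, and for brevity it does not reject overlong encodings or surrogates:

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical, simplified decoder: returns the first code point of `bytes`,
// or std::nullopt if the sequence is malformed or decodes to a value above
// U+10FFFF. Overlong encodings and surrogates are NOT checked here.
std::optional<std::uint32_t> decode_first_code_point(std::string_view bytes)
{
    if (bytes.empty())
        return std::nullopt;

    auto first = static_cast<unsigned char>(bytes[0]);
    std::size_t length = 0;
    std::uint32_t code_point = 0;
    if (first < 0x80) {
        length = 1;
        code_point = first;
    } else if ((first & 0xE0) == 0xC0) {
        length = 2;
        code_point = first & 0x1F;
    } else if ((first & 0xF0) == 0xE0) {
        length = 3;
        code_point = first & 0x0F;
    } else if ((first & 0xF8) == 0xF0) {
        length = 4;
        code_point = first & 0x07;
    } else {
        return std::nullopt;
    }

    if (bytes.size() < length)
        return std::nullopt;
    for (std::size_t i = 1; i < length; ++i) {
        auto byte = static_cast<unsigned char>(bytes[i]);
        if ((byte & 0xC0) != 0x80)
            return std::nullopt;
        code_point = (code_point << 6) | (byte & 0x3F);
    }

    // A four-byte sequence can encode up to 0x1FFFFF, so the upper bound
    // has to be checked explicitly.
    if (code_point > 0x10FFFF)
        return std::nullopt;
    return code_point;
}
```

The bytes F4 A3 91 96 are a well-formed four-byte pattern that would decode to U+123456, which is exactly the kind of value the final check rejects.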
In particular, StringView::contains(char) is often used with a u32
code point. When this is done, the compiler implicitly narrows the u32
to a char, so data corruption occurs silently.
In fact, this is one of two reasons for the following OSS Fuzz issue:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=49184
This is probably a very old bug.
In the particular case of URLParser, AK::is_url_code_point got confused:
    return /* ... */ || "!$&'()*+,-./:;=?@_~"sv.contains(code_point);
If code_point is a large code point that happens to have the correct
lower bytes, AK::is_url_code_point is then convinced that the given
code point is okay, even if it is actually problematic.
This commit fixes *only* the silent data corruption due to the erroneous
conversion, and does not fully resolve OSS-Fuzz#49184.
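The narrowing can be reproduced with std::string_view. Both functions below are hypothetical; the first spells out with a cast the conversion that otherwise happens implicitly, and the second is one possible guard, not necessarily the fix this commit applies:

```cpp
#include <cstdint>
#include <string_view>

// Mirrors the problematic pattern: the u32 is narrowed to char, so only its
// low byte takes part in the search.
bool contains_narrowing(std::string_view haystack, std::uint32_t code_point)
{
    return haystack.find(static_cast<char>(code_point)) != std::string_view::npos;
}

// Hypothetical safer variant: a code point that does not fit in a single
// ASCII byte can never match a byte of the haystack.
bool contains_code_point(std::string_view haystack, std::uint32_t code_point)
{
    if (code_point > 0x7F)
        return false;
    return haystack.find(static_cast<char>(code_point)) != std::string_view::npos;
}
```

For example, U+10021 has the low byte 0x21 ('!'), so the narrowing version reports a match against a string containing '!' while the guarded version does not.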
GCC seems to get tripped up over this inheritance when converting from
an ErrorOr<StringView> to the partially specialized ErrorOr<void>. See
the following snippet:
    NEVER_INLINE ErrorOr<StringView> foo()
    {
        auto string = "abc"sv;
        outln("{:p}", string.characters_without_null_termination());
        return string;
    }

    NEVER_INLINE ErrorOr<void> bar()
    {
        auto string = TRY(foo());
        outln("{:p}", string.characters_without_null_termination());
        VERIFY(!string.starts_with('#'));
        return {};
    }

    int main()
    {
        MUST(bar());
    }
On some machines, bar() will contain a StringView whose pointer has had
its upper bits set to 0:
    0x000000010cafd6f8
    0x000000000cafd6f8
I'm not 100% clear on what's happening in the default-generated Variant
destructor that causes this. Probably worth investigating further.
The error would also be alleviated by making the Variant destructor
virtual, but rather than that, let's make ErrorOr simply contain a
Variant rather than inherit from it.
Fixes #15449.
Until now, VERIFY() failures would just cause a __builtin_trap() in
release builds, which made them a bit too harsh. This commit adds an
out-of-line helper function that prints the error before trapping.
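A sketch of the pattern, assuming GCC or Clang for `__builtin_trap()` and the `noinline` attribute. All names are hypothetical and this is not the actual AK assertion code:

```cpp
#include <cstdio>

// Hypothetical names throughout; this is not the AK assertion code.
// The helper is kept out of line so that each EXAMPLE_VERIFY() site only
// pays for a compare and a call.
[[noreturn]] __attribute__((noinline)) void example_verification_failed(char const* message)
{
    std::fprintf(stderr, "VERIFICATION FAILED: %s\n", message);
    __builtin_trap(); // GCC/Clang builtin; does not return
}

#define EXAMPLE_STRINGIFY_HELPER(x) #x
#define EXAMPLE_STRINGIFY(x) EXAMPLE_STRINGIFY_HELPER(x)

#define EXAMPLE_VERIFY(expression)                                            \
    do {                                                                      \
        if (!(expression))                                                    \
            example_verification_failed(                                      \
                #expression " at " __FILE__ ":" EXAMPLE_STRINGIFY(__LINE__)); \
    } while (0)

int checked_divide(int dividend, int divisor)
{
    EXAMPLE_VERIFY(divisor != 0);
    return dividend / divisor;
}
```

Calling `checked_divide(1, 0)` would print the failed expression with its location and then trap, rather than trapping with no output.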
Doesn't use them in libc headers so that those don't have to pull in
AK/Platform.h.
AK_COMPILER_GCC is set _only_ for gcc, not for clang too. (__GNUC__ is
defined in clang builds as well.) Using AK_COMPILER_GCC simplifies
things some.
AK_COMPILER_CLANG isn't as much of a win, other than that it's
consistent with AK_COMPILER_GCC.
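The distinction can be sketched as follows. The `EXAMPLE_` macros are hypothetical local names, and the exact definitions in AK/Platform.h may differ:

```cpp
// Clang also defines __GNUC__, so it has to be checked first (or excluded
// explicitly) for a macro that should mean "GCC only".
#if defined(__clang__)
#    define EXAMPLE_COMPILER_CLANG
#elif defined(__GNUC__)
#    define EXAMPLE_COMPILER_GCC
#endif

// Hypothetical helper that reports which branch was taken.
constexpr char const* example_compiler_name()
{
#if defined(EXAMPLE_COMPILER_CLANG)
    return "clang";
#elif defined(EXAMPLE_COMPILER_GCC)
    return "gcc";
#else
    return "unknown";
#endif
}

#if defined(EXAMPLE_COMPILER_CLANG) && defined(EXAMPLE_COMPILER_GCC)
#    error "At most one of the two macros may be defined"
#endif
```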
Clang patch D116203 added various builtin functions for type traits,
`__decay` being one of them. This name conflicts with our
`AK::Detail::__decay`, leading to compiler warnings with Clang 16.
This is the initial port of Lagom to win32. This will enable developers
to use Lagom as an alternative to the vanilla STL/standard C++ library;
Lagom gives a much richer environment (think QtCore, but modern).
My main incentive is to have a native Windows Ladybird working.
I am starting with AK, which does not yet fully compile (on mingw). Once
AK compiles (it currently fails building StringBuffer.cpp), I will
continue to LibCore and then the rest of the userspace libraries
(excluding the GUI, which will be a separate effort).
Most of the code is happily stolen from Andrew Kaster's fork; he
deserves the credit.
Co-authored-by: Andrew Kaster <akaster@serenityos.org>
URL already had properly named replacements for protocol(),
set_protocol() and create_with_file_protocol(). This patch removes these
functions and updates all call sites to use the functions named
according to the specification.
See https://url.spec.whatwg.org/#concept-url-scheme
This code generator no longer creates JS wrappers for platform objects
in the old sense; instead, they're JS objects internally themselves.
Most of what we generate now are prototypes - which can be seen as
bindings for the internal C++ methods implementing getters, setters, and
methods - as well as object constructors, i.e. bindings for the internal
create_with_global_object() method.
Also tweak the naming of various CMake glue code existing around this.
JS::Value stores 48 bit pointers to separately allocated objects in its
payload. On x86-64, canonical addresses have their top 16 bits set to
the same value as bit 47, effectively meaning that the value has to be
sign-extended to get the pointer. AArch64, however, expects the topmost
bits to be all zeros.
This commit gates sign extension behind `#if ARCH(X86_64)`, and adds an
`#error` for unsupported architectures, so that we do not forget to
think about pointer handling when porting to a new architecture.
Fixes #15290
Fixes SerenityOS/ladybird#56
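A minimal sketch of the two decoding strategies, using compiler-predefined macros in place of Serenity's `ARCH()` macro. The function names are hypothetical and this is not the actual JS::Value code:

```cpp
#include <cstdint>

// Hypothetical helpers; not the actual JS::Value code.

// x86-64 style: copy bit 47 into bits 48..63. The shift pair relies on
// arithmetic right shift of signed values, which GCC and Clang provide.
inline std::uint64_t sign_extend_48(std::uint64_t payload)
{
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(payload << 16) >> 16);
}

// AArch64 style, as described above: keep the low 48 bits and leave the
// top 16 bits zero.
inline std::uint64_t zero_extend_48(std::uint64_t payload)
{
    return payload & 0x0000FFFFFFFFFFFFULL;
}

inline std::uint64_t payload_to_address(std::uint64_t payload)
{
#if defined(__x86_64__) || defined(_M_X64)
    return sign_extend_48(payload);
#elif defined(__aarch64__) || defined(_M_ARM64)
    return zero_extend_48(payload);
#else
#    error "Decide how pointers are encoded before porting to this architecture"
#endif
}
```

The `#error` branch makes a port to a new architecture fail at compile time until someone has decided which of the two behaviours applies.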
We were dropping the base URL path components in the resulting URL due
to mistakenly determining the input URL to start with a Windows drive
letter. Fix this, add a spec link, and a test.
A StringView is sufficient here. This also removes the declaration of
fuzzy_match_recursive from the header, as it's only needed from within
the implementation file.
LLVM 15 switched around what it bases its `nullptr_t` definitions on:
it now defines `std::nullptr_t` using `::nullptr_t` instead of the
other way around.
Work around any errors that result from that by just defining it both in
the global namespace as well as in `std` ourselves.
I was very confused why I was getting "no key named `foo`" errors, so
hopefully this will save someone that confusion in the future. :^)
(It'll probably be me again...)
This was present in Vector already. Clang-format fixed some const
positions automatically too.
Also removed a now-ambiguous and unnecessary constructor from Shell.
This is a set of functions that allow you to convert between arbitrary
IEEE 754 floating point types, as long as they can be represented
within 64 bits. Conversion methods between floats and doubles are
provided, as well as a generic `float_to_float()`.
Example usage:
    #include <AK/FloatingPoint.h>
    double val = 1.234;
    auto weird_f16 =
        convert_from_native_double<FloatingPointBits<0, 6, 10>>(val);
Signed and unsigned floats are supported, and both NaN and +/-Inf are
handled correctly. Values that do not fit in the target floating point
type are clamped.
Until now, our kernel has reimplemented a number of AK classes to
provide automatic internal locking:
- RefPtr
- NonnullRefPtr
- WeakPtr
- Weakable
This patch renames the Kernel classes so that they can coexist with
the original AK classes:
- RefPtr => LockRefPtr
- NonnullRefPtr => NonnullLockRefPtr
- WeakPtr => LockWeakPtr
- Weakable => LockWeakable
The goal here is to eventually get rid of the Lock* classes in favor of
using external locking.
Instead of having two separate implementations of AK::RefCounted, one
for userspace and one for kernelspace, there is now RefCounted and
AtomicRefCounted.
The commit that introduced BuiltinWrappers (548529a) accidentally used
`val` instead of `value` in the versions of the functions that are used
when neither `__GNUC__` nor `__clang__` is defined.
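A sketch of how such a typo can hide: the fallback branch is never compiled by GCC or Clang, so a wrong identifier there produces no error on those compilers. `example_popcount` is a hypothetical function, not the BuiltinWrappers code:

```cpp
// Hypothetical example function, not the BuiltinWrappers code.
constexpr int example_popcount(unsigned long long value)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_popcountll(value);
#else
    // Portable fallback. GCC and Clang never compile this branch, so a
    // misspelled parameter name here (e.g. `val`) would go unnoticed there.
    int count = 0;
    for (; value != 0; value >>= 1)
        count += static_cast<int>(value & 1);
    return count;
#endif
}
```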
That this did not already happen took me by surprise, as for
most other similar containers/types in AK (e.g. Span) the index
will be checked. This check not happening could easily let
off-by-one indexing errors slip through the cracks.
This can almost be identical to the Linux version, except that the
`pthread_attr_t` object is populated using a call to
`pthread_attr_get_np` instead of `pthread_getattr_np`.
FreeBSD also needs `pthread_attr_t` to be initialized using
`pthread_attr_init` instead of zero-initialization, but it's the
technically correct thing to do on Linux as well.
This constructor relied on running strlen implicitly on its argument,
thereby potentially causing out-of-bound reads (some of which were
caught a few days ago). The removal of this constructor ensures that the
caller must explicitly pass the size of the string by either:
1) Using operator""sv on literal strings; or
2) Calling strlen explicitly, making it clear that the size of the view
is being calculated at runtime.
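The two options can be illustrated with std::string_view as a stand-in for AK's StringView. The names below are hypothetical:

```cpp
#include <cstring>
#include <string_view>

using namespace std::string_view_literals;

// Option 1: the literal operator receives the length from the compiler, so
// no strlen runs and embedded NUL bytes are preserved.
constexpr std::string_view literal_view = "abc\0def"sv;

// Option 2: the caller computes the length explicitly, which makes the
// runtime scan for the terminator visible at the call site.
std::string_view view_from_c_string(char const* c_string)
{
    return std::string_view { c_string, std::strlen(c_string) };
}
```

Here `literal_view` has length 7, while a strlen-based view of the same literal stops at the first NUL and has length 3.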
During the removal of StringView(char const*), all users of these
functions were removed, and they are of dubious value (relying on
implicit StringView conversion).
This prevents us from needing a sv suffix, and potentially reduces the
need to run generic code for a single character (as contains,
starts_with, ends_with etc. for a char will be just a length and
equality check).
No functional changes.
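A sketch of why a char overload is cheap, again using std::string_view as a stand-in. The function names are hypothetical:

```cpp
#include <string_view>

// For a single character, no generic substring comparison is needed: a
// length check plus one equality check is enough.
constexpr bool example_starts_with(std::string_view string, char ch)
{
    return !string.empty() && string.front() == ch;
}

constexpr bool example_ends_with(std::string_view string, char ch)
{
    return !string.empty() && string.back() == ch;
}
```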
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).
No functional changes.
Error::from_string_literal now takes direct char const*s, while
Error::from_string_view does what Error::from_string_literal used to do:
taking StringViews. This change will remove the need to insert `sv`
after error strings when returning string literal errors once
StringView(char const*) is removed.
No functional changes.
This commit moves the length calculations out to be directly on the
StringView users. This is an important step towards the goal of removing
StringView(char const*), as it moves the responsibility of calculating
the size of the string to the user of the StringView (which will prevent
naive uses causing OOB access).
This makes the assumption that we never pass a stack-allocated char
array to CheckedFormatString arguments (dbgln, outln, warnln). This
assumption seems to hold true for the current state of Serenity code, at
least. :^)
Previously we would treat the empty string as `null`. This caused
JavaScript like this to fail:
```js
var object = {};
try {
    object = JSON.parse("");
} catch {}
var array = object.array || [];
```
Since `JSON.parse("")` returned null instead of throwing, it would set
`object` to null and then try and use it instead of using the default
backup value.
Is it another great upgrade to our PNG encoder like in 9aafaec259?
Well, not really - it's not a 2x or 55x improvement like you saw there,
but still it saves something:
- a screenshot of a blank Serenity desktop dropped from about 45 KiB
  to 40 KiB.
- re-encoding a NASA photo of the Earth [1] to PNG again saves about 25%
  (16.5 MiB -> 12.3 MiB), compared to not using filters.
[1]: https://commons.wikimedia.org/wiki/File:The_Blue_Marble_(remastered).jpg
This commit has no behavior changes.
In particular, this does not fix any of the wrong uses of the previous
default parameter (which used to be 'false', meaning "only replace the
first occurrence in the string"). It simply replaces the default uses by
String::replace(..., ReplaceMode::FirstOnly), leaving them incorrect.
Before this patch, ByteBuffer::get_bytes_for_writing() only ensured
capacity. The method needs to call resize to register the appended
data; otherwise it will be overwritten by the next data addition.