Because we still support u64 and i64 (on top of i32 and u32) we do still
have to parse the number ourself first. Then if we determine that the
number is a floating point or is outside of the range of i64 and u64 we
fallback and parse it as a double.
Before JsonParser had ifdefs guarding the double computation, but it
just build when we error on ifdef KERNEL so JsonParser is no longer
usable in the Kernel. This can be remedied fairly easily but since
it is not needed we #error on that for now.
These are guarded with #ifndef KERNEL, since doubles (and floats) are
not allowed in KERNEL mode.
In StringUtils there is convert_to_floating_point which does have a
template parameter incase you have a templated type.
Similar to decimal floating point parsing the current strtod hex float
parsing gives a lot of incorrect results. We can use a similar technique
as with decimal parsing however hex floats are much simpler as we don't
need to scale with a power of 5.
For hex floats we just provide the parse_first_hexfloat API as there is
currently no need for a parse_hexfloat_completely API.
Again the accepted input for parse_first_hexfloat is very lenient and
any validation should be done before calling this method.
This is based on the paper by Daniel Lemire called
"Number parsing at a Gigabyte per second", currently available at
https://arxiv.org/abs/2101.11408
An implementation can be found at
https://github.com/fastfloat/fast_float
To support both strtod like methods and String::to_double we have two
different APIs. The parse_first_floating_point gives back both the
result, next character to read and the error/out of range status.
Out of range here means we rounded to infinity 0.
The other API, parse_floating_point_completely, will return a floating
point only if the given character range contains just the floating point
and nothing else. This can be much faster as we can skip actually
computing the value if we notice we did not parse the whole range.
Both of these APIs support a very lenient format to be usable in as many
places as possible. Also it does not check for "named" values like
"nan", "inf", "NAN" etc. Because this can be different for every usage.
For integers and small values this new method is not faster and often
even a tiny bit slower than the current strtod implementation. However
the strtod implementation is wrong for a lot of values and has a much
less predictable running time.
For correctness this method was tested against known string -> double
datasets from https://github.com/nigeltao/parse-number-fxx-test-data
This method gives 100% accuracy.
The old strtod gave an incorrect value in over 50% of the numbers
tested.
By appending individual bytes as code points, we were "breaking apart"
multi-byte UTF-8 code points. This now behaves the same way as the
invert_case() helper in StringUtils.
I'm not sure there's a material improvement from this patch. However,
I've been reading the error handling code from multiple projects and
was excited to see Serenity being able to handle assignment
(`auto x = TRY(make_x())`) the same way as actions (`TRY(do_x())`).
I think it's worth documenting that this is only possible due to
non-standard extensions.
This lets us remove a glob pattern from LibC, the DynamicLoader, and,
later, Lagom. The Kernel already has its own separate list of AK files
that it wants, which is only a subset of all AK files.
This prevents an ICE with GCC trying to declare e.g. Variant<String&>.
Using a concept is a bit overkill here, but clang otherwise trips over
the friendship declaration to other Variant types:
template<typename... NewTs>
friend struct Variant;
Without using a concept, clang believes this is re-declaring the Variant
type with differing requirements ("error: requires clause differs in
template redeclaration").
Even though this almost certainly wouldn't run properly even if we had
a working kernel for AARCH64 this at least lets us build all the
userland binaries.
WebDriver aims to implement the WebDriver specification found at
https://w3c.github.io/webdriver/webdriver-spec.html . It's an HTTP
server that can create Browser sessions and control them.
Co-authored-by: Florent Castelli <florent.castelli@gmail.com>
If the entire string you want to right-trim consists of characters you
want to remove, we previously would incorrectly leave the first
character there.
For example: `trim("aaaaa", "a")` would return "a" instead of "".
We can't use `i >= 0` in the loop since that would fail to detect
underflow, so instead we keep `i` in the range `size .. 1` and then
subtract 1 from it when reading the character.
Added some trim() tests while I was at it. (And to confirm that this was
the issue.)
Instead of doing anything reasonable, Utf8CodePointIterator returned
invalid code points, for example U+123456. However, many callers of this
iterator assume that a code point is always at most 0x10FFFF.
In fact, this is one of two reasons for the following OSS Fuzz issue:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=49184
This is probably a very old bug.
In the particular case of URLParser, AK::is_url_code_point got confused:
return /* ... */ || code_point >= 0xA0;
If code_point is a "code point" beyond 0x10FFFF, this violates the
condition given in the preceding comment, but satisfies the given
condition, which eventually causes URLParser to crash.
This commit fixes *only* the erroneous UTF-8 decoding, and does not
fully resolve OSS-Fuzz#49184.
In particular, StringView::contains(char) is often used with a u32
code point. When this is done, the compiler will for some reason allow
data corruption to occur silently.
In fact, this is one of two reasons for the following OSS Fuzz issue:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=49184
This is probably a very old bug.
In the particular case of URLParser, AK::is_url_code_point got confused:
return /* ... */ || "!$&'()*+,-./:;=?@_~"sv.contains(code_point);
If code_point is a large code point that happens to have the correct
lower bytes, AK::is_url_code_point is then convinced that the given
code point is okay, even if it is actually problematic.
This commit fixes *only* the silent data corruption due to the erroneous
conversion, and does not fully resolve OSS-Fuzz#49184.
GCC seems to get tripped up over this inheritance when converting from
an ErrorOr<StringView> to the partially specialized ErrorOr<void>. See
the following snippet:
NEVER_INLINE ErrorOr<StringView> foo()
{
auto string = "abc"sv;
outln("{:p}", string.characters_without_null_termination());
return string;
}
NEVER_INLINE ErrorOr<void> bar()
{
auto string = TRY(foo());
outln("{:p}", string.characters_without_null_termination());
VERIFY(!string.starts_with('#'));
return {};
}
int main()
{
MUST(bar());
}
On some machines, bar() will contain a StringView whose pointer has had
its upper bits set to 0:
0x000000010cafd6f8
0x000000000cafd6f8
I'm not 100% clear on what's happening in the default-generated Variant
destructor that causes this. Probably worth investigating further.
The error would also be alleviated by making the Variant destructor
virtual, but rather than that, let's make ErrorOr simply contain a
Variant rather than inherit from it.
Fixes#15449.
Until now, VERIFY() failures would just cause a __builtin_trap() in
release builds, which made them a bit too harsh. This commit adds an
out-of-line helper function that prints the error before trapping.
Doesn't use them in libc headers so that those don't have to pull in
AK/Platform.h.
AK_COMPILER_GCC is set _only_ for gcc, not for clang too. (__GNUC__ is
defined in clang builds as well.) Using AK_COMPILER_GCC simplifies
things some.
AK_COMPILER_CLANG isn't as much of a win, other than that it's
consistent with AK_COMPILER_GCC.
Clang patch D116203 added various builtin functions for type traits,
`__decay` being one of them. This name conflicts with our
`AK::Detail::__decay`, leading to compiler warnings with Clang 16.
This is the initial port of Lagom to win32. This will enable developers
to use Lagom as an alternative to vanilla STL/StandardC++Library - which
gives a much richer environment (think QtCore - but modern).
My main incentive - is to have a native Windows Ladybird working.
I am starting with AK, which does not yet fully compile (on mingw). When
AK is compiling (currently fails building StringBuffer.cpp) - I will
continue to LibCore and then the rest of the user space libraries
(excluding the GUI, which will be another different effort).
Most of the code is happily stollen from Andrew Kaster's fork - he
deserves the credit.
Co-authored-by: Andrew Kaster <akaster@serenityos.org>
URL had properly named replacements for protocol(), set_protocol() and
create_with_file_protocol() already. This patch removes these function
and updates all call sites to use the functions named according to the
specification.
See https://url.spec.whatwg.org/#concept-url-scheme
This code generator no longer creates JS wrappers for platform objects
in the old sense, instead they're JS objects internally themselves.
Most of what we generate now are prototypes - which can be seen as
bindings for the internal C++ methods implementing getters, setters, and
methods - as well as object constructors, i.e. bindings for the internal
create_with_global_object() method.
Also tweak the naming of various CMake glue code existing around this.
JS::Value stores 48 bit pointers to separately allocated objects in its
payload. On x86-64, canonical addresses have their top 16 bits set to
the same value as bit 47, effectively meaning that the value has to be
sign-extended to get the pointer. AArch64, however, expects the topmost
bits to be all zeros.
This commit gates sign extension behind `#if ARCH(X86_64)`, and adds an
`#error` for unsupported architectures, so that we do not forget to
think about pointer handling when porting to a new architecture.
Fixes#15290FixesSerenityOS/ladybird#56
We were dropping the base URL path components in the resulting URL due
to mistakenly determining the input URL to start with a Windows drive
letter. Fix this, add a spec link, and a test.
A StringView is sufficient here. This also removes the declaration of
fuzzy_match_recursive from the header, as it's only needed from within
the implementation file.
LLVM 15 switched around what it's basing its `nullptr_t` definitions on,
it's now defining `std::nullptr_t` using `::nullptr_t` instead of the
other way around.
Work around any errors that result from that by just defining it both in
the global namespace as well as in `std` ourselves.
I was very confused why I was getting "no key named `foo`" errors, so
hopefully this will save someone that confusion in the future. :^)
(It'll probably be me again...)
This was present in Vector already. Clang-format fixed some const
positions automatically too.
Also removed a now-ambiguous and unnecessary constructor from Shell.