Implements the corresponding encoders, selects the appropriate one when
encoding URL search params. If an encoder for the given encoding could
not be found, fallback to utf-8.
Parsing last as an IPV4 number was not returning true in "ends with a
number" as the parsing of that part was overflowing. This means that the
URL is not considered to be an IPv4 address, and is treated as a valid
domain.
Helpfully, the spec also points out in a note that this step is
equivalent to simply checking that the last part ends with 0x followed
by only hex digits - which doesn't suffer from any overflow problem!
Arguably this is an editorial issue in the spec where this should be
clarified a little bit. But for now, fixing this fixes 3 sub tests in
WPT for:
https://wpt.live/url/url-constructor.any.html
Our heuristic was a bit too simplistic and would not run through the
ToASCII unicode algorithm which performs some extra validation. This
would cause invalid URLs that should fail to be parsed be mistakenly
accepted.
This fixes 8 tests in: https://wpt.live/url/url-constructor.any.html
We can't simply use the base URL as it may need to be modified in some
form. For example - for the included test, the fragment was previously
being included in the resulting URL.
This fixes 1 test on https://wpt.live/url/url-constructor.any.html
There were some extra steps in there which produced wrong results for
relative file URLs.
Fixes 7 test cases in: https://wpt.live/url/url-constructor.any.html
We also need to adjust the test results in TestURL. The behaviour tested
does not match how URL is specified to work as an absolute relative is
given.
The check for:
```
if (start_index >= end_index)
return {};
```
To prevent an out of bounds when trimming the start and end of the input
of whitespace was preventing valid URLs (only having whitespace in the
input) from being parsed.
Instead, prevent start_index from ever getting above end_index in the
first place, and don't treat empty inputs as an error.
Fixes one WPT test on:
https://wpt.live/url/url-constructor.any.html
This doesn't seem like something we will neccessarily do in the URL
class anyway due to performance reasons - unless strictly needed (like
for the DOMURL implementation).
This patch moves the data members of URL to an internal URL::Data struct
that is also reference-counted. URL then uses a CopyOnWrite<T> template
to give itself copy-on-write behavior.
This means that URL itself is now 8 bytes per instance, and copying is
cheap as long as you don't mutate.
This shrinks many data structures over in LibWeb land. As an example,
CSS::ComputedValues goes from 3024 bytes to 2288 bytes per instance.
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.
This change has two main benefits:
* Moving AK back more towards being an agnostic library that can
be used between the kernel and userspace. URL has never really fit
that description - and is not used in the kernel.
* URL _should_ depend on LibUnicode, as it needs punnycode support.
However, it's not really possible to do this inside of AK as it can't
depend on any external library. This change brings us a little closer
to being able to do that, but unfortunately we aren't there quite
yet, as the code generators depend on LibCore.