Commit graph

13 commits

Author SHA1 Message Date
Shannon Booth
8723f72f0f LibURL: Remove unspecified steps in URL file slash parsing state
There were some extra steps in there which produced wrong results for
relative file URLs.

Fixes 7 test cases in: https://wpt.live/url/url-constructor.any.html

We also need to adjust the test results in TestURL. The behaviour tested
does not match how URL is specified to work as an absolute relative is
given.
2024-08-06 07:58:07 +01:00
Shannon Booth
d6af5bf5eb LibURL: Allow inputs containing only whitespace
The check for:

```
    if (start_index >= end_index)
        return {};
```

To prevent an out of bounds when trimming the start and end of the input
of whitespace was preventing valid URLs (only having whitespace in the
input) from being parsed.

Instead, prevent start_index from ever getting above end_index in the
first place, and don't treat empty inputs as an error.

Fixes one WPT test on:

https://wpt.live/url/url-constructor.any.html
2024-08-05 17:21:26 +01:00
Shannon Booth
4f5af3e90e LibURL: Remove FIXME for stripping c0 control or space
The FIXME was not correct - this was done correctly. But let's use a
helper to make the implementation slightly more readable.
2024-08-05 17:21:26 +01:00
Shannon Booth
670ce3ebb1 LibURL: Remove not particuarly useful NOTE
This doesn't seem like something we will neccessarily do in the URL
class anyway due to performance reasons - unless strictly needed (like
for the DOMURL implementation).
2024-08-05 17:21:26 +01:00
Shannon Booth
41cf9f6fe3 LibURL: Also remove carriage returns from URL input
The definition of an "ASCII tab or newline" also includes U+000D CR.

This fixes 3 subtests in:

https://wpt.live/url/url-constructor.any.html
2024-08-05 11:27:05 +02:00
Shannon Booth
fdf4f1e887 LibURL: Validate for invalid _domain_ code points for non-opaque domains
We were previously not checking for C0 control, U+0025 (%), or U+007F
DELETE.

This makes another good set of URL tests in WPT pass :^)
2024-08-04 18:29:06 +01:00
Shannon Booth
a661daea71 LibURL: Don't consider file:// URL hosts as always opaque
Which was resulting in file URL hosts not being correctly percent
decoded.
2024-08-04 10:37:33 +02:00
Shannon Booth
dd7c720657 LibURL: Remove note about bare-boned URL host parsing
It's not so bare-boned any longer :^)
2024-08-04 10:37:33 +02:00
Andreas Kling
936b76f36e LibURL: Make URL a copy-on-write type
This patch moves the data members of URL to an internal URL::Data struct
that is also reference-counted. URL then uses a CopyOnWrite<T> template
to give itself copy-on-write behavior.

This means that URL itself is now 8 bytes per instance, and copying is
cheap as long as you don't mutate.

This shrinks many data structures over in LibWeb land. As an example,
CSS::ComputedValues goes from 3024 bytes to 2288 bytes per instance.
2024-08-02 20:37:40 +02:00
Tim Ledbetter
1a4b042664 LibURL: Convert ASCII only URLs to lowercase during parsing
This fixes an issue where entering EXAMPLE.COM into the URL bar in the
browser would fail to load as expected.
2024-06-10 20:34:57 -04:00
Andreas Kling
d568b15acf LibURL: Avoid expensive IDNA::to_ascii() for all-ASCII domain strings
20% of CPU usage when loading https://utah.edu/ was spent doing these
ASCII conversions in URL parsing.
2024-04-05 16:01:10 -06:00
Timothy Flynn
576c2f4f4d LibURL+LibUnicode+LibWebView: Handle punycode directly in LibURL
We had defined punycode handling in LibUnicode when LibURL (AK at the
time) was unable to depend on LibUnicode. This is no longer the case.
2024-03-26 12:25:21 -04:00
Shannon Booth
e800605ad3 AK+LibURL: Move AK::URL into a new URL library
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.

This change has two main benefits:
 * Moving AK back more towards being an agnostic library that can
   be used between the kernel and userspace. URL has never really fit
   that description - and is not used in the kernel.
 * URL _should_ depend on LibUnicode, as it needs punnycode support.
   However, it's not really possible to do this inside of AK as it can't
   depend on any external library. This change brings us a little closer
   to being able to do that, but unfortunately we aren't there quite
   yet, as the code generators depend on LibCore.
2024-03-18 14:06:28 -04:00
Renamed from AK/URLParser.cpp (Browse further)