94 commits

Author SHA1 Message Date
Shannon Booth
d200704b6d LibWeb: Add Last-Modified header to file and resource requests 2024-04-02 07:51:02 +02:00
Andreas Kling
3d78f86a2f LibWeb: Don't ask RequestServer to prefetch DNS for file: and data: URLs 2024-03-26 11:39:45 +01:00
Timothy Flynn
24ecf31ff5 LibURL+LibWeb: Move data URL processing to LibWeb's fetch infrastructure
This is a fetching AO and is only used by LibWeb in the context of fetch
tasks. Move it to LibWeb with other fetch methods.

The main reason for this is that it requires the use of other LibWeb AOs
such as the forgiving Base64 decoder and MIME sniffing. These AOs aren't
available within LibURL.
2024-03-25 08:13:27 +01:00
Shannon Booth
e800605ad3 AK+LibURL: Move AK::URL into a new URL library
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.

This change has two main benefits:
 * Moving AK back more towards being an agnostic library that can
   be used between the kernel and userspace. URL has never really fit
   that description - and is not used in the kernel.
 * URL _should_ depend on LibUnicode, as it needs punycode support.
   However, it's not really possible to do this inside of AK as it can't
   depend on any external library. This change brings us a little closer
   to being able to do that, but unfortunately we aren't there quite
   yet, as the code generators depend on LibCore.
2024-03-18 14:06:28 -04:00
Shannon Booth
9ce8189f21 Everywhere: Use unqualified AK::URL
This is possible in LibWeb now that there is no longer a Web::URL.
2024-02-25 08:54:31 +01:00
Andrew Kaster
e7daa02bf2 LibWeb: Unblock port 9000
This was blocked because it can be used for cross-protocol attacks on
some network printers. However, it's also used by the web platform
tests. One can argue that getting WPT working is more important than
theoretical attacks on poorly configured printers.
2024-02-12 11:43:22 -07:00
Bastiaan van der Plaat
cde14901bc Ladybird+LibWeb: Add initial about:version internal page 2024-01-13 13:41:09 -05:00
Bastiaan van der Plaat
05c0640474 Ladybird+LibWeb: Add about scheme support for internal pages 2024-01-13 13:41:09 -05:00
Andreas Kling
bf5ad56085 LibWeb: Ignore preconnect requests for file: and data: URLs
I noticed while debugging a fully downloaded page that it was trying
to preconnect to a file:// host. That doesn't make any sense, so let's
add a tiny bit of logic to ignore preconnect requests for file: and
data: URLs.
2023-12-30 13:49:50 +01:00
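
The scheme check described in this commit can be sketched in standard C++ as follows. This is a minimal illustration and not LibWeb's actual code: the function name `should_ignore_preconnect` and the string-based URL handling are assumptions made for the example.

```cpp
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical helper (not LibWeb's API): returns true when a preconnect
// request can be ignored because the URL's scheme never needs a network
// connection.
bool should_ignore_preconnect(std::string_view url)
{
    // The scheme is everything before the first ':'.
    auto colon = url.find(':');
    if (colon == std::string_view::npos)
        return false;

    // Schemes are compared case-insensitively, so lowercase a copy first.
    std::string scheme(url.substr(0, colon));
    for (auto& ch : scheme)
        ch = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));

    return scheme == "file" || scheme == "data";
}
```

With this sketch, `should_ignore_preconnect("file:///tmp/page.html")` is true, while an `https:` URL still goes through to the preconnect logic.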
Bastiaan van der Plaat
f8feca5d21 LibWeb: Use directory page when viewing a resource schemed directory URL 2023-12-27 10:54:07 -05:00
Timothy Flynn
e511a264fe LibWeb: Implement ad-hoc steps to allow LibWeb to load resource:// URLs
The resource:// scheme is used for Core::Resource files. Currently, any
users of resource:// URLs in Ladybird must manually create the Resource
and extract its data. This will allow for passing the resource:// URL
along for LibWeb to handle.
2023-12-24 14:09:23 +01:00
Ali Mohammad Pur
5e1499d104 Everywhere: Rename {Deprecated => Byte}String
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).

This commit is auto-generated:
  $ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
    Meta Ports Ladybird Tests Kernel)
  $ perl -pie 's/\bDeprecatedString\b/ByteString/g;
    s/deprecated_string/byte_string/g' $xs
  $ clang-format --style=file -i \
    $(git diff --name-only | grep \.cpp\|\.h)
  $ gn format $(git ls-files '*.gn' '*.gni')
2023-12-17 18:25:10 +03:30
Aliaksandr Kalenik
6ac43274b2 LibWeb+LibJS: Use JS::GCPtr for pointers to GC-allocated objects
Fixes warnings found by LibJSGCVerifier
2023-12-11 16:55:25 +01:00
Andrew Kaster
04670c7a06 LibWeb: Hide load started/completed debug messages behind SPAM_DEBUG 2023-12-08 20:04:13 -05:00
Shannon Booth
6aff55d655 LibWeb: Port NavigatorID from DeprecatedString to String 2023-11-20 15:00:19 +01:00
Andrew Kaster
ea95256f83 LibWeb: Remove unused ResourceLoader::load(URL) overload
With the refactoring of Workers, nobody is calling this LoadRequest-less
overload. All new code should eventually be moved to fetch anyway.
2023-11-15 12:56:33 +01:00
Andrew Kaster
d7d84ee931 LibWeb: Ensure a Web::Page is associated with local Worker LoadRequests
This is a hack on top of a hack because Workers don't *really* need to
have a Web::Page at all, but the ResourceLoader infra that should be
going away soon ™️ is not quite ready to axe that requirement for
cookies.
2023-11-15 12:56:33 +01:00
Aliaksandr Kalenik
b9e0ad4358 LibWeb: Make ResourceLoader pass body and headers in error callback
Pass body and headers of a failed request to callback so caller can
process them.
2023-10-03 09:41:56 +02:00
Bastiaan van der Plaat
8f2319e966 Ladybird+LibWeb: Rename FileDirectoryLoader to GeneratedPagesLoader 2023-09-24 19:59:00 -06:00
Bastiaan van der Plaat
eafdb06d87 LibWeb: Add directory entries page when visiting a local directory 2023-08-15 10:41:54 +01:00
auipc
1653c5ea41 LibWeb: Use current platform for navigator.platform
Before, navigator.platform would always report the platform as "Serenity
OS", regardless of whether or not that was true. It also did not include
the architecture, which Firefox and Chrome both do. Now, it can report
either "Linux x86_64" or "SerenityOS AArch64".
2023-08-13 05:13:18 +02:00
Karol Kosek
eb41f0144b AK: Decode data URLs to separate class (and parse like every other URL)
Parsing 'data:' URLs took its own route. It never set standard URL
fields like path, query or fragment (except for scheme) and instead
gave us separate methods called `data_payload()`, `data_mime_type()`,
and `data_payload_is_base64()`.

Because parsing 'data:' didn't use standard fields, running the
following JS code:

    new URL('#a', 'data:text/plain,hello').toString()

not only cleared the path, as URLParser doesn't check for data from the
data_payload() function (making the result 'data:#a'), but also
crashed the program, because we forbid an empty MIME type when we
serialize to string.

With this change, 'data:' URLs will be parsed like every other URL.
To decode the 'data:' URL contents, one needs to call process_data_url()
on a URL, which will return a struct containing MIME type with already
decoded data! :^)
2023-08-01 14:19:05 +02:00
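
As a rough illustration of treating the payload as a separate processing step, the sketch below splits a 'data:' URL string into its MIME type, a base64 flag, and the still-encoded payload. It is not the `process_data_url()` implementation: `DataURLParts` and `split_data_url` are hypothetical names, and the sketch deliberately skips fragment removal, percent-decoding, base64 decoding, MIME type validation, and case-insensitive matching.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type for this example only.
struct DataURLParts {
    std::string mime_type;   // Text before the first comma, without ";base64".
    bool is_base64 { false };
    std::string payload;     // Text after the first comma, still encoded.
};

// Simplified sketch: splits "data:<mime type>[;base64],<payload>".
std::optional<DataURLParts> split_data_url(std::string_view url)
{
    constexpr std::string_view prefix = "data:";
    if (url.substr(0, prefix.size()) != prefix)
        return std::nullopt;
    url.remove_prefix(prefix.size());

    // The first comma separates the MIME type from the payload.
    auto comma = url.find(',');
    if (comma == std::string_view::npos)
        return std::nullopt;

    std::string_view mime_type = url.substr(0, comma);
    DataURLParts parts;
    parts.payload = std::string(url.substr(comma + 1));

    // A trailing ";base64" marks the payload as base64-encoded.
    constexpr std::string_view marker = ";base64";
    if (mime_type.size() >= marker.size()
        && mime_type.substr(mime_type.size() - marker.size()) == marker) {
        parts.is_base64 = true;
        mime_type.remove_suffix(marker.size());
    }

    parts.mime_type = std::string(mime_type);
    return parts;
}
```

For `data:text/plain,hello` this yields the MIME type `text/plain`, a false base64 flag, and the payload `hello`; a real implementation would then decode the payload.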
Karol Kosek
f27b9b9563 LibWeb: Set Content-Type for data: URLs instead of checking MIME on load
This makes the loader more agnostic.

Additionally, this allows us to load a tab in Ladybird with a 'data:' URL
containing parameters, as a Resource will now call
`mime_type_from_content_type` to extract the content type from MIME. :^)
2023-08-01 14:19:05 +02:00
Daniel
d8c1150f6b LibWeb: Respect "no-store" directive in cache-control header 2023-06-22 21:24:23 +02:00
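
One way to detect the "no-store" directive is sketched below in standard C++. This is an assumption-laden example rather than the code from this commit: `has_no_store_directive` is a hypothetical name, and the sketch splits on commas without handling quoted directive values.

```cpp
#include <algorithm>
#include <cctype>
#include <string_view>

// Strips leading and trailing whitespace from a view.
static std::string_view trim_whitespace(std::string_view text)
{
    while (!text.empty() && std::isspace(static_cast<unsigned char>(text.front())))
        text.remove_prefix(1);
    while (!text.empty() && std::isspace(static_cast<unsigned char>(text.back())))
        text.remove_suffix(1);
    return text;
}

// Compares two views, ignoring ASCII case.
static bool equals_ignoring_ascii_case(std::string_view a, std::string_view b)
{
    return a.size() == b.size()
        && std::equal(a.begin(), a.end(), b.begin(), [](unsigned char x, unsigned char y) {
               return std::tolower(x) == std::tolower(y);
           });
}

// Hypothetical helper: true if a Cache-Control header value contains the
// "no-store" directive, in which case the response should not be cached.
bool has_no_store_directive(std::string_view cache_control)
{
    while (true) {
        // Directives are separated by commas.
        auto comma = cache_control.find(',');
        auto directive = trim_whitespace(cache_control.substr(0, comma));
        if (equals_ignoring_ascii_case(directive, "no-store"))
            return true;
        if (comma == std::string_view::npos)
            return false;
        cache_control.remove_prefix(comma + 1);
    }
}
```

A caller would check the header before inserting a response into its cache, for example skipping the cache when the value is `private, no-store`.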
MacDue
35612c6a7f AK+Everywhere: Change URL::path() to serialize_path()
This now defaults to serializing the path with percent decoded segments
(which is what all callers expect), but has an option not to. This fixes
`file://` URLs with spaces in their paths.

The name has been changed to serialize_path() to make it more clear
that this method will generate a new string each call (except for the
cannot_be_a_base_url() case). A few callers have then been updated to
avoid repeatedly calling this function.
2023-04-15 06:37:04 +02:00
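
To show why percent decoding matters for `file://` paths with spaces, here is a minimal standard C++ sketch of percent-decoding a path string. It is not AK's implementation; `percent_decode` and `hex_value` are names chosen for this example, and malformed escapes are simply passed through unchanged.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Returns the numeric value of a hex digit, or -1 if it is not one.
static int hex_value(char ch)
{
    if (ch >= '0' && ch <= '9')
        return ch - '0';
    if (ch >= 'a' && ch <= 'f')
        return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F')
        return ch - 'A' + 10;
    return -1;
}

// Hypothetical helper: replaces each valid "%XX" escape with the byte it encodes.
std::string percent_decode(std::string_view input)
{
    std::string output;
    output.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        // An escape needs two more characters after the '%'.
        if (input[i] == '%' && i + 2 < input.size()) {
            int high = hex_value(input[i + 1]);
            int low = hex_value(input[i + 2]);
            if (high >= 0 && low >= 0) {
                output.push_back(static_cast<char>(high * 16 + low));
                i += 2;
                continue;
            }
        }
        // Not a valid escape: copy the character as-is.
        output.push_back(input[i]);
    }
    return output;
}
```

Opening the file at `/home/user/My%20Documents/page.html` fails if the escaped form is handed to the filesystem, whereas the decoded path `/home/user/My Documents/page.html` refers to the real file.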
Timothy Flynn
69e8216f2c LibWeb: Do not use OS error codes in the error callback for file:// URLs
The error code passed here is expected to be an HTTP error code. Passing
errno codes does not make sense in that context.
2023-04-04 22:41:20 +01:00
Andreas Kling
652676fdc1 LibWeb: Make ResourceLoader insert a Content-Type header for file://
We make a guess using the MIME type guessing API in LibCore. This frees
clients of this code from having to do the guessing.
2023-03-22 23:34:32 +00:00
Andreas Kling
20132da88d LibWeb: Don't treat erroring subresource loads as success
If a subresource fails to load, we don't care that we got some custom
404 page. The subresource should still be considered failed.

This is an ad-hoc solution that unbreaks Acid2. This code will
eventually be replaced by fetch mechanisms.
2023-03-15 23:29:00 +01:00
Andreas Kling
3435820e1f LibWeb: Render HTML content if present for HTTP error pages
If an HTTP response fails with an error code (e.g. 403) but still has
body content, we now render the content.

We only fall back to our own built-in error page if there's no body.
2023-02-24 19:15:49 +01:00
Tim Schumacher
606a3982f3 LibCore: Move Stream-based file into the Core namespace 2023-02-13 00:50:07 +00:00
Tim Schumacher
d43a7eae54 LibCore: Rename File to DeprecatedFile
As usual, this removes many unused includes and moves used includes
further down the chain.
2023-02-13 00:50:07 +00:00
Timothy Flynn
4a916cd379 Everywhere: Remove needless copies of Error / ErrorOr instances
Either take the underlying objects with release_* methods or move() the
instances around.
2023-02-10 09:08:52 +00:00
Timothy Flynn
96f409ec1e LibWeb+WebContent: Do not reference-count file request objects
There is currently a memory leak with these file request objects due to
the callback on_file_request_finish referencing itself in its capture
list. This object does not need to be reference counted or allocated on
the heap. It is only ever stored in a HashMap until a response is
received from the browser, and it is not shared.
2023-02-01 14:04:44 +00:00
Luke Wilde
6d188d72c0 LibWeb: Store cookies for every HTTP response
As per Fetch, we are supposed to store cookies from Set-Cookie as soon
as we receive response headers for any HTTP response, even in error
cases.

Required by Twitter to login, as it sets cookies via XHR.
2022-12-30 21:56:54 -05:00
Tim Schumacher
ed4c2f2f8e LibCore: Rename Stream::read_all to read_until_eof
This generally seems like a better name, especially if we somehow also
need a better name for "read the entire buffer, but not the entire file"
somewhere down the line.
2022-12-12 14:16:42 +01:00
Linus Groh
57dc179b1f Everywhere: Rename to_{string => deprecated_string}() where applicable
This will make it easier to support both string types at the same time
while we convert code, and to track down remaining uses.

One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
2022-12-06 08:54:33 +01:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
MacDue
8a5d2be617 Everywhere: Remove unnecessary mutable attributes from lambdas
These lambdas were marked mutable because they captured a Ptr wrapper
class by value, and the previous const pointer operators then only
returned const-qualified references to the pointed-to value.

Nothing actually mutates the lambdas' state here, and now that the
Ptr operators don't add extra const qualifiers, the mutable attributes
can be removed.
2022-11-19 14:37:31 +00:00
Andrew Kaster
828441852f Everywhere: Replace uses of __serenity__ with AK_OS_SERENITY
Now that we have OS macros for essentially every supported OS, let's try
to use them everywhere.
2022-10-10 12:23:12 +02:00
Linus Groh
398e44b27b LibWeb: Pass status code to ResourceLoader error callback when available 2022-10-05 09:12:59 +01:00
networkException
4230dbbb21 AK+Everywhere: Replace "protocol" with "scheme" url helpers
URL had properly named replacements for protocol(), set_protocol() and
create_with_file_protocol() already. This patch removes these functions
and updates all call sites to use the functions named according to the
specification.

See https://url.spec.whatwg.org/#concept-url-scheme
2022-09-29 09:39:04 +01:00
Andreas Kling
9567e211e7 LibWeb+WebContent: Add abstraction layer for event loop and timers
Instead of using Core::EventLoop and Core::Timer directly, LibWeb now
goes through a Web::Platform abstraction layer instead.

This will allow us to plug in Qt's event loop (and QTimer) over in
Ladybird, to avoid having to deal with multiple event loops.
2022-09-07 20:30:31 +02:00
Andreas Kling
c964a6b548 LibWeb: Paper over a VERIFY() crash in ResourceLoader for now 2022-07-17 14:11:36 +02:00
sin-ack
3f3f45580a Everywhere: Add sv suffix to strings relying on StringView(char const*)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).

No functional changes.
2022-07-12 23:11:35 +02:00
Kenneth Myhra
92a3803066 LibWeb: Add timeout_callback to ResourceLoader::load() 2022-07-03 13:26:32 +02:00
Kenneth Myhra
07b6c7114b LibWeb: Use a single shot timer instead of an ordinary repetitive timer 2022-07-03 13:26:32 +02:00
Lucas CHOLLET
662711fa26 Browser+LibWeb+WebContent: Allow Browser to load local files
To achieve this goal:
 - The Browser unveils "/tmp/portal/filesystemaccess"
 - Pass the page through LoadRequest => ResourceLoader
 - ResourceLoader requests a file from the FileSystemAccessServer via IPC
 - OutOfProcessWebView handles it and sends a file descriptor back to
   the Page.
2022-06-27 20:22:15 +01:00
Andreas Kling
c03a0e7260 LibWeb: Fix unsafe capture of ref-to-local when setting up load timeout
We were capturing a reference to a stack local and then persisting the
closure, causing it to dereference a long-gone object when invoked.
2022-06-23 20:37:29 +02:00
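
The class of bug fixed here can be illustrated with a small standard C++ sketch. It does not reproduce the ResourceLoader code: `DeferredCallback` and `make_timeout_callback` are hypothetical stand-ins for a timer that stores a closure and runs it after the creating function has returned.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for a timer that keeps a callback and invokes it later.
struct DeferredCallback {
    std::function<std::string()> on_timeout;
};

DeferredCallback make_timeout_callback(std::string const& url)
{
    std::string message = "Timed out: " + url;

    // Unsafe (do not do this): `[&message] { return message; }` would store a
    // reference to a stack local that is destroyed when this function returns,
    // so invoking the closure later would read a dead object.

    // Safe: capture by value, so the closure owns its own copy of the string.
    return DeferredCallback { [message] { return message; } };
}
```

Because the closure owns a copy, calling `on_timeout()` after `make_timeout_callback()` has returned is well-defined.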
Kenneth Myhra
c805987329 LibWeb: Add timeout functionality to ResourceLoader
Add timeout functionality to ResourceLoader and use it from
XMLHttpRequest.
2022-06-21 10:29:14 +01:00
Luke Wilde
210c3795f9 LibWeb: Apply content filter to DNS prefetch and pre-connect
Performing DNS prefetch or pre-connect on filtered URLs is wasteful,
as we would block any actual use further down the line.

A bunch of websites perform DNS prefetch and/or pre-connect to trackers
as well, for example:
```
prefetch DNS for 'https://adserver-us.adtech.advertising.com/'
prefetch DNS for 'https://secure.adnxs.com/'
prefetch DNS for 'https://bidder.criteo.com/'
prefetch DNS for 'https://static.criteo.net/'
prefetch DNS for 'https://cdn.krxd.net/'
prefetch DNS for 'https://widgets.outbrain.com/'
prefetch DNS for 'https://images.outbrain.com/'
prefetch DNS for 'https://log.outbrain.com/'
prefetch DNS for 'https://amplifypixel.outbrain.com/'
prefetch DNS for 'https://odb.outbrain.com/'
prefetch DNS for 'https://js-sec.indexww.com/'
prefetch DNS for 'https://as-sec.casalemedia.com/'
prefetch DNS for 'https://as.casalemedia.com/'
prefetch DNS for 'https://sofia.trustx.org/'
prefetch DNS for 'https://c.amazon-adsystem.com/'
prefetch DNS for 'https://s.amazon-adsystem.com/'
prefetch DNS for 'https://aax.amazon-adsystem.com/'
prefetch DNS for 'https://t.teads.tv/'
prefetch DNS for 'https://beacon.krxd.net/'
pre-connect to 'https://www.google-analytics.com/'
pre-connect to 'https://www.googletagmanager.com/'
```
2022-06-10 12:15:37 +01:00