The previous implementation created invalid HTTP requests in cases
where the request method was POST or when the request contained a
body. There were two bugs for these cases:
1) the 'Content-Type' header was sent twice
2) a stray CRLF was appended to the request
In order to follow spec text to achieve this, we need to change the
underlying representation of a host in AK::URL to deserialized format.
Before this, we were parsing the host and then immediately serializing
it again.
Making that change resulted in a whole bunch of fallout.
After this change, callers can access the serialized data through
this concept-host-serializer. The functional end result of this
change is that IPv6 hosts are now correctly serialized to be
surrounded with '[' and ']'.
This now defaults to serializing the path with percent decoded segments
(which is what all callers expect), but has an option not to. This fixes
`file://` URLs with spaces in their paths.
The name has been changed to serialize_path() path to make it more clear
that this method will generate a new string each call (except for the
cannot_be_a_base_url() case). A few callers have then been updated to
avoid repeatedly calling this function.
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.
One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
`index + 1` was not correct. For example, if the body has two bytes, we
would consume the first byte and increment the index. We then add one
to the index and see it's equal to the size, so we take this one byte
and set the body result to it. The while loop would still continue and
we consume the second byte, adding it to the temporary buffer. We see
that the index is above the size, so we don't update the body, dropping
the last byte on the floor.
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).
No functional changes.
AK::URL stores the URL query string already encoded. We were sending
double-encoded query strings, which is why the Google cookie consent
page was not working correctly.
A change was made prior to percent encode plus signs in order to fix an
issue with the Google cookie consent page.
Unforunately, this was treating a symptom of a problem and not the root
cause and is incorrect behavior.
Adds a new optional parameter 'reserved_chars' to
AK::URL::percent_encode. This new optional parameter allows the caller
to specify custom characters to be percent encoded. This is then used
to percent encode plus signs by HttpRequest::to_raw_request.
This commit converts TLS::TLSv12 to a Core::Stream object, and in the
process allows TLS to now wrap other Core::Stream::Socket objects.
As a large part of LibHTTP and LibGemini depend on LibTLS's interface,
this also converts those to support Core::Stream, which leads to a
simplification of LibHTTP (as there's no need to care about the
underlying socket type anymore).
Note that RequestServer now controls the TLS socket options, which is a
better place anyway, as RS is the first receiver of the user-requested
options (though this is currently not particularly useful).
This makes connections (particularly TLS-based ones) do the handshaking
stuff only once.
Currently the cache is configured to keep at most two connections evenly
balanced in queue size, and with a grace period of 10s after the last
queued job has finished (after which the connection will be dropped).
This patch adds two new static methods to HttpRequest.
get_http_basic_authentication_header generates a "Authorization" header
from a given URL, where as parse_http_basic_authentication_header parses
an "Authorization" header into username and password.
This percent encodes/decodes the request URI when creating or parsing
raw HTTP requests. This is necessary because AK::URL now contains
percent decoded data, meaning we have to re-encode it for creating
raw requests.
When the path component of the request URL was empty we'd end up
sending requests like "GET HTTP/1.1" (note the missing /). This
ensures that we always send a path.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)
Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.
We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.