Commit graph

4 commits

Author SHA1 Message Date
Max Wipfli
f808279769 LibWeb: Implement encoding sniffing algorithm
This patch implements the HTML specification's "encoding sniffing
algorithm", which is used when no encoding can be obtained from the
Content-Type header (either because it doesn't contain a charset=...)
value or the file has not been opened via HTTP (as with local files).

It also modifies the creator of the HTMLDocumentParser to use the new
HTMLDocumentParser::create_with_uncertain_encoding static method, which
runs the encoding sniffing algorithm before instantiating the parser.

This now allows us to load local HTML pages (or remote pages without a
charset specified in the 'Content-Type' header) with a non-UTF-8
encoding such as 'windows-1252'. This would previously crash the
browser. :^)
2021-05-18 21:02:07 +02:00
Brian Gianforcaro
1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

 ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00
Luke
b82a00d657 LibWeb: Use the new "ensure_pre_insertion_validity" in the HTML document parser
Previously we didn't check if we could insert the element in the
adjusted insertion location's parent.

Also makes the return type NonnullRefPtr, as that's what element is.
2021-04-06 21:42:00 +02:00
Andreas Kling
13d7c09125 Libraries: Move to Userland/Libraries/ 2021-01-12 12:17:46 +01:00
Renamed from Libraries/LibWeb/HTML/Parser/HTMLDocumentParser.h (Browse further)