0ct0pu5/ladybird

Author	SHA1	Message	Date
Timothy Flynn	93712b24bf	Everywhere: Hoist the Libraries folder to the top-level	2024-11-10 12:50:45 +01:00
0x4261756D	c1a14f66ad	HTMLEncodingDetection: Use mime type in encoding sniffing Also added proper spec comments. Fixes at least one WPT test that was failing previously: https://wpt.live/encoding/single-byte-decoder.window.html?document	2024-10-12 16:14:38 +02:00
MacDue	fc41c282ec	LibWeb: Fix utf16-be check in HTMLEncodingDetection The utf-16be check mistakenly skipped index 3, so was not checking the correct bytes. This meant UTF16-BE files could fail to decode.	2024-01-08 23:35:09 +01:00
MacDue	5e973fca0b	LibWeb: Prevent OOB access in HTMLEncodingDetection for input of '</' Previously, this never checked if `position + 2` was valid. This slightly reorders the loop so all indices are checked. Fixes #22163	2024-01-08 23:35:09 +01:00
Andreas Kling	9ce267944c	LibWeb: Fix crash in HTML encoding detection when handling non-ASCII The fix here was to stop using StringBuilder::append(char) when told to append a code point, and switch to StringBuilder::append_code_point(u32) There's probably a bunch more issues like this, and we should stop using append(char) in general since it allows building of garbage strings.	2023-12-30 13:49:50 +01:00
Andreas Kling	83f43310fa	LibWeb: Add spec comments and fixups to "get an attribute" prescan algo In particular, make some minor adjustments so it flows a little more like the spec.	2023-12-30 13:49:50 +01:00
Ali Mohammad Pur	5e1499d104	Everywhere: Rename {Deprecated => Byte}String This commit un-deprecates DeprecatedString, and repurposes it as a byte string. As the null state has already been removed, there are no other particularly hairy blockers in repurposing this type as a byte string (what it _really_ is). This commit is auto-generated: $ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland Meta Ports Ladybird Tests Kernel) $ perl -pie 's/\bDeprecatedString\b/ByteString/g; s/deprecated_string/byte_string/g' $xs $ clang-format --style=file -i $(git diff --name-only | grep \.cpp\|\.h) $ gn format $(git ls-files '*.gn' '*.gni')	2023-12-17 18:25:10 +03:30
Shannon Booth	3bd04d2c58	LibWeb: Port Attr interface from DeprecatedString to String There are an unfortunate number of DeprecatedString conversions required here, but these should all fall away and look much more pretty again when other places are also ported away from DeprecatedString. Leaves only the Element IDL interface left :^)	2023-09-25 15:39:29 +02:00
Andreas Kling	72c9f56c66	LibJS: Make Heap::allocate<T>() infallible Stop worrying about tiny OOMs. Work towards #20449. While going through these, I also changed the function signature in many places where returning ThrowCompletionOr<T> is no longer necessary.	2023-08-13 15:38:42 +02:00
Kenneth Myhra	50c5f0d7da	LibWeb: Make factory method of DOM::Attr fallible	2023-02-18 00:52:47 +01:00
Linus Groh	57dc179b1f	Everywhere: Rename to_{string => deprecated_string}() where applicable This will make it easier to support both string types at the same time while we convert code, and tracking down remaining uses. One big exception is Value::to_string() in LibJS, where the name is dictated by the ToString AO.	2022-12-06 08:54:33 +01:00
Linus Groh	6e19ab2bbc	AK+Everywhere: Rename String to DeprecatedString We have a new, improved string type coming up in AK (OOM aware, no null state), and while it's going to use UTF-8, the name UTF8String is a mouthful - so let's free up the String name by renaming the existing class. Making the old one have an annoying name will hopefully also help with quick adoption :^)	2022-12-06 08:54:33 +01:00
Linus Groh	fb21271334	LibWeb: Replace incorrect uses of AK::is_ascii_space()	2022-10-02 21:32:49 +02:00
Andreas Kling	530675993b	LibWeb: Rename Attribute to Attr This name is not very good, but it's what the specification calls it.	2022-09-18 02:08:01 +02:00
Andreas Kling	6f433c8656	LibWeb+LibJS: Make the EventTarget hierarchy (incl. DOM) GC-allocated This is a monster patch that turns all EventTargets into GC-allocated PlatformObjects. Their C++ wrapper classes are removed, and the LibJS garbage collector is now responsible for their lifetimes. There's a fair amount of hacks and band-aids in this patch, and we'll have a lot of cleanup to do after this.	2022-09-06 00:27:09 +02:00
sin-ack	c8585b77d2	Everywhere: Replace single-char StringView op. arguments with chars This prevents us from needing a sv suffix, and potentially reduces the need to run generic code for a single character (as contains, starts_with, ends_with etc. for a char will be just a length and equality check). No functional changes.	2022-07-12 23:11:35 +02:00
sin-ack	3f3f45580a	Everywhere: Add sv suffix to strings relying on StringView(char const*) Each of these strings would previously rely on StringView's char const* constructor overload, which would call __builtin_strlen on the string. Since we now have operator ""sv, we can replace these with much simpler versions. This opens the door to being able to remove StringView(char const*). No functional changes.	2022-07-12 23:11:35 +02:00
Idan Horowitz	086969277e	Everywhere: Run clang-format	2022-04-01 21:24:45 +01:00
Hendiadyoin1	6a95df2526	LibTextCodec: Don't allocate Strings on encoding normalisation This ripples down to LibWeb's HTML and XHR decoders, which therefore become less allocation heavy.	2022-03-21 10:48:17 +01:00
Timothy Flynn	e01dfaac9a	LibWeb: Implement Attribute closer to the spec and with an IDL file Note our Attribute class is what the spec refers to as just "Attr". The main differences between the existing implementation and the spec are just that the spec defines more fields. Attributes can contain namespace URIs and prefixes. However, note that these are not parsed in HTML documents unless the document content-type is XML. So for now, these are initialized to null. Web pages are able to set the namespace via JavaScript (setAttributeNS), so these fields may be filled in when the corresponding APIs are implemented. The main change to be aware of is that an attribute is a node. This has implications on how attributes are stored in the Element class. Nodes are non-copyable and non-movable because these constructors are deleted by the EventTarget base class. This means attributes cannot be stored in a Vector or HashMap as these containers assume copyability / movability. So for now, the Vector holding attributes is changed to hold RefPtrs to attributes instead. This might change when attribute storage is implemented according to the spec (by way of NamedNodeMap).	2021-10-17 13:51:10 +01:00
Andreas Kling	cb895edad4	LibWeb: Move Attribute into the DOM namespace	2021-09-16 01:39:47 +02:00
Luke	e9eae9d880	LibWeb: Add extracting character encoding from a meta content attribute Some Gmail emails contain this.	2021-07-13 20:23:44 +02:00
Max Wipfli	f808279769	LibWeb: Implement encoding sniffing algorithm This patch implements the HTML specification's "encoding sniffing algorithm", which is used when no encoding can be obtained from the Content-Type header (either because it doesn't contain a charset=...) value or the file has not been opened via HTTP (as with local files). It also modifies the creator of the HTMLDocumentParser to use the new HTMLDocumentParser::create_with_uncertain_encoding static method, which runs the encoding sniffing algorithm before instantiating the parser. This now allows us to load local HTML pages (or remote pages without a charset specified in the 'Content-Type' header) with a non-UTF-8 encoding such as 'windows-1252'. This would previously crash the browser. :^)	2021-05-18 21:02:07 +02:00

23 commits