0ct0pu5/ladybird

Author	SHA1	Message	Date
Luke	08221139a5	test-web: Add ability to change page mid-test This allows you to not have to write a separate test file for the same thing but in a different situation. This doesn't handle when you change the page with location.href however. Changes the name of the page load handlers to prevent confusion with this.	2020-07-25 12:35:15 +02:00
Andreas Kling	6e02ef19d1	LibWeb: Add a helper for creating a fake (start tag) HTML token Sometimes the parsing rules say we need to insert a fake HTML token. Let's have a convenient way of doing that!	2020-07-23 17:31:08 +02:00
Luke	201cc1bfcc	LibWeb: Assert we're parsing a fragment on fragment cases The specification says that parts labelled as a "fragment case" will only occur when parsing a fragment. It says that if it occurs when not parsing a fragment, then it is a specification error. We should probably assume at this point that it's an implementation error. This fixes a few little mistakes that were caught out by this. Also moves the context element outside insertion mode reset, as other (unimplemented) parts refer to it, such as "adjusted current node". Also cleans up insertion mode reset.	2020-07-22 00:02:40 +02:00
Luke	19d6884529	LibWeb: Implement quirks mode detection This allows us to determine which mode to render the page in. Exposes "doctype" and "compatMode" on Document. Exposes "name", "publicId" and "systemId" on DocumentType.	2020-07-21 01:08:32 +02:00
Andreas Kling	92d831c25b	LibWeb: Implement fragment parsing and use it for Element.innerHTML This patch implements most of the HTML fragment parsing algorithm and ports Element::set_inner_html() to it. This was the last remaining user of the old HTML parser. :^)	2020-06-26 00:53:25 +02:00
Andreas Kling	07d976716f	LibWeb: Remove most uses of the old HTML parser The only remaining client of the old parser is the fragment parser used by the Element.innerHTML setter. We'll need to implement a bit more stuff in the new parser before we can switch that over.	2020-06-21 22:29:05 +02:00
Andreas Kling	966bc05fef	LibWeb: Implement more of the foster parenting algorithm in the parser	2020-06-21 17:42:00 +02:00
stelar7	5eb39a5f61	LibWeb: Update parser with more insertion modes :^) Implements handling of InHeadNoScript, InSelectInTable, InTemplate, InFrameset, AfterFrameset, and AfterAfterFrameset.	2020-06-21 10:13:31 +02:00
Luke	6532c1e2fa	LibWeb: Implement HTML parser "in column group" insertion mode	2020-06-14 14:07:07 +02:00
Luke	2241b09cd0	LibWeb: Implement HTML parser "in caption" insertion mode	2020-06-14 14:07:07 +02:00
Andreas Kling	c40de9275a	LibWeb: Buffer text node character insertions in the new parser Instead of appending character-at-a-time, we now buffer character insertions in a StringBuilder, and flush them to the relevant node whenever we start inserting into a new node (and when parsing ends.)	2020-06-03 21:53:08 +02:00
Andreas Kling	ca6fbefbc9	LibWeb: Support parsing "select" elements (outside of tables)	2020-05-30 19:58:52 +02:00
Andreas Kling	5818ef2c80	LibWeb: Implement more table-related insertion modes	2020-05-30 18:26:44 +02:00
Andreas Kling	8c96b8174b	LibWeb: Handle AAA situation where there's no formatting element found In this case, we're supposed to return from the AAA and then jump to a different behavior in the "in body" insertion mode. So now we do that.	2020-05-30 17:47:50 +02:00
Andreas Kling	6854f726ce	LibWeb: Improve support for "a" and "li" during "in body" insertion We can now parse welcome.html once again, without resorting to hacks or fallbacks during "in body" :^)	2020-05-30 11:31:49 +02:00
Andreas Kling	68b1bdc234	LibWeb: Add a way to stop the new HTML parser Some things are specced to "stop parsing", which basically just means to stop fetching tokens and jump to "The end"	2020-05-28 18:55:18 +02:00
Andreas Kling	5e53c45113	LibWeb: Plumb content encoding into the new HTML parser We still don't handle non-ASCII input correctly, but at least now we'll convert e.g. ISO-8859-1 to UTF-8 before starting to tokenize. This patch also makes "view source" work with the new parser. :^)	2020-05-28 12:35:19 +02:00
Andreas Kling	7aa7a2078f	LibWeb: Parse "td" start tags during "in cell" insertion mode	2020-05-28 11:46:08 +02:00
Andreas Kling	ebb1649a52	LibWeb: Implement more table support in the new HTML parser This is enough to parse the Google front page! (Note: I did have to hack the tokenizer while parsing Google, in order to avoid named character references screwing everything up. We'll fix that too soon enough!)	2020-05-28 00:27:46 +02:00
Andreas Kling	db6cf9b37d	LibWeb: Implement the first half of the Adoption Agency Algorithm The AAA is a somewhat daunting algorithm you have to run for certain tags when inserted inside the <body> element. The purpose of it is to resolve issues with mismatched tags. This patch implements the first half of the AAA. We also move the "list of active formatting elements" to its own class, since it kept accumulating little behaviors. "Marker" entries are now signified by null Element pointers in the list.	2020-05-27 23:22:42 +02:00
Andreas Kling	4c9c6b3a7b	LibWeb: Bring up basic external script execution in the new parser This only works in some narrow cases, but should be enough for our own welcome.html at least. :^)	2020-05-27 23:02:03 +02:00
Andreas Kling	1e30ef239b	LibWeb: Start fleshing out the "in table" parser insertion mode	2020-05-25 20:30:34 +02:00
Andreas Kling	65d8d5e83e	LibWeb: Yet more work towards parsing www/welcome.html :^)	2020-05-24 23:54:22 +02:00
Andreas Kling	45da08a1e6	LibWeb: A whole bunch of work towards spec-compliant <script> elements This is still very unfinished, but there's at least a skeleton of code.	2020-05-24 23:54:22 +02:00
Andreas Kling	5d332c1f11	LibWeb: Parse enough to handle a <style> inside a <head> :^)	2020-05-24 23:54:22 +02:00
Andreas Kling	af8a9331b2	LibWeb: Support comments in the "in head" insertion mode	2020-05-24 23:54:22 +02:00
Andreas Kling	20911efd4d	LibWeb: More work on the HTML parser and tokenizer The parser can now switch the state of the tokenizer! Very webby. :^)	2020-05-24 23:54:22 +02:00
Andreas Kling	31db3f21ae	LibWeb: Start implementing character token parsing Now that we've gotten rid of the misguided character buffering in the tokenizer, it actually spits out character tokens that we have to deal with in the parser. This patch implements enough to bring us back to speed with simple.html	2020-05-24 23:54:22 +02:00
Andreas Kling	53d2f4df70	LibWeb: Factor out the "stack of open elements" into its own class This will allow us to write more expressive parsing code. :^)	2020-05-24 23:54:22 +02:00
Andreas Kling	e44c87cfff	LibWeb: Implement enough HTML parsing to handle a small simple DOM :^) We can now parse a little DOM like this: <!DOCTYPE html> <html> <head></head> <body> <div></div> </body> </html> This is pretty slow work, but the incremental progress is satisfying!	2020-05-24 00:49:22 +02:00
Andreas Kling	fd1b31d0ff	LibWeb: Start building the tree building part of the new HTML parser This patch adds a new HTMLDocumentParser class. It keeps a tokenizer object internally and feeds itself with one token at a time from it. The names and idioms in this class are expressed as closely to the actual HTML parsing spec as possible, to make development as easy and bug free as possible. :^) This is going to become pretty large, but it's pretty cool!	2020-05-24 00:14:23 +02:00

31 commits