Now that we're using the new HTML parser, we don't have to do the weird
"run the script when inserted into the document, uhh, or when the text
content of the script element changes" dance.
Instead, we just follow the spec, and scripts run the way they should.
The only remaining client of the old parser is the fragment parser used
by the Element.innerHTML setter. We'll need to implement a bit more
stuff in the new parser before we can switch that over.
When handling a "textarea" start tag, we have to ignore the next token
if it's an LF ('\n'). However, we were not switching the tokenizer
state before fetching the lookahead token, and this caused us to force
the tokenizer into the RCDATA state too late, effectively getting it
stuck in that state for way longer than it should be.
Fixes#2508.
Now that we flush characters in a single place, we can call the Text's
children_changed() from there instead of having a goofy targeted hack
for <style> elements. :^)
Instead of appending character-at-a-time, we now buffer character
insertions in a StringBuilder, and flush them to the relevant node
whenever we start inserting into a new node (and when parsing ends.)
Last night I tried making a little test page that had a bunch of <img>
elements and nothing else. It didn't work.
Fix this by correctly adding a synthesized <html> element to the
document if we get something else in the "before html insertion mode.
You can still run the old parser with "br -O", but the new one is good
enough to be the default parser now. We'll fix issues as we go and
eventually remove the old one completely. :^)
This patch adds two script lists to Document:
- Scripts to execute when parsing has finished
- Scripts to execute as soon as possible
Since we don't actually load scripts asynchronously yet (we just do a
synchronous load when parsing the <script> element for simplicity),
these are already loaded by the time we get to "The end" of parsing.
If we have an <a> element on the list of active formatting elements
when hitting another "a" start tag, that's a parse error. Recover by
using the AAA.
I've been using this in the new HTML parser and it makes it much easier
to understand the state of unfinished code branches.
TODO() is for places where it's okay to end up but we need to implement
something there.
ASSERT_NOT_REACHED() is for places where it's not okay to end up, and
something has gone wrong.