PR #18166 introduced the ability to parse ECC certificates. If we
now fail here the reason is most likely something new and we should
prevent this rabbit hole from happening.
Some refactoring of our root ca loading process:
- Remove duplicate code
- Remove duplicate calls to `parse_root_ca`
- Load user imported certificates in Browser/RequestServer
StartingStateHandler and SeekingStateHandler were declaring their own
`bool m_playing` fields (from previous code where there was no base
class).
In the case of SeekingStateHandler, this only made the logging wrong.
For StartingStateHandler, however, this meant that it was not using
the boolean passed as a parameter to the constructor to define the
state that would be transitioned to after the Starting state finished.
This meant that when the Stopping state replaced itself with the
Starting state, playback would not resume when Starting state exits.
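A minimal sketch of the shadowing problem, assuming simplified
stand-in classes (ResumingHandler, BuggyStartingHandler and
FixedStartingHandler are hypothetical names, not the real handlers):

```cpp
#include <string>

// Hypothetical base class that owns the flag passed to the constructor.
class ResumingHandler {
public:
    explicit ResumingHandler(bool playing)
        : m_playing(playing)
    {
    }
    virtual ~ResumingHandler() = default;

    // Name of the state to transition to once this handler finishes.
    virtual std::string state_after_exit() const = 0;

protected:
    bool m_playing { false };
};

// Buggy: re-declares m_playing, so reads hit the derived field, which
// the inherited constructor never sets.
class BuggyStartingHandler final : public ResumingHandler {
public:
    using ResumingHandler::ResumingHandler;

    std::string state_after_exit() const override
    {
        return m_playing ? "Playing" : "Paused";
    }

private:
    bool m_playing { false }; // shadows ResumingHandler::m_playing
};

// Fixed: no duplicate field, so the constructor argument is honoured.
class FixedStartingHandler final : public ResumingHandler {
public:
    using ResumingHandler::ResumingHandler;

    std::string state_after_exit() const override
    {
        return m_playing ? "Playing" : "Paused";
    }
};
```

With the duplicate field, constructing the handler with `true` still
reports "Paused" as the follow-up state; removing the field makes the
constructor argument take effect.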
This is needed for hit testing the directional arrows on the Street
View office tour, and generally makes SVG hit testing more precise.
Note: The rough bounding box is hit tested first, so this should not
be a load more overhead.
This also combines the viewbox mapping into the same transform and
reuses some code by using Path::copy_transformed() rather than manually
mapping each segment of the path.
Previously, if you had an SVG with a viewbox and a definite width
and height, then all SVGGeometryBox boxes within that SVG would
have a width and height set to the size of the parent SVG.
This broke hit testing for SVG paths, and didn't make much sense.
It seems like the SVG sizing hack was patching over the incorrect
logic in viewbox_scaling() and the incorrect path sizing (which was
never reached).
Before this change the view box scaling was:
element_dimension / viewbox_dimension
Which only seemed to work because of the SVG sizing hack that made
all paths the size of the containing SVG.
After this change SVGGeometryBoxes are (in most cases) sized correctly
based on their bounding boxes, which allows hit testing to function,
and the view box scaling is updated now to:
containing_SVG_dimension / viewbox_dimension
Which works with one less hack :^)
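As a rough illustration of the updated ratio, assuming per-axis scaling
and hypothetical Size/Scale types (this ignores aspect ratio handling
and is not the actual viewbox_scaling() code):

```cpp
// Hypothetical plain-data types used only for this sketch.
struct Size {
    double width;
    double height;
};

struct Scale {
    double x;
    double y;
};

// containing_SVG_dimension / viewbox_dimension, computed per axis.
Scale viewbox_scale(Size containing_svg, Size viewbox)
{
    return { containing_svg.width / viewbox.width,
        containing_svg.height / viewbox.height };
}
```

For a 200x100 SVG with a 100x50 viewbox this gives a scale of 2 on both
axes, independent of the size of any individual path box.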
This now also handles centering the viewbox within the parent SVG
element and applying any transforms to the bounding box. This is still
a bit ad-hoc, but much more closely matches other browsers now.
This uses the new attribute parser functionality, and then resolves the
transform list into a single Gfx::AffineTransform.
This also adds a .get_transform() function which resolves the final
transform, by applying all parent transforms.
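A sketch of how resolving the final transform could look, assuming a
hypothetical Affine struct in place of Gfx::AffineTransform and a
hypothetical Element with a parent pointer (none of these names are
the real LibWeb types):

```cpp
#include <vector>

// Hypothetical minimal 2D affine transform, laid out as the matrix
// [a c e; b d f; 0 0 1].
struct Affine {
    double a, b, c, d, e, f;
};

struct Point {
    double x, y;
};

Affine identity() { return { 1, 0, 0, 1, 0, 0 }; }

// Returns lhs * rhs: rhs is applied to a point first, then lhs.
Affine multiply(Affine const& lhs, Affine const& rhs)
{
    return {
        lhs.a * rhs.a + lhs.c * rhs.b,
        lhs.b * rhs.a + lhs.d * rhs.b,
        lhs.a * rhs.c + lhs.c * rhs.d,
        lhs.b * rhs.c + lhs.d * rhs.d,
        lhs.a * rhs.e + lhs.c * rhs.f + lhs.e,
        lhs.b * rhs.e + lhs.d * rhs.f + lhs.f,
    };
}

Point map_point(Affine const& m, Point p)
{
    return { m.a * p.x + m.c * p.y + m.e, m.b * p.x + m.d * p.y + m.f };
}

// Hypothetical element: a parent link plus its parsed transform list.
struct Element {
    Element const* parent;
    std::vector<Affine> transform_list; // in attribute order
};

// Collapse the element's own transform list into one matrix.
Affine local_transform(Element const& element)
{
    Affine result = identity();
    for (auto const& transform : element.transform_list)
        result = multiply(result, transform);
    return result;
}

// Walk up the tree, pre-multiplying each ancestor's transform.
Affine get_transform(Element const& element)
{
    Affine result = local_transform(element);
    for (auto const* ancestor = element.parent; ancestor; ancestor = ancestor->parent)
        result = multiply(local_transform(*ancestor), result);
    return result;
}
```

With a parent that translates by (10, 20) and a child that scales by 2,
the point (1, 1) in the child maps to (12, 22).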
This parses SVG transforms using the syntax from CSS Transforms
Module Level 1. Note: This looks very similar to CSS transforms, but
the syntax is not compatible. For example, SVG rotate() is
rotate(<a> <x> <y>) where all parameters are unitless numbers whereas
CSS rotate() is rotate(<angle> unit) along with separate rotateX/Y/Z().
(At the same time AttributeParser is updated to use GenericLexer which
makes for easier string matching).
There is work needed for error handling (which AttributeParser does not
deal with very gracefully right now).
This now applies clipping to the destination bounding box before
painting, which cuts out a load of clipped computation and allows
using the faster set_physical_pixel() method.
Alongside this it also combines the source_transform and
inverse_transform outside the hot loop (which might cut things down
a little).
The `destination_quad.contains()` check is also removed in favour
of just checking if the mapped point is inside the source rect,
which takes less work to compute than checking the bounding box.
This takes this method down from 98% of the time to 10% of the
time when painting Google Street View (with no obvious issues).
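The shape of the optimized loop can be sketched roughly as follows.
This assumes hypothetical Bitmap, IntRect and Affine types and a
nearest-neighbour sample; `destination_to_source` stands for the
transform that was combined once outside the loop:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the real Gfx types.
struct Affine {
    double a, b, c, d, e, f;
};

struct IntRect {
    int x, y, width, height;
};

IntRect intersected(IntRect lhs, IntRect rhs)
{
    int left = std::max(lhs.x, rhs.x);
    int top = std::max(lhs.y, rhs.y);
    int right = std::min(lhs.x + lhs.width, rhs.x + rhs.width);
    int bottom = std::min(lhs.y + lhs.height, rhs.y + rhs.height);
    if (right <= left || bottom <= top)
        return { 0, 0, 0, 0 };
    return { left, top, right - left, bottom - top };
}

struct Bitmap {
    int width;
    int height;
    std::vector<std::uint32_t> pixels;

    Bitmap(int w, int h)
        : width(w)
        , height(h)
        , pixels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), 0)
    {
    }

    std::uint32_t get(int x, int y) const
    {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }

    // Unchecked write; safe here because the loop below is pre-clipped.
    void set(int x, int y, std::uint32_t color)
    {
        pixels[static_cast<std::size_t>(y) * width + x] = color;
    }
};

// Returns the number of destination pixels written.
int draw_transformed(Bitmap& destination, Bitmap const& source,
    IntRect destination_bounds, Affine const& destination_to_source)
{
    // Clip once up front instead of testing every pixel in the loop.
    IntRect clipped = intersected(destination_bounds,
        { 0, 0, destination.width, destination.height });
    int painted = 0;
    for (int y = clipped.y; y < clipped.y + clipped.height; ++y) {
        for (int x = clipped.x; x < clipped.x + clipped.width; ++x) {
            double sx = destination_to_source.a * x + destination_to_source.c * y + destination_to_source.e;
            double sy = destination_to_source.b * x + destination_to_source.d * y + destination_to_source.f;
            // Only check that the mapped point lies inside the source rect.
            if (sx < 0 || sy < 0 || sx >= source.width || sy >= source.height)
                continue;
            destination.set(x, y, source.get(static_cast<int>(sx), static_cast<int>(sy)));
            ++painted;
        }
    }
    return painted;
}
```

Because the loop only ever visits pixels inside the clipped rectangle,
the unchecked `set()` plays the role that set_physical_pixel() plays in
the message above.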
As noted in several comments doing this goes against the W3C spec,
and breaks parsing then re-serializing URLs that contain percent
encoded data, that was not encoded using the same character set as
the serializer.
For example, previously if you had a URL like:
https://foo.com/what%2F%2F (the path is what + '//' percent encoded)
Creating URL("https://foo.com/what%2F%2F").serialize() would return:
https://foo.com/what//
Which is incorrect and not the same as the URL we passed. This is
because the re-serializing uses the PercentEncodeSet::Path which
does not include '/'.
Only doing the percent encoding in the setters fixes this, which
is required to navigate to Google Street View (which includes a
percent encoded URL in its URL).
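The lossy round trip can be reproduced with a small sketch. The encode
set below is a simplified assumption standing in for
PercentEncodeSet::Path, and the helper names are hypothetical:

```cpp
#include <cstddef>
#include <string>

// Simplified, assumed stand-in for the path percent-encode set. The
// important property is that '/' is NOT part of it.
bool in_path_encode_set(unsigned char c)
{
    if (c <= 0x20 || c >= 0x7f)
        return true;
    switch (c) {
    case '"':
    case '#':
    case '<':
    case '>':
    case '?':
    case '`':
    case '{':
    case '}':
        return true;
    default:
        return false;
    }
}

int hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

std::string percent_decode(std::string const& input)
{
    std::string output;
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == '%' && i + 2 < input.size()) {
            int high = hex_value(input[i + 1]);
            int low = hex_value(input[i + 2]);
            if (high >= 0 && low >= 0) {
                output.push_back(static_cast<char>(high * 16 + low));
                i += 2;
                continue;
            }
        }
        output.push_back(input[i]);
    }
    return output;
}

std::string percent_encode_path(std::string const& input)
{
    static char const digits[] = "0123456789ABCDEF";
    std::string output;
    for (unsigned char c : input) {
        if (in_path_encode_set(c)) {
            output.push_back('%');
            output.push_back(digits[c >> 4]);
            output.push_back(digits[c & 0x0f]);
        } else {
            output.push_back(static_cast<char>(c));
        }
    }
    return output;
}
```

Decoding "what%2F%2F" and re-encoding it with this set yields "what//",
since nothing tells the serializer that the slashes were originally
encoded. Keeping the stored path encoded and only encoding in the
setters avoids the round trip entirely.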
Seems to fix #13477 too
Previously, we would hit test positioned elements, then stacking
contexts with z-index 0, as two separate steps. This did not really
follow the reverse paint order, where positioned elements and stacking
contexts with z-index 0 are painted during the same tree traversal.
This commit updates
for_each_in_subtree_of_type_within_same_stacking_context_in_reverse()
to return the stacking contexts it comes across too, but not recurse
into them. This more closely follows the paint order.
This fixes examples such as:
<div id="a" style="width: 10px; height: 10px">
<div id="b" style="position: absolute; width: 10px; height: 10px">
<div
style="position: absolute; width: 10px; height: 10px; z-index: 0"
>
<div id="c"
style="width: 100%; height: 100%; background-color:red;"
onclick="alert('You Win!')">
</div>
</div>
</div>
</div>
Where previously the onclick on #c would never fire as hit testing
always stopped at #b. This is reduced from Google Street View,
which becomes interactable after this commit.
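The traversal change can be approximated with the sketch below. Box and
the function name are hypothetical simplifications, and visiting
descendants before their parent is an assumption about what "reverse
paint order" implies here:

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical, heavily simplified layout box.
struct Box {
    std::string id;
    bool establishes_stacking_context;
    std::vector<Box> children;
};

// Visits the descendants of `root` in reverse tree order. A descendant
// that establishes its own stacking context is handed to the callback,
// but its subtree is not entered.
void for_each_in_reverse_within_same_stacking_context(
    Box const& root, std::function<void(Box const&)> const& callback)
{
    for (auto it = root.children.rbegin(); it != root.children.rend(); ++it) {
        if (it->establishes_stacking_context) {
            callback(*it);
            continue;
        }
        for_each_in_reverse_within_same_stacking_context(*it, callback);
        callback(*it);
    }
}
```

The caller can then hit test a returned stacking context as a whole at
the point where it would have been painted, instead of handling all
z-index 0 contexts in a separate pass.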
JS::PrimitiveString::create uses `is_empty()` on DeprecatedString to
use the empty string cache on the VM. However, this also considers the
DeprecatedString null state to be empty, giving an empty string instead
of `null` for null DeprecatedStrings.
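A sketch of the distinction, using std::optional<std::string> as an
assumed stand-in for a string type with a separate null state (the
helper names are hypothetical and this is not the actual fix):

```cpp
#include <optional>
#include <string>

// Stand-in for a string type that can be null, empty or non-empty.
using NullableString = std::optional<std::string>;

bool is_null(NullableString const& string) { return !string.has_value(); }

// Mirrors the behaviour described above: null also counts as empty.
bool is_empty(NullableString const& string)
{
    return !string.has_value() || string->empty();
}

// Buggy conversion: the empty check swallows the null state.
std::string to_js_source_buggy(NullableString const& string)
{
    if (is_empty(string))
        return "\"\"";
    return "\"" + *string + "\"";
}

// Checking for null first keeps the two states apart.
std::string to_js_source_fixed(NullableString const& string)
{
    if (is_null(string))
        return "null";
    if (string->empty())
        return "\"\"";
    return "\"" + *string + "\"";
}
```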
This proposal has been merged into the main ECMA-402 spec. See:
https://github.com/tc39/ecma402/commit/4257160
Note this includes some editorial and normative changes made when the
proposal was merged into the main spec, but are not in the proposal spec
itself. In particular, the following AOs were changed:
PartitionNumberRangePattern (normative)
SetNumberFormatDigitOptions (editorial)
This will examine the algorithm known as "the end" from the HTML
specification, which executes when parsing HTML markup has completed,
and its potential to observably run script or change certain
attributes.
This currently executes in our engine when parsing HTML received from
the internet during navigation, using document.{open,write,close},
setting the innerHTML attribute or using DOMParser. The latter two are
only possible by executing script.
This has been causing some issues in our engine, which will be shown
later, so we are considering removing the call to "the end" for these
two cases.
Spoiler: the implications of running "the end" for DOMParser will be
considered in the future. It is the only script-created HTML/XML parser
remaining after this commit that uses "the end", including its XML
variant implemented as XMLDocumentBuilder::document_end().
This will only focus on setting the innerHTML attribute, which falls
under "HTML fragment parsing", which starts here in the specification:
https://html.spec.whatwg.org/multipage/parsing.html#parsing-html-fragments
44dd824764/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp (L3491)
While you may notice our HTMLParser::parse_html_fragment returns `void`
and assume this means no scripts are executed because of our use of
`WebIDL::ExceptionOr<T>` and `JS::ThrowCompletionOr<T>`, note that
dispatched events will execute arbitrary script via a callback, catch
any exceptions, report them and not propagate them. This means that
while a function does not return an exception type, it can still
potentially execute script.
A breakdown of the steps of "the end" in the context of HTML fragment
parsing and its observability follows:
https://html.spec.whatwg.org/multipage/parsing.html#the-end
44dd824764/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp (L221)
1. No-op, as we don't currently have speculative HTML parsing. Even if
we did, we would instantly return after stopping the speculative
HTML parser anyway.
2. No-op, document.{open,write,close} are not accessible from the
temporary document.
3. No-op, document.readyState, window.navigation.timing and the
readystatechange event are not accessible from the created temporary
document.
4. This is presumably done so that reentrant invocation of the HTML
parser from document.{write,close} during the firing of the events
after step 4 ends up parsing from a clean state. This is a no-op, as
the events after step 4 do not fire and are not accessible.
5. No-op, we set HTMLScriptElement::m_already_started to true when
creating it whilst parsing an HTML fragment, which causes
HTMLScriptElement::prepare_script to instantly bail, meaning
`scripts_to_execute_when_parsing_has_finished` is always empty.
6. No-op, tasks are considered not runnable when the document does not
have a browsing context, which is always the case in fragment
parsing. Additionally, window.navigation.timing and the
DOMContentLoaded event aren't reachable from the temporary document.
7. Almost a no-op, `scripts_to_execute_as_soon_as_possible` is always
empty for the same reason as step 5. However, this step uses an
unconditional `spin_until` call, which _is_ observable and causes
one of the alluded to issues, which will be talked about later.
8. No-op, as delaying the load event has no purpose in this case, as
the task in step 9 will set the current document readiness to
"complete" and then return immediately after, as the temporary
document has no browsing context, skipping the Window load event.
However, this step causes another alluded to issue, which will be
talked about later.
9. No-op, for the same reason as step 6. Additionally,
document.readyState is not accessible from the temporary document
and the temporary document has no browsing context, so navigation
timing, the Window load event, the pageshow event, the Document load
event and the `<iframe>` load steps are not executed at all.
10. No-op, as this flag is only set from window.print(), which is not
accessible for this document.
11. No-op, as the temporary document is not accessible from anything
else and will be immediately destroyed after HTML fragment parsing.
Additionally, browsing context containers (`<iframe>`, `<frame>` and
`<object>`) cannot run in documents with no browsing context:
- `<iframe>` and `<frame>` use "create a new child navigable":
https://html.spec.whatwg.org/multipage/document-sequences.html#create-a-new-child-navigable
44dd824764/Userland/Libraries/LibWeb/HTML/BrowsingContextContainer.cpp (L43-L45)
> 2. Let group be element's node document's browsing context's
top-level browsing context's group.
This requires the element's node document's browsing context to be
non-null, but it is always null with the temporary document created for
HTML fragment parsing.
This is protected against here for `<iframe>`:
https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-iframe-element:the-iframe-element-6
44dd824764/Userland/Libraries/LibWeb/HTML/HTMLIFrameElement.cpp (L45)
> When an iframe element element is inserted into a document whose
browsing context is non-null, the user agent must run these steps:
1. Create a new child navigable for element.
This is currently not protected against for `<frame>` in the
specification:
https://html.spec.whatwg.org/multipage/obsolete.html#active-frame-element
> A frame element is said to be an active frame element when it is in a
document.
> When a frame element element is created as an active frame element,
or becomes an active frame element after not having been one, the
user agent must run these steps:
> 1. Create a new child navigable for element.
However, since this would cause a null dereference, this is actually a
specification issue. See: https://github.com/whatwg/html/issues/9136
- `<object>` uses "queue an element task" and has a browsing context
null check.
https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-object-element:queue-an-element-task
44dd824764/Userland/Libraries/LibWeb/HTML/HTMLObjectElement.cpp (L58)
44dd824764/Userland/Libraries/LibWeb/HTML/HTMLObjectElement.cpp (L105)
> ...the user agent must queue an element task on the DOM manipulation
task source given the object element to run the following steps to
(re)determine what the object element represents.
As established above, tasks are not runnable in documents with null
browsing contexts. However, for avoidance of doubt, it checks if the
document's browsing context is null, and if so, it falls back to
representing the element's children and gets rid of any child navigable
the `<object>` element may have.
> 2. If the element has an ancestor media element, or has an ancestor
object element that is not showing its fallback content, or if the
element is not in a document whose browsing context is non-null,
or if the element's node document is not fully active, or if the
element is still in the stack of open elements of an HTML parser
or XML parser, or if the element is not being rendered, then jump
to the step below labeled fallback.
> 4. Fallback: The object element represents the element's children.
This is the element's fallback content. Destroy a child navigable
given the element.
This check also protects against an `<object>` element being adopted
from a document which has a browsing context to one that doesn't during
the time between the element task being queued and then executed.
This means a browsing context container cannot be run, meaning browsing
context containers cannot access their parent document and access the
properties and events mentioned in steps 1-11 above, or use
document.{open,write,close} on the parent document.
Another potential avenue of running script via HTML fragment parsing
is via custom elements being in the markup, which need to be
synchronously upgraded. For example:
```
<custom-element></custom-element>
```
However, this is already protected against in the spec:
https://html.spec.whatwg.org/multipage/parsing.html#create-an-element-for-the-token
44dd824764/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp (L643)
> 7. If definition is non-null and the parser was not created as part
of the HTML fragment parsing algorithm, then let will execute
script be true. Otherwise, let it be false.
It is protected against overall by disabling custom elements via
returning `null` for all custom element definition lookups if the
document has no browsing context, which is the case for the temporary
document:
https://html.spec.whatwg.org/multipage/custom-elements.html#look-up-a-custom-element-definition
44dd824764/Userland/Libraries/LibWeb/DOM/Document.cpp (L2106-L2108)
> 2. If document's browsing context is null, return null.
This is because the document doesn't have an associated Window, meaning
there will be no associated CustomElementRegistry object.
After running the HTML fragment parser, all of the child nodes are
removed from the temporary document and then adopted into the context
element's node document. Skipping the `pre_remove` steps as they are
not relevant in this case, let's first examine Node::remove()'s
potential to execute script, then examine Document::adopt_node() after.
https://dom.spec.whatwg.org/#concept-node-remove
44dd824764/Userland/Libraries/LibWeb/DOM/Node.cpp (L534)
1-7. Does not run any script, it just keeps a copy of some data that
will be needed later in the algorithm and directly modifies live
range attributes. However, since this relies on Range objects
containing the temporary document, the Range steps are no-ops.
8. Though this uses the temporary document, it does not contain any
NodeIterator objects as no script should have run, thus this
callback will not be entered. Even if the document _did_ have
associated NodeIterators, NodeIterator::run_pre_removing_steps does
not execute any script.
9-11. Does not run any script, it just keeps a copy of some data that
will be needed later in the algorithm and performs direct tree
mutation to remove the node from the node tree.
12-14. "assign slottables" and step 13 queue mutation observer
microtasks via "signal a slot change". However, since this is
done _after_ running "the end", the "spin the event loop" steps
in that algorithm do not affect this. Remember that queued
microtasks do not execute during this algorithm for the next
few steps.
Sidenote:
Microtasks are supposed to be executed when the JavaScript execution
context stack is empty. Since HTMLParser::parse_html_fragment is only
called from script, the stack will never be empty whilst it is running,
so microtasks will not run until some time after we exit this function.
15. This could potentially run script, let's have a look at the
removal steps we currently have implemented in our engine:
- HTMLIFrameElement::removed_from()
https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-iframe-element:the-iframe-element-7
44cf92616e/Userland/Libraries/LibWeb/HTML/HTMLIFrameElement.cpp (L102)
Since browsing context containers cannot create child browsing
contexts (as shown above), this code will do nothing. This will also
hold true when we implement HTMLFrameElement::removed_from() in the
future.
- FormAssociatedElement::removed_from()
44cf92616e/Userland/Libraries/LibWeb/HTML/FormAssociatedElement.h (L36)
This calls `form_node_was_removed` which can then potentially call
`reset_form_owner`. However, `reset_form_owner` only does tree
traversal to find the appropriate form owner and does not execute
any script. After calling `form_node_was_removed` it then calls
`form_associated_element_was_removed`, which is a virtual function
that no one currently overrides, meaning no script is executed.
- HTMLBaseElement::removed_from()
44dd824764/Userland/Libraries/LibWeb/HTML/HTMLBaseElement.cpp (L45)
This will call `Document::update_base_element` to do tree traversal
to find out the new first `<base>` element with an href attribute and
thus does not execute any script.
- HTMLStyleElement::removed_from()
https://html.spec.whatwg.org/multipage/semantics.html#update-a-style-block
44dd824764/Userland/Libraries/LibWeb/HTML/HTMLStyleElement.cpp (L49)
This will call `update_a_style_block`, which will parse the `<style>`
element's text content as CSS and create a style sheet from it. This
does not execute any script.
In summary, step 15 does not currently execute any script and ideally
shouldn't in the future when we implement more `removed_from` steps.
16. Does not run any script, just saves a copy of a variable.
17. Queues a "disconnectedCallback" custom elements callback. This will
execute script in the future, but not here.
18. Performs step 15 and 17 in combination for each of the node's
descendants. This will not execute any script.
19. Does not run any script, it performs a requirement of mutation
observers by adding certain things to a list.
20. Does not execute any script, as mutation observer callbacks are
done via microtasks.
21. This will not execute script, as the parent is always the temporary
document in HTML fragment parsing. There are no Document children
changed steps, so this step is a no-op.
We then do layout invalidation which is our own addition, but this also
does not execute any script.
In short, removing a node does not execute any script. It could execute
script in the future, but since this is done by tasks, it will not
execute until we are outside of HTMLParser::parse_html_fragment.
Let's look at adopting a node:
https://dom.spec.whatwg.org/#concept-node-adopt
44dd824764/Userland/Libraries/LibWeb/DOM/Document.cpp (L1414)
1. Does not run script, it just keeps a reference to the temporary
document.
2. No-op, we removed the node above.
3.1. Does not execute script, it simply updates all descendants of
the removed node to be in the context element's node document.
3.2. Does not execute script, see node removal step 17.
3.3. This could potentially execute script, let's have a look at the
adopting steps we have implemented in our engine:
- HTMLTemplateElement::adopted_from()
https://html.spec.whatwg.org/multipage/scripting.html#the-template-element:concept-node-adopt-ext
44dd824764/Userland/Libraries/LibWeb/HTML/HTMLTemplateElement.cpp (L38)
This simply adopts the `<template>` element's DocumentFragment node
into its inert document. This does not execute any script.
We then have our own addition of adopting NodeIterators over to the
context element's document, but this does not execute any script.
In short, adopting a node does not execute any script.
After adopting the nodes to the context element's document, HTML
fragment parsing is complete and the temporary document is no longer
accessible at all.
Document and element event handlers are also not accessible, even if
the event bubbles. This is simply because the temporary document is not
accessible, so tree traversal, IDL event handler attributes and
EventTarget#addEventListener are not accessible, on the document or any
descendants. Document is also not an Element, so element event handler
attributes do not apply.
In summary, this establishes that HTML fragment parsers should not run
any user script or internal C++ code that relies on things set up by
"the end". This means that the attributes set up and events fired by
"the end" are not observable in this case. This may not have explored
every single possible avenue, but the general assertion should still
hold. However, this assertion is violated by "the end" containing two
unconditional "spin the event loop" invocations and causes issues with
live web content, so we seek to avoid them.
As WebKit, Blink and Gecko have been able to get away with doing fast
path optimizations for HTML fragment parsing which don't set up
navigation timing, run events, etc. it is presumed we are able to get
away with not running "the end" for HTML fragment parsing as well.
WebKit: c69be377e1/Source/WebCore/dom/DocumentFragment.cpp (L90-L98)
Blink: 15444426f9/third_party/blink/renderer/core/editing/serializers/serialization.cc (L681-L702)
Gecko: 6fc2f6d533/dom/base/FragmentOrElement.cpp (L1991-L2002)
Removing the call to "the end" fixes at least a couple of issues:
- Inserting `<img>` elements via innerHTML causes us to spin forever.
This regressed in 2413de7e10
This is because `m_load_event_delayer.clear()` is performed inside an
element task callback. Because of the reasons stated above, this will
never execute. This caused us to spin forever on step 8 of "the end",
which is delaying the load event.
This affected Google Docs and Google Maps, never allowing them to
progress after performing this action. I have also seen it cause a
Scorecard Research `<img>` beacon in a `<noscript>` element inserted
via innerHTML to spin forever. This presumably affects many more
sites as well.
Given that the Window load event is not fired for HTML fragment
parsers, spinning the event loop to delay the load event does not
change anything, meaning this step can be skipped entirely.
- Microtask timing is messed up by the unconditional `spin_until`s on
steps 7 and 8.
"Spin the event loop" causes an unconditional microtask checkpoint:
https://html.spec.whatwg.org/multipage/webappapis.html#spin-the-event-loop
44dd824764/Userland/Libraries/LibWeb/HTML/EventLoop/EventLoop.cpp (L54)
> 3. Let old stack be a copy of the JavaScript execution context
stack.
> 4. Empty the JavaScript execution context stack.
> 5. Perform a microtask checkpoint.
> 6.2.1. Replace the JavaScript execution context stack with old
stack.
This broke YouTube with the introduction of custom elements, as
custom elements use microtasks to upgrade elements and call
callbacks. See https://github.com/whatwg/html/issues/8646 for a full
example reduced from YouTube's JavaScript.
Another potential fix for this issue is to remove the above steps
from "spin the event loop". However, since we have another issue with
the use of "spin the event loop", it would be best to just avoid
both calls to it.
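The microtask timing problem can be modelled with a toy event loop.
Everything here is a hypothetical simplification: EventLoop, its fields
and run_script() are invented for illustration and only mirror the
quoted "spin the event loop" steps:

```cpp
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Toy model of the relevant event loop state.
struct EventLoop {
    std::vector<std::string> execution_context_stack;
    std::queue<std::function<void()>> microtasks;
    std::vector<std::string> log;

    void perform_microtask_checkpoint()
    {
        while (!microtasks.empty()) {
            auto task = std::move(microtasks.front());
            microtasks.pop();
            task();
        }
    }

    // Mirrors steps 3-5 and 6.2.1 quoted above.
    void spin_the_event_loop()
    {
        auto old_stack = execution_context_stack;
        execution_context_stack.clear();
        perform_microtask_checkpoint();
        execution_context_stack = old_stack;
    }
};

// Simulates a script that queues a microtask and then sets innerHTML.
// `fragment_parser_spins` models whether fragment parsing reaches the
// unconditional spins in "the end".
std::vector<std::string> run_script(bool fragment_parser_spins)
{
    EventLoop loop;
    loop.execution_context_stack.push_back("script");
    loop.microtasks.push([&loop] { loop.log.push_back("microtask"); });

    if (fragment_parser_spins)
        loop.spin_the_event_loop();
    loop.log.push_back("after innerHTML");

    // The script finishes, the stack empties, and microtasks may run.
    loop.execution_context_stack.pop_back();
    loop.perform_microtask_checkpoint();
    return loop.log;
}
```

With the spin, the microtask runs before the script continues past the
innerHTML assignment; without it, the microtask runs only after the
script has finished, which is the ordering pages expect.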
Considering all of the above, removing the call to "the end" is the way
forward for HTML fragment parsing, as all of it should be a no-op.
This is done by simply returning from "the end" if the HTML parser
was created for HTML fragment parsing.
The end.
The layout node, and therefore the painting box, is frequently destroyed
and recreated. This causes us to forget the cached mouse position we use
to highlight media controls. Move this cached position to the DOM node
instead, which survives relayout.
The link color is what closely resembled the color I was going for on
the machine I originally developed the controls on. But it turns out this
is a very dark blue on most Serenity themes. Instead, hard-code the
original intended color, which is a lighter blue.
We currently use Time::to_seconds() to report a video's duration. The
video, however, may have a sub-second duration. For example, the video
used by the video test page is 12.05 seconds long.
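The truncation is easy to demonstrate with std::chrono, used here
purely as an assumed stand-in for the engine's own Time type (the
function names are hypothetical):

```cpp
#include <chrono>

// Whole seconds only: the fractional part is discarded.
long long whole_seconds(std::chrono::milliseconds duration)
{
    return std::chrono::duration_cast<std::chrono::seconds>(duration).count();
}

// Seconds as a floating-point value, keeping the sub-second part.
double fractional_seconds(std::chrono::milliseconds duration)
{
    return std::chrono::duration<double>(duration).count();
}
```

For a 12050 ms video the first helper reports 12, while the second
reports roughly 12.05.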
`vformat()` can now accept format specifiers of the form
{:'[numeric-type]}. This will output a number with a comma separator
every 3 digits.
For example:
`dbgln("{:'d}", 9999999);` will output 9,999,999.
Binary, octal and hexadecimal numbers can also use this feature, for
example:
`dbgln("{:'x}", 0xffffffff);` will output ff,fff,fff.
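The grouping rule itself can be sketched independently of `vformat()`.
The helper below is hypothetical and is not how AK implements the
specifier; it only reproduces the outputs listed above:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Formats `value` in the given base (2 to 16) with a comma inserted
// every three digits, counted from the least significant digit.
std::string format_grouped(std::uint64_t value, unsigned base = 10)
{
    static char const digits[] = "0123456789abcdef";
    std::string reversed;
    std::size_t digit_count = 0;
    do {
        if (digit_count != 0 && digit_count % 3 == 0)
            reversed.push_back(',');
        reversed.push_back(digits[value % base]);
        value /= base;
        ++digit_count;
    } while (value != 0);
    return std::string(reversed.rbegin(), reversed.rend());
}
```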
Rather than storing static DevicePixels dimensions, treat the desired
pixel sizes as CSSPixels and convert them to DevicePixels.
This was originally developed on a mac with a device-to-CSS-pixel ratio
of 2. Running it on another machine with a ratio of 1 made the controls
appear huge.
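A minimal sketch of the conversion, assuming plain numeric types and
round-to-nearest (the real CSSPixels and DevicePixels types and their
rounding rules may differ):

```cpp
#include <cmath>

// Converts a size in CSS pixels to device pixels for a given
// device-pixels-per-CSS-pixel ratio.
int css_to_device_pixels(double css_pixels, double device_pixels_per_css_pixel)
{
    return static_cast<int>(std::lround(css_pixels * device_pixels_per_css_pixel));
}
```

A control that is 15 CSS pixels wide then takes 30 device pixels at a
ratio of 2 and 15 at a ratio of 1, so it looks the same size on both
machines.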
Use the new futimens syscall to ensure futimens can actually work.
This change for example allows a user to run "touch non-existing-file"
without getting any error, as expected.
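For reference, a touch-like helper built on the POSIX futimens()
interface might look like this. The function name and the creation mode
are assumptions, and this is not the actual utility code:

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Creates the file if it does not exist, then updates its timestamps.
bool touch(char const* path)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0666);
    if (fd < 0)
        return false;
    // A null `times` pointer sets both the access and modification
    // time to the current time.
    int rc = futimens(fd, nullptr);
    close(fd);
    return rc == 0;
}
```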
This matches what the spec does, and consolidates all of the size-
related errors in one spot instead of distributing them throughout the
various uses of enqueue_value_with_size()
This made more sense in the beginning, but now AbstractOperations.h is
so large that there isn't much benefit. This will resolve some ugly
include order issues in the coming commits.
This has several advantages over the manual demuxing currently being
performed. PlaybackManager hides the specific demuxer being used,
which will allow more codecs to be added transparently to LibWeb. It
also provides buffering and controls playback rate for us.
Further, it will allow us to much more easily implement the "media
timeline" to render a timestamp and implement seeking.