Non-recursive pseudo classes are easy to evaluate, so let's allow them
on the fast path.
Increases fast path coverage when loading our GitHub repo from 48% to
56% of all selectors evaluated.
If we determine that a selector is simple enough, we now run it using a
special matching loop that traverses up the DOM ancestor chain without
recursion.
The criteria for this fast path are:
- All combinators involved must be either descendant or child.
- Only tag name, class, ID and attribute selectors allowed.
It's definitely possible to increase the coverage of this fast path,
but this first version already provides a substantial reduction in time
spent evaluating selectors.
48% of the selectors evaluated when loading our GitHub repo are now
using this fast path.
18% speed-up on the "Descendant and child combinators" subtest of
StyleBench. :^)
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.
This change has two main benefits:
* Moving AK back more towards being an agnostic library that can
be used between the kernel and userspace. URL has never really fit
that description - and is not used in the kernel.
* URL _should_ depend on LibUnicode, as it needs punnycode support.
However, it's not really possible to do this inside of AK as it can't
depend on any external library. This change brings us a little closer
to being able to do that, but unfortunately we aren't there quite
yet, as the code generators depend on LibCore.
When matching a CSS attribute selector against an HTML element, the
attribute name is case-insensitive. Before this change, that meant we
had to call equals_ignoring_ascii_case() on all the attribute names.
We now cache the attribute name lowercased on each Attr node, which
allows us to do FlyString-to-FlyString comparison (simple pointer
comparison).
This brings attribute selector matching from 6% to <1% when loading our
GitHub repo at https://github.com/SerenityOS/serenity
If a short string is used for the attribute value, then the result of:
```cpp
auto const view = element.attribute(attribute_name).value_or({})
.bytes_as_string_view().split_view(' ');
```
is an array of string views pointing into a temporarily allocated
string.
With this change we keep string on stack until the end of scope.
Page that allows to reproduce the problem.
```html
<!DOCTYPE html><style>
div[data-info~="a"] {
background-color: yellow;
}
</style><div data-info="a">a</div>
```
No functional impact intended. This is just a more complicated way of
writing what we have now.
The goal of this commit is so that we are able to store the 'name' of a
pseudo element for use in serializing 'unknown -webkit-
pseudo-elements', see:
https://www.w3.org/TR/selectors-4/#compat
This is quite awkward, as in pretty much all cases just the selector
type enum is enough, but we will need to cache the name for serializing
these unknown selectors. I can't figure out any reason why we would need
this name anywhere else in the engine, so pretty much everywhere is
still just passing around this raw enum. But this change will allow us
to easily store the name inside of this new struct for when it is needed
for serialization, once those webkit unknown elements are supported by
our engine.
This commit removes DeprecatedString's "null" state, and replaces all
its users with one of the following:
- A normal, empty DeprecatedString
- Optional<DeprecatedString>
Note that null states of DeprecatedFlyString/StringView/etc are *not*
affected by this commit. However, DeprecatedString::empty() is now
considered equal to a null StringView.
Which pretty much needs to be done together due to the amount of places
where they are compared together.
This also involves porting over StackOfOpenElements over to FlyString
from DeprecatedFly string to prevent a gazillion calls to
`.to_deprecated_fly_string` calls in HTMLParser.
These apply to any elements that have some kind of open/closed state.
The spec suggests `<details>`, `<dialog>`, and `<select>`, so that's
what I've supported here. Only `<details>` is fleshed out right now,
but once the others are, these pseudo-classes should work
automatically. :^)
This should allow us to add a Element::attribute which returns an
Optional<String>. Eventually all callers should be ported to switch from
the DeprecatedString version, but in the meantime, this should allow us
to port some more IDL interfaces away from DeprecatedString.
This matches if the element has a placeholder, and that placeholder is
currently visible. This applies to `<input>` and `<textarea>` elements,
but our `<textarea>` is very limited so does not support placeholders.
As noted in a9620d8784 we don't currently
set the target element so this does not function, so no tests. But it
should work once we have a fleshed out Navigables implementation. :^)
We don't yet set the Document's target element in most cases, so this
does not function very well. But that will improve once we *do* set it,
which involves a more complete Navigables implementation.
For now, we parse these, but don't actually consider the namespace when
matching them. `DOM::Element` does not (yet) store attribute namespaces
so we can't check what they are.
Use FlyString::from_deprecated_fly_string() in these instances instead
of FlyString::from_utf8(). As we convert to new FlyString/String we want
to be aware of these potential unnecessary allocations.
This makes selector matching significantly faster by not forcing us to
convert from FlyString to DeprecatedFlyString when matching class
selectors. :^)
DeprecatedFlyString relies heavily on DeprecatedString's StringImpl, so
let's rename it to A) match the name of DeprecatedString, B) write a new
FlyString class that is tied to String.
I thought the spec listing out the elements again was an oversight, but
it isn't, as simply inverting "is_actually_disabled" makes :enabled
apply to every element.
Previously we only considered an element disabled if it was an <input>
element with the disabled attribute, but there's way more elements that
apply with more nuanced disabled/enabled rules.
When matching selectors in HTML documents, we know that all the elements
have lowercase local names already (the parser makes sure of this.)
Style sheets still need to remember the original name strings, in case
we want to match against non-HTML content like XML/SVG. To make the
common HTML case faster, we now cache a lowercase version of the name
with each type/class/id SimpleSelector.
This makes tag type checks O(1) instead of O(n).