Instead, just rely on the invalidation and lazy relayout that happens as
a consequence of mutating the DOM.
This allows multiple keystrokes to coalesce into a single relayout if
necessary, dramatically improving performance when typing text into
form fields on complex pages.
Before this change, we were not detaching paintables from DOM nodes
within shadow subtrees.
This appears to be the main reason that keyboard editing was doing
immediate forced relayout: doing a full layout invalidation meant we'd
build a new layout tree, which then hid the problem with
still-attached paintables.
By detaching them before committing a new layout, we make it possible
for keyboard editing to just use normal relayout, instead of full forced
invalidation & relayout.
CharacterData nodes and their subclasses (most commonly Text) don't have
style, as style is specific to Elements. So there's no need to mark them
for a style update when their content is programmatically changed.
Previously, we returned from the value setter if the specified value
was above the max value. This is not required, as the getter clamps the
returned value to the max value.
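Illustratively (a sketch with made-up names, not the exact LibWeb
code):

    class ValueSketch {
    public:
        // No early return needed: values above the max are accepted...
        void set_value(double value) { m_value = value; }
        // ...because the getter clamps to the max on the way out:
        double value() const { return m_value > m_max ? m_max : m_value; }

    private:
        double m_value { 0 };
        double m_max { 1 };
    };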
Rather than make path segments virtual and refcounted let's store
`Gfx::Path`s as a list of `FloatPoints` and a separate list of commands.
This reduces the size of paths, for example, a `MoveTo` goes from 24
bytes to 9 bytes (one point + a single byte command), and removes a
layer of indirection when accessing segments. A nice little bonus is
transforming a path can now be done by applying the transform to all
points in the path (without looking at the commands).
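A minimal sketch of the storage scheme (illustrative names, not the
exact `Gfx::Path` internals):

    #include <AK/Types.h>
    #include <AK/Vector.h>
    #include <LibGfx/AffineTransform.h>
    #include <LibGfx/Point.h>

    // One command byte per segment; all points live in one flat list.
    enum class Command : u8 { MoveTo, LineTo, QuadraticBezierCurveTo, CubicBezierCurveTo };

    struct PathSketch {
        Vector<Gfx::FloatPoint> points; // a MoveTo is one point (8 bytes)...
        Vector<Command> commands;       // ...plus a single command byte

        // Transforming the path never needs to inspect the commands:
        void transform(Gfx::AffineTransform const& transform)
        {
            for (auto& point : points)
                point = transform.map(point);
        }
    };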
Alongside this, there have been a few minor API changes:
- `path.segments()` has been removed
* All current uses could be replaced by a new `path.is_empty()` API
* There's also now an iterator for looping over `Gfx::Path` segments
- `path.add_path(other_path)` has been removed
* This was a duplicate of `path.append_path(other_path)`
- `path.ensure_subpath(point)` has been removed
* Had one use and is equivalent to an `is_empty()` check + `move_to()`
- `path.close()` and `path.close_all_subpaths()` assume an implicit
`moveto 0,0` if there's no `moveto` at the start of a path (for
consistency with `path.segmentize_path()`).
Only the last point could change behaviour (though in LibWeb/SVGs all
paths start with a `moveto` as per the spec; it's only possible to
construct a path without a starting `moveto` via LibGfx APIs).
Text can be rendered in various ways in PDFs: Filled, stroked,
both filled and stroked, set as clipping path, hidden, or
some combinations thereof.
We don't implement any of this at the moment except "filled".
Hidden text is used in scanned documents: The image of the scan is
drawn in the background, and then OCRd text is "drawn" as hidden
on top of the scanned bitmap. That way, the (hidden) text can be
selected and copied, and it looks like you're selecting text from
the scanned bitmap. Find-in-page also works similarly. (We currently
have neither text selection nor find-in-page, but one day we will.)
Now that we have pretty good support for CCITT and are growing some
support for JBIG2, we draw both the scanned background image and the
foreground text. They're not always perfectly aligned.
This change makes it so that we don't render text that's marked as
hidden. (We still do most of the coordinate math, which will probably
come in handy at some point when we implement text selection.)
This makes these scanned documents appear as they're supposed to
appear (at least in documents where we manage to decode the background
bitmap).
This also adds a debug option to force rendering of hidden text.
Before this change, we would wake up on every event loop iteration to
drive animations in single-frame images. This was a complete waste of
time and caused 100% CPU usage on our main GitHub repo page.
With this change, CPU usage is ~1% when idle on the same page. :^)
This commit introduces a WEB_SET_PROTOTYPE_FOR_INTERFACE macro that
caches the interface name in a local static FlyString. This means that
we only pay for FlyString-from-literal lookup once per browser lifetime
instead of every time the interface is instantiated.
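The pattern is roughly this (hypothetical macro body; the real one's
details differ):

    #include <AK/FlyString.h>

    // The FlyString is built on first use and then reused for the
    // lifetime of the process:
    #define WEB_SET_PROTOTYPE_FOR_INTERFACE(interface_name)       \
        do {                                                      \
            static FlyString name = #interface_name##_fly_string; \
            set_prototype(&ensure_web_prototype(realm, name));    \
        } while (0)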
Document::navigable() can be unpleasantly slow, since we don't have a
direct link between documents and navigables at the moment. So let's not
call it twice when once is enough.
This allows us to skip evaluating selectors like "[foo=bar]" for any
element that doesn't have a "foo" attribute.
Note that the bucket is case-insensitively keyed on the attribute
name, since case sensitivity depends on the evaluation context. This
means we may get some false positives, but no false negatives.
Reduces the number of selectors evaluated by 36% when loading our GitHub
repo at https://github.com/SerenityOS/serenity
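The bucketing idea, sketched with standard containers (the real code
uses AK containers and LibWeb's MatchingRule type):

    #include <string>
    #include <unordered_map>
    #include <vector>

    struct MatchingRule { /* selector, style rule, ... */ };

    // Rules bucketed by the attribute name they require, keyed lowercased.
    std::unordered_map<std::string, std::vector<MatchingRule>> rules_by_attribute_name;

    // Elements only consult buckets for attributes they actually have, so
    // "[foo=bar]" is never evaluated against elements with no "foo" attribute.
    void collect_rules_for_attribute(std::string const& lowercased_name,
                                     std::vector<MatchingRule>& rules_to_run)
    {
        if (auto it = rules_by_attribute_name.find(lowercased_name); it != rules_by_attribute_name.end())
            rules_to_run.insert(rules_to_run.end(), it->second.begin(), it->second.end());
    }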
We already grow the "rules to run" vector before appending to it, so we
can actually use unchecked_append() here and avoid the "needs to grow"
checks every time we append to it.
This takes appending from 3% to <1% when loading our GitHub repo.
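The pattern is roughly this (sketch, with simplified types):

    #include <AK/Vector.h>

    struct MatchingRule { /* ... */ };

    void append_bucket(Vector<MatchingRule>& rules_to_run, Vector<MatchingRule> const& bucket)
    {
        // Grow once up front...
        rules_to_run.ensure_capacity(rules_to_run.size() + bucket.size());
        // ...then append without the per-element "needs to grow" check:
        for (auto const& rule : bucket)
            rules_to_run.unchecked_append(rule);
    }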
When matching a CSS attribute selector against an HTML element, the
attribute name is case-insensitive. Before this change, that meant we
had to call equals_ignoring_ascii_case() on all the attribute names.
We now cache the attribute name lowercased on each Attr node, which
allows us to do FlyString-to-FlyString comparison (simple pointer
comparison).
This brings attribute selector matching from 6% to <1% when loading our
GitHub repo at https://github.com/SerenityOS/serenity
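A sketch of the cached name (illustrative; the real Attr node carries
much more state):

    #include <AK/FlyString.h>

    class Attr {
    public:
        Attr(FlyString name, FlyString lowercased_name)
            : m_name(move(name))
            , m_lowercased_name(move(lowercased_name)) // computed once, at construction
        {
        }

        // Matching compares this against the selector's lowercased attribute
        // name; FlyString == FlyString is a pointer comparison.
        FlyString const& lowercased_name() const { return m_lowercased_name; }

    private:
        FlyString m_name;
        FlyString m_lowercased_name;
    };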
It seems to do the right thing already, and nothing in the spec says
not to do this as far as I can tell.
With this, we can finally decode
Tests/LibGfx/test-inputs/jbig2/bitmap.jbig2 and add a test for
decoding simple arithmetic-coded images.
This errors out on many special cases. None of those seem to be hit
in practice (with the exception of TPGDON, which is used in a handful of
PDFs. I have an implementation of that locally, but I'll put that
in a separate PR. The code for it is straightforward, but adding a
test for it is a bit involved.)
With this, we can decode about half of the JBIG2 images in my PDF
test dataset.
In practice, everything uses white backgrounds and operators `or`
or `xor` to turn them black, at least for the simple images we're
about to be able to decode.
To make sure we don't forget implementing this for real once needed,
reject other ops, and also reject black backgrounds (because 1 | 0
is 1, not 0 like our overwrite implementation will produce).
This means we have to remove a test, but since this scenario doesn't
seem to happen in practice, that seems ok.
The context can vary for every bit we read.
This does not affect the one use in the test which reuses the same
context for all bits, but it is necessary for future changes.
The API value of a <textarea> element is its raw value with normalized
newlines. This should be used in a couple of places where we currently
use the raw value.
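The normalization, sketched with std::string (the real code uses AK
strings):

    #include <string>

    // The API value is the raw value with CRLF and lone CR normalized to LF:
    std::string api_value_from_raw(std::string const& raw)
    {
        std::string out;
        out.reserve(raw.size());
        for (size_t i = 0; i < raw.size(); ++i) {
            if (raw[i] == '\r') {
                out += '\n';
                if (i + 1 < raw.size() && raw[i + 1] == '\n')
                    ++i; // consume the LF of a CRLF pair
            } else {
                out += raw[i];
            }
        }
        return out;
    }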
Patch up existing style properties instead of using the regular style
invalidation path, which requires rule matching for each element in the
invalidated subtree.
- !important properties: this change introduces a flag used to skip the
update of animated properties overridden by !important.
- inherited animated properties: for now, these are invalidated by
traversing animated element's subtree to propagate the update.
- StyleProperties has a separate array for animated properties that
allows the removal of animated properties after an animation has
ended, without requiring full style invalidation.
Rather than adding a bunch of `get_*_from_mime_type` functions, add just
one to get the Core::MimeType instance. We will need multiple fields at
once in Browser.
I think the context normally changes for every bit. But this here
is enough to correctly decode the test bitstream in Annex H.2 in
the spec, which seems like a good checkpoint.
The internals of the decoder use spec naming, to make the code
look virtually identical to what's in the spec. (Even so, I managed
to put in several typos that took a while to track down.)
Crypto::UnsignedBigInt::export_data unexpectedly does not remove
leading zero bytes even when the corresponding parameter is passed.
The caller must manually adjust for the location of the zero bytes.
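The manual adjustment looks roughly like this (sketch, with a
hypothetical helper):

    #include <AK/Span.h>

    // export_data() may leave leading zero bytes in place, so skip past
    // them before using the result:
    ReadonlyBytes trim_leading_zeros(ReadonlyBytes bytes)
    {
        size_t i = 0;
        while (i < bytes.size() && bytes[i] == 0)
            ++i;
        return bytes.slice(i);
    }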
EXTTEMPLATE=1 was added later and doesn't seem to be used much in
practice -- it doesn't appear in any simple generic region in any PDF
I've tested so far, at least. Since the spec contradicts itself on
what to do with these as far as I can tell, error out on them for now,
and add support once we find actual files using this, so that we can
check that our implementation actually works.
Deduplicate the data reading for the different cases, and initialize
all adaptive template pixels to zero to make that possible.
Other than prohibiting EXTTEMPLATE=1, no behavior change.
By following the spec more closely, we can actually make this function
a bit more efficient (by comparing the parent against the document
instead of looking for the first element child of the document).
If a selector must match a pseudo element, or must match the root
element, we now cache that information in the MatchingRule struct.
We also introduce separate buckets for these rules, so we can avoid
running them altogether if the current element can't possibly match.
This cuts the number of selectors evaluated by 32% when loading our
GitHub repo page https://github.com/SerenityOS/serenity
We frequently end up matching hundreds or even thousands of rules. By
giving this vector some inline capacity, we avoid a lot of the
repetitive churn from dynamically growing it all the way from 0
capacity.
This is required to upload files to GitHub. Unfortunately, this is not
currently testable with our test infrastructure. This path is only hit
from HTTP/S uploads, whereas all of our tests are limited to file://.
We were unconditionally creating new File objects for all Blob-type
values passed to `FormData.append`. We should only do so if the value is
not already a File object (or if the `filename` attribute is present).
We must also carry MIME type information forward from the underlying
Blob object.
This does not implement any of the IDL methods, but GitHub requires
the interface to exist to upload files via an <input type="file">
element.
Their JS handles uploads via this element and via drag-and-drop in one
function, and checks if the uploaded file is `instanceof DataTransfer` to
decide how to handle it.
The memmem() call passes `data.size() - 19 - sizeof(u32)` for big_len
(18 prefix bytes skipped, the flag byte, and the trailing u32), so the
buffer needs to be at least that large.
Should fix https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=67332
Whether we mount from a loop device or another source, the user might
want to obfuscate the given source for security reasons, so this
option ensures that happens.
If passed during a mount, the source will be hidden when reading from
the /sys/kernel/df node.
This patch implements and tests window.crypto.subtle.generateKey with
an RSA-OAEP algorithm. In order for the types to be happy, the
KeyAlgorithms objects are moved to their own .h/.cpp pair, and the new
KeyAlgorithms for RSA are added there.
This patch throws away some of the spec suggestions for how to implement
the normalize_algorithm AO and uses a new pattern that we can actually
extend in our C++.
Also update CryptoKey to store the key data.
This causes a behavior change in which the read FD is now non-blocking.
This is intentional, as this change avoids a deadlock between RS and
WebContent, where WC could block while reading from the request FD,
while RS is blocked sending a message to WC.
The underlying issue here isn't quite understood yet, but for some
reason, when we defer this connection, we ultimately end up blocking
indefinitely on macOS when a subsequent StartRequest message tries to
send its request FD over IPC. We should continue investigating that
issue, but for now, this lets us use RequestServer more reliably on
macOS.
This was copying the vector behind our backs; let's remove that and
make the copying explicit by putting it behind COWVector::mutable_at().
This is a further 64% performance improvement on Wasm validation.
These vector copies accounted for more than 50% of the current runtime
of the validator on a large wasm file, this commit makes them
copy-on-write to avoid the copies where possible, gaining nearly a 50%
speedup.
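The idea, sketched with standard types (the real COWVector is built on
AK types):

    #include <cstddef>
    #include <memory>
    #include <vector>

    template<typename T>
    class COWVector {
    public:
        // Reads share the underlying storage freely:
        T const& at(size_t index) const { return (*m_storage)[index]; }

        // Writes deep-copy the storage first, but only if it's shared:
        T& mutable_at(size_t index)
        {
            if (m_storage.use_count() > 1)
                m_storage = std::make_shared<std::vector<T>>(*m_storage);
            return (*m_storage)[index];
        }

    private:
        std::shared_ptr<std::vector<T>> m_storage { std::make_shared<std::vector<T>>() };
    };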
When inserting a node into a parent, any live DOM ranges that reference
the parent may need to be updated. The spec does this by increasing or
decreasing the start/end offsets of each live range *before* actually
performing the insertion.
This caused us to crash with a verification failure, since it was
possible to set the range offset to an invalid value (that would go on
to immediately become valid after the insertion was finished).
This patch fixes the issue by adding special badged helpers on Range for
Node to reach into it and increase/decrease the offsets during node
insertion. This skips the offset validity check and actually makes our
code read slightly more like the spec.
Found by Domato :^)
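The badged helpers mentioned above look roughly like this
(simplified):

    #include <AK/Badge.h>
    #include <AK/Types.h>

    class Node;

    // Only Node can mint a Badge<Node>, so only Node can call these.
    // They skip the offset validity checks of the public setters.
    class Range {
    public:
        void increase_start_offset(Badge<Node>, u32 count) { m_start_offset += count; }
        void decrease_start_offset(Badge<Node>, u32 count) { m_start_offset -= count; }

    private:
        u32 m_start_offset { 0 };
    };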
According to the WebIDL specification, any leading underscores in
identifier names are ignored. A leading underscore is used when an
identifier would otherwise be a reserved word.
Rather than try to lay out masks normally, this updates the TreeBuilder
to create layout nodes for masks as a child of their user (i.e. the
masked element). This allows each use of a mask to be laid out
differently, which makes supporting `maskContentUnits=objectBoundingBox`
fairly easy.
The `SVGFormattingContext` is then updated to lay out masks last (as
their sizing may depend on their parent), and treats them like
viewports.
This is pretty ad-hoc, but the SVG specification does not give any
guidance on how to actually implement this.
When navigating to about:srcdoc, we try to populate the session
history by calling populate_session_history_entry_document. If the
resource is an about:srcdoc, this method relies on the
document_resource in the SessionHistoryEntry holding a String in order
to call the right method to create the navigation params.
However, when navigating to about:srcdoc directly, document_resource
holds an Empty, which leads populate_session_history_entry_document to
call the wrong method to create the navigation params.
This fixes the issue by populating document_resource with an empty
string if it has an Empty and we're dealing with an about:srcdoc.
Issue: #23216
Instead of crashing with a TODO() on half of the test cases generated by
Domato, let's just return a zeroed-out SVGAnimatedLength or
SVGAnimatedNumber from getters that return them.
We'll eventually have to implement these correctly, but crashing is not
productive since it blocks us from finding other issues.
The loop that was supposed to check the chain of previous or next
siblings had a logic mistake where it would never traverse the chain,
so we would get stuck looking at the immediate sibling forever.
After removing an iframe from the DOM, its contentWindow will be
detached from its browsing context, per spec.
Because the contentWindow is still accessible, we cannot assume that
Window objects always have an associated browsing context.
This needs to be fixed in the spec, but let's add a sensible null check
in the meantime.
Since SVG gradients can reference each other, we have to keep track of
visited gradients when traversing the link chain, or we will recurse
infinitely when there's a reference cycle.
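A sketch of the cycle-safe traversal (illustrative names and types):

    #include <AK/HashTable.h>

    struct GradientSketch {
        GradientSketch const* link { nullptr }; // "href" target, may form a cycle
    };

    GradientSketch const* resolve_link_chain(GradientSketch const& start)
    {
        HashTable<GradientSketch const*> visited;
        auto const* current = &start;
        // Follow the link chain, but stop before revisiting any gradient:
        while (current->link && !visited.contains(current->link)) {
            visited.set(current);
            current = current->link;
        }
        return current;
    }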
Instead of creating a generic Layout::Box, make a BlockContainer. This
allows them to be laid out by BFC, which is better than nothing(?),
even if it's not going to be correct at all.
Normally, assigning to e.g. document.body.onload will forward to
window.onload. However, in a detached DOM tree, there is no associated
window, so we have nowhere to forward to, making this a no-op.
The bulk of this change is making Document::window() return a nullable
pointer, as documents created by DOMParser or DOMImplementation do not
have an associated window object, and so must be able to return null
from here.
That's not actually a DOM invariant, just something the HTML parser
refuses to build. You can still construct table-less th and td elements
using the DOM API.
If the "MMR" bit is set, the generic region decoding procedure just
uses ITU-T T.6 (2D CCITT), which we already have an implementation of.
In practice, this is used almost never in .jbig2 files and in none of
the PDFs I have.
The two files that do use MMR are:
1) JBIG2_ConformanceData-A20180829/F01_200_TT9.jb2
2) 042_3.jb2 from the ghostpdl tests
The first uses an immediate _lossless_ generic region, which we don't
implement yet (but I think it should just forward to the normal
immediate generic region code? Not in this commit, though). The second
uses a regular immediate generic region, and we can decode it now:
Build/lagom/bin/image -o out.png \
path/to/ghostpdl/tests/jbig2/042_3.jb2
While, IMO, the change makes sense on its own, as flattening this
function would just duplicate the code from `draw_glyph()` with no
benefit, this is not what motivated this patch.
When compiling with debug information and ASAN, GCC 13.2 would issue
this warning:
Userland/Libraries/LibGfx/Painter.cpp: In member function ‘void Gfx::Painter::draw_glyph(Gfx::FloatPoint, u32, Gfx::Color)’:
Userland/Libraries/LibGfx/Painter.cpp:1354:14: note: variable tracking size limit exceeded with ‘-fvar-tracking-assignments’, retrying without
 1354 | FLATTEN void Painter::draw_glyph(FloatPoint point, u32 code_point, Color color)
      |              ^~~~~~~
From what I've read online, this is caused by some limit on the number
of symbols in the compiler's internal data structures. People at Google
have fixed this warning by splitting functions:
https://codereview.chromium.org/1164893003
Besides getting rid of the warning, this also drastically improves
compilation time, going from 1min27s to 59s on my machine.
Performance handles the document origin time correctly, and prevents
these times from being unusually large. Also initialize the
DocumentTimeline time in the constructor, since these can be created
from JS.
With this, `image` can convert any jbig2 file, as long as it's
black (or white), and LibPDF can draw jbig2 files (again, as long
as they only contain a single color stored in just a
PageInformation segment).
Also make scan_for_page_size() not early return, so that it has the
same behavior as the main decoding loop. (Multiple page information
segments for a single page are likely invalid and don't happen in
practice, so this is mostly an academic change.)
Add a BitBuffer class to store the bit image data, and introduce a
Page struct for storing data associated with a page. We currently
only handle a single page, but a) this makes it easier to decode
multiple pages in the future if we want, and b) it makes the code
easier to understand.
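A sketch of the two structures (simplified):

    #include <AK/OwnPtr.h>
    #include <AK/Types.h>
    #include <AK/Vector.h>

    // One bit per pixel, packed MSB-first within each byte:
    class BitBuffer {
    public:
        BitBuffer(size_t width, size_t height)
            : m_pitch((width + 7) / 8)
        {
            m_data.resize(m_pitch * height);
        }

        bool get_bit(size_t x, size_t y) const
        {
            return (m_data[y * m_pitch + x / 8] >> (7 - x % 8)) & 1;
        }

        void set_bit(size_t x, size_t y, bool bit)
        {
            u8 mask = 1 << (7 - x % 8);
            auto& byte = m_data[y * m_pitch + x / 8];
            byte = bit ? (byte | mask) : (byte & ~mask);
        }

    private:
        size_t m_pitch { 0 }; // bytes per row
        Vector<u8> m_data;
    };

    // Data associated with the (currently single) page:
    struct Page {
        u32 width { 0 };
        u32 height { 0 };
        OwnPtr<BitBuffer> bits;
    };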
7.4.8.2 Page bitmap height:
"In some cases, this value may not be known at the time that the page
information segment is written. In this case, this field must contain
0xFFFFFFFF, and the actual page height may be communicated later, once
it is known."
Sounds like the spec guarantees that that's the number of the first
page.
(In practice, all but one of all jbig2 files I've found contain just
page 1. PDFs almost always contain just page 1, and very rarely a
page 0 for globally shared parameters.)
If the provided ID is smaller than the corner clipping vector, we would
shrink the vector to match. This causes a crash when we have nested
PaintContext instances (as these IDs are allocated by PaintContext),
each of which performs radius painting.
This is seen on https://www.strava.com/login when it loads a reCAPTCHA.
The outer div has a border radius, which contains the reCAPTCHA in an
iframe. That iframe contains an SVG which also has a border radius.
Now, if an element belongs to a shadow tree, we use only the style
sheets from the corresponding shadow root during style computation,
instead of using all available style sheets as was the case
previously.
The only exception is the user agent style sheets, which are still
taken into account for all elements.
Tests/LibWeb/Layout/input/input-element-with-display-inline.html
is affected because the document's style no longer affects the shadow
tree of the input element, which is how it's supposed to be.
Co-authored-by: Simon Wanner <simon+git@skyrising.xyz>
Doing that will allow us to get a list of style sheets for each shadow
root from StyleComputer without having to traverse the entire tree in
upcoming changes.
If a style element belongs to a shadow tree, its CSSStyleSheet is now
added to the corresponding ShadowRoot instead of the document.
Co-authored-by: Simon Wanner <simon+git@skyrising.xyz>
This patch also makes FlexFormattingContext::calculate_static_position
use computed values for margins and borders, since this function may be
called before the box's state has been finalized.
This allows `file` to correctly print the dimensions of a .jbig2 file,
and it allows us to write a test that covers much of the code
written so far.
Several ramifications:
* /JBIG2Globals is an indirect reference, which means we now need
a Document for unfiltering. (Technically, other decode parameters
can also be indirect objects and we should use the Document to
resolve() those too, but in practice it only seems to be needed
for /JBIG2Globals.)
* Since /JBIG2Globals are so rare, we just parse once for each
image that uses them, and decode_embedded() now receives a
Vector<ReadonlyBytes> with all sections of sequences of
segments.
* Internally, decode_segment_headers() is now called several times
for embedded JBIG2s with multiple such sections (e.g. PDFs with
/JBIG2Globals).
* That means `data` is now no longer part of JBIG2LoadingContext
and things get slightly reshuffled due to this.
This completes the LibPDF part of JBIG2 support. Once LibGfx
implements actual decoding of JBIG2s, things should start to
Just Work in PDFs.
They're in different places for Sequential/Embedded (right after
the header) and RandomAccess (which has all headers first, followed
by all data bits).
We don't do anything with the data yet, but now everything's in
place to actually process segment data.
Except for /JBIG2Globals, which we bail out on for now. In my 1000
files, 13 use JBIG2, and of those, 2 use JBIG2Globals (0000372.pdf e.g.
page 11 and 0000857.pdf e.g. page 1), and only one (the latter) of the
two uses the same JBIG2Globals stream for more than a single image.
JBIG2ImageDecoderPlugin cannot decode the data yet, so no behavior
change, but with `#define JBIG2_DEBUG 1` at the top of that file,
it now prints segment header info for PDFs containing JBIG2 data :^)
With `#define JBIG2_DEBUG 1` at the top of the file:
% Build/lagom/bin/image --no-output \
.../JBIG2_ConformanceData-A20180829/F01_200_TT10.jb2
JBIG2LoadingContext: Organization: 0 (Sequential)
JBIG2LoadingContext: Number of pages: 1
Segment number: 0
Segment type: 48
Referred to segment count: 0
Segment page association: 1
Segment data length: 19
Segment number: 1
Segment type: 39
Referred to segment count: 0
Segment page association: 1
Segment data length: 12666
Segment number: 2
Segment type: 49
Referred to segment count: 0
Segment page association: 1
Segment data length: 0
Runtime error: JBIG2ImageDecoderPlugin: Draw the rest of the owl
With `#define JBIG2_DEBUG 1` at the top of the file:
% Build/lagom/bin/image --no-output \
.../JBIG2_ConformanceData-A20180829/F01_200_TT10.jb2
JBIG2LoadingContext: Organization: 0 (Sequential)
JBIG2LoadingContext: Number of pages: 1
Segment number: 0
Segment type: 48
Referred to segment count: 0
Segment page association: 1
Segment data length: 19
Runtime error: JBIG2ImageDecoderPlugin: Draw the rest of the owl
This is closer to what the spec instructs us to do, and matches how
associations are maintained in the timelines. Also note that the removed
destructor logic is not necessary since we visit the associated
animations anyway.
We create labels in the Browser from text on web documents, which may
use carriage returns. LibGfx will split lines on CR already, but LibGUI
would not consider CRs when computing the height of the containing
widget. The result was text that would not fit in the label's box.
Fixes pages 17-19 on
https://www.iro.umontreal.ca/~feeley/papers/ChevalierBoisvertFeeleyECOOP15.pdf
Calling the fill handler after painting the stroke, as we did
previously, doesn't work, since we need to set up the clip before both
stroke and fill,
unset it after both. The duplication is a bit unfortunate, but also
minor.
ObservableArray inherits from JS::Array and overrides `internal_set`
and `internal_delete` to run an interceptor callback when an indexed
item is added or deleted.
This is for the case where a decoder with a weak or nonexistent
sniff() method thinks it can decode an image. This should not be
treated as an error.
No behavior change.
We are currently using Core::DateTime, which is meant to represent local
time. However, we are doing no conversion between the parsed time in UTC
and local time, so we end up comparing time stamps from different time
zones.
Instead, store the parsed times as UnixDateTime, which is UTC. Then we
can always compare the parsed times against the current UTC time.
This also lets us store parsed milliseconds.
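The comparison then becomes trivial (sketch, hypothetical helper):

    #include <AK/Time.h>

    // Both sides are UTC, so comparing them is meaningful:
    bool is_expired(UnixDateTime const& expiry_time)
    {
        return expiry_time <= UnixDateTime::now();
    }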
Displaying a GML preview in HackStudio seems to be broken at the moment,
but this change will be needed once it does work again, so might as well
make it now, while I'm aware of the issue.