Rewrites the grid area building to accurately identify areas that span
multiple rows. We can also now recognize invalid areas, but do not
handle them yet.
With the recording painter, the actual painting operations are delayed,
so if multiple corner clippers are constructed and they use a shared
bitmap, they can interfere with each other. The use of this shared
bitmap was somewhat questionable anyway, so this is not much of a loss.
This fixes the border-radius.html test page.
If a PDF uses `/CustomName cs` and `/CustomName` then points at just a
name like `/DeviceGray` instead of an array, that's ok. Just using
`/DeviceGray cs` is simpler, so this extra level of indirection is
somewhat rare in practice, but it's valid and it does happen. So support
it.
We already have a helper that does the right thing that we just need to
call.
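For illustration, both forms side by side (resource names made up); the
indirect form is what gets handled here:
```
% Simple form, directly in the content stream:
/DeviceGray cs
% Indirect form: the name is looked up in the page's /ColorSpace
% resources, and the resource entry is itself just a name, not an array:
/CS0 cs
...
/Resources << /ColorSpace << /CS0 /DeviceGray >> >>
```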
Together with #21524 and #21525, reduces number of crashes on 300 random
PDFs from the web (the first 300 from 0000.zip from
https://pdfa.org/new-large-scale-pdf-corpus-now-publicly-available/)
from 29 (9%) to 25 (8%).
This fixes a small bug from 39b2eed3f6: That commit tried to disable
filters for the very first object read, for the case covered in
Tests/LibPDF/password-is-sup.pdf.
However, it accidentally also disabled filters by default.
Most of the time, this isn't really a difference: We call
`set_filters_enabled(true);` very early in
`DocumentParser::initialize_linearization_dict()`, which explicitly
enables filters, and `initialize_linearization_dict()` is the very
first thing called in `DocumentParser::initialize()`.
But there's an early exit in `initialize_linearization_dict()`
for when nothing that looks like an indirect object follows the
header, and in that case we used to not enable filtering and
would hand compressed streams to the operand parser.
(And due to a 2nd bug, we'd even do this if the header line was
followed by an empty line.)
0000990.pdf from 0000.zip from
https://pdfa.org/new-large-scale-pdf-corpus-now-publicly-available/
starts like so:
```
%PDF-1.7
4 0 obj
```
parse_header() used to put the cursor at the start of the 2nd,
empty, line. initialize_linearization_dict() would then check
`m_reader.matches_number()` to see if there could possibly
be a linearization dict.
In this case, there isn't one, but we should detect linearization
dicts even if they're separated by whitespace from the first line.
Grid items should respect alignment properties if top/right/bottom/left
are not specified.
This change adds a separate implementation of
layout_absolutely_positioned_element that is extended with support for
alignment.
If the first pass of rows sizing results in the container's automatic
height being less than the specified min-height, we need to run a
second pass using the updated available space.
Per spec:
"If the color space is one that can be specified by a name and no
additional parameters (DeviceGray, DeviceRGB, DeviceCMYK, and certain
cases of Pattern), the name may be specified directly."
We still don't implement /Pattern color spaces, but now we no longer
crash trying to look up the potentially-nonexistent /ColorSpace
dictionary on the page object when /Pattern is used directly as color
space name.
On top of #21514, reduces number of crashes on 300 random PDFs from the
web (the first 300 from 0000.zip from
https://pdfa.org/new-large-scale-pdf-corpus-now-publicly-available/)
from 42 (14%) to 34 (11%).
It used to be called ColorSpaceFamily::never_needs_parameters().
But in the cpp file, the macro arg was called ever_needs_parameters,
and the spec says
"If the color space is one that can be specified by a name and no
additional parameters (DeviceGray, DeviceRGB, DeviceCMYK, and certain
cases of Pattern), the name may be specified directly."
so let's use that language here.
No behavior change.
We now no longer crash on images that use an ICC-based color space.
Reduces number of crashes on 300 random PDFs from the web (the first 300
from 0000.zip from
https://pdfa.org/new-large-scale-pdf-corpus-now-publicly-available/)
from 81 (27%) to 64 (21%).
Also fixes all remaining crashes in
411_getting_started_with_instruments.pdf and
513_high_efficiency_image_file_format.pdf.
This is meant to serve as the method all Ladybird chromes can use to
highlight the eTLD+1 substring of the URL. It uses the Public Suffix
List to break the URL into 3 parts: the scheme and subdomain, the
eTLD+1, and all remaining parts (port, path, query, etc.).
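A hypothetical example of the split (example.co.uk is the eTLD+1 here,
since co.uk is the public suffix):
```
URL:                 https://news.example.co.uk:8443/latest?q=1
scheme + subdomain:  https://news.
eTLD+1:              example.co.uk
remaining parts:     :8443/latest?q=1
```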
Fixes two places in the child navigable destroy procedure where we used
the content navigable instead of the container's navigable.
With this change, iframe's nested histories are actually destroyed
along with the document that created them.
Implements the `ri` operator, and the `RI` key in a graphics state
dictionary.
We don't do anything yet with the color rendering intent except
store it.
No behavior change except removing a few "not yet implemented"
messages.
Follow-up to #21489. There, I made us use a RAII object.
That's great, but if the embedded instruction stream pushes
its own graphics state, then an early return would cause us to
not process graphics state pop instructions in the embedded stream.
To fix this, remember the graphics stack depth before entering
the nested instruction stream, and explicitly shrink the stack back
to that size upon exit.
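A minimal sketch of the combination, with made-up names rather than the
actual LibPDF types:
```
// Sketch: a RAII saver restores the state we pushed ourselves, and the
// remembered depth cleans up any states the embedded stream pushed but
// never popped (e.g. because execution bailed out early).
#include <cstddef>
#include <vector>

struct GraphicsState { /* colors, CTM, clip, ... */ };

struct Renderer {
    std::vector<GraphicsState> m_graphics_state_stack { GraphicsState {} };

    struct ScopedState {
        Renderer& renderer;
        explicit ScopedState(Renderer& r)
            : renderer(r)
        {
            renderer.m_graphics_state_stack.push_back(renderer.m_graphics_state_stack.back());
        }
        ~ScopedState() { renderer.m_graphics_state_stack.pop_back(); }
    };

    bool execute_embedded_stream(); // defined elsewhere; may push ('q') more states than it pops ('Q')

    // Returns false on error; the stack is shrunk back either way.
    bool run_nested_instruction_stream()
    {
        ScopedState saved { *this };
        size_t depth_on_entry = m_graphics_state_stack.size();

        bool ok = execute_embedded_stream();

        // Drop any states the nested stream failed to pop; the RAII object
        // then pops the one we pushed ourselves.
        while (m_graphics_state_stack.size() > depth_on_entry)
            m_graphics_state_stack.pop_back();
        return ok;
    }
};
```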
Enables us to render all pages of
https://devstreaming-cdn.apple.com/videos/wwdc/2017/821kjtggolzxsv/821/821_get_started_with_display_p3.pdf
without crashing.
BMP files encode the direction of the rows with the sign of the height.
Our BMP decoder already makes all the proper checks; however, when
constructing the Gfx::Bitmap, we didn't actually make the height positive.
Boog neutralized :^)
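A small sketch of the relationship (illustrative, not the decoder's
actual code):
```
// BMP: a positive height means bottom-up rows, a negative one means
// top-down rows. The bitmap we construct must use the absolute value.
#include <cstdint>

struct BmpHeightInfo {
    int32_t raw_height { 0 };

    bool is_top_down() const { return raw_height < 0; }
    uint32_t absolute_height() const
    {
        return raw_height < 0 ? static_cast<uint32_t>(-static_cast<int64_t>(raw_height))
                              : static_cast<uint32_t>(raw_height);
    }
};
```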
This change separates the box outer shadow metrics calculations into a
separate function. This function is then used to obtain the shadow
bounding rectangle and skip painting if the entire shadow is outside
of the viewport.
Previously, if one operator returned an error, the TRY() would cause
us to return without restoring the outer graphics state, leading to
problems such as handing a 3-tuple to a grayscale color space
(because the inner object set up a grayscale color space that we
failed to dispose of).
Makes us crash later on page 43 of
https://devstreaming-cdn.apple.com/videos/wwdc/2017/821kjtggolzxsv/821/821_get_started_with_display_p3.pdf
The spec asks us to perform some calculations that quickly exceed an
`u64`, but instead of jumping through hoops we can rely on our AK
implementation of floating point formatting to come up with the
correctly rounded result.
Note that most other JS engines seem to diverge from the spec as well
and fall back to a generic dtoa path.
Font programs are bytecode programs defining glyphs. If several glyphs
share a piece of outline, that opcode sequence can be put in a
subroutine ("subr") table and the definition of those glyphs can then
call that subroutine by number, to reduce file size.
CFF fonts can in theory contain multiple fonts, and so there's a global
subr table shared by all the fonts in one CFF, and a local per-font
subr table. We used to only implement the local subr table, now we
implement both.
(We only support one font per CFF, and at least in PDF files, that's
all that's ever used. So a global subr table isn't very useful.
But the spec explicitly allows it -- "Global subroutines may be used in
a FontSet even if it only contains one font." -- and it happens in
practice.)
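A sketch of how a charstring reaches either table (illustrative, not the
CFF.cpp code); the bias values and operator numbers are the ones from the
CFF/Type 2 specs:
```
#include <cstddef>
#include <vector>

using Charstring = std::vector<unsigned char>;

// Subr operands are "biased" so that small subr numbers fit in small operands.
static int subr_bias(size_t subr_count)
{
    if (subr_count < 1240)
        return 107;
    if (subr_count < 33900)
        return 1131;
    return 32768;
}

// callsubr (operator 10) indexes the per-font local subr INDEX;
// callgsubr (operator 29) indexes the global subr INDEX shared by all
// fonts in the CFF.
static Charstring const& resolve_subr(std::vector<Charstring> const& table, int biased_number)
{
    return table.at(static_cast<size_t>(biased_number + subr_bias(table.size())));
}
```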
CFF::parse_index_data() calls move_to() to put the reader's
current position behind the index data.
In several PDFs, the PrivDictOperator::Subrs case in CFF::create()
sets up a span that contains exactly the Subrs data and nothing
after it, so the final move_to() call in parse_index_data()
would cause an assert.
This is similar to fe3612ebcb, where the caller was also in CFF.
So maybe CFF just has a different view than the rest of the code of
which values are valid to pass to Reader? But having an iterator
point to one past the valid data in a container is common, so maybe
this is the Right Fix after all.
Fixes a crash opening 411_getting_started_with_instruments.pdf
(and a whole bunch of other WWDC slides). Rendering is pretty glitchy
and we still crash on page 14, but at least we can open the file now.
The file is currently available at:
https://devstreaming-cdn.apple.com/videos/wwdc/2019/411cbc60y12x68arcof/411/411_getting_started_with_instruments.pdf
Outline items can contain either a /Dest key or an /A key.
The /Dest key points to a "Destination" (various ways to reference a
page in the same document).
The /A key points to an "Action" which can have several types.
One type, the /GoTo type, just also points to a Destination.
Implement GoTo actions. This makes clicking "Contents" in the outline of
https://developer.apple.com/library/archive/documentation/mac/pdf/Text.pdf
work. (Almost all other items in this file's outline use /Dest.
"Contents" could too, but it uses /A /GoTo for some reason.)
(Other action types are things like opening a hyperlink, opening a
different file, playing a sound, submitting a form, etc. Actions
are also used for in-page links, not just in outlines. Many of
these action types we'll likely never want to implement.)
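For illustration, the same outline item expressed both ways (object
numbers made up):
```
% /Dest points at a destination directly:
<< /Title (Contents) /Parent 10 0 R /Dest [4 0 R /XYZ null null null] >>
% /A points at a GoTo action whose /D is the same kind of destination:
<< /Title (Contents) /Parent 10 0 R /A << /S /GoTo /D [4 0 R /XYZ null null null] >> >>
```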
This was the last piece of data we didn't read yet.
(We also don't yet support multiple fonts per CFF, but I haven't
found a PDF using that yet.)
We still don't do anything with it, but now we at least print a
warning if this data is there and we ignore it.
https://adobe-type-tools.github.io/font-tech-notes/pdfs/T1_SPEC.pdf :
"Using charstring subroutines is not a requirement of a Type 1
font program."
And some versions of Computer Modern do in fact not contain a Subrs
array.
Together with #21473, makes Problemset.pdf from the pdffiles repo
render ok instead of crashing.
This modification introduces a new layer to the painting process. The
stacking context traversal no longer immediately calls the
Gfx::Painter methods. Instead, it writes serialized painting commands
into the newly introduced RecordingPainter. The recorded list of
commands is executed later to produce the resulting bitmap.
Producing a painting command list will make it easier to add new
optimizations:
- It's simpler to check if the painting result is not visible in the
viewport at the command level rather than during stacking context
traversal.
- Painting can run in a separate thread: the painting thread can process
serialized painting commands, while the main thread works on the
next paintable tree and safely invalidates the previous one.
- As we consider GPU-accelerated painting support, it would be easier
to provide a backend for each painting command than to construct an
alternative for the entire Gfx::Painter API.
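A very rough sketch of the recording idea, with made-up command types
rather than the real LibWeb ones:
```
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

struct FillRect { int x, y, width, height; uint32_t color; };
struct DrawText { int x, y; std::string text; uint32_t color; };

using PaintingCommand = std::variant<FillRect, DrawText>;

// Traversal appends commands instead of painting immediately.
struct RecordingPainter {
    std::vector<PaintingCommand> commands;
    void fill_rect(FillRect r) { commands.push_back(r); }
    void draw_text(DrawText t) { commands.push_back(std::move(t)); }
};

// A later pass replays the list into an actual painter; this is also a
// natural place to skip commands that fall outside the viewport.
template<typename Executor>
void execute(std::vector<PaintingCommand> const& commands, Executor&& executor)
{
    for (auto const& command : commands)
        std::visit(executor, command);
}
```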
This change addresses the bug where images fail to load when the
reload button in the UI is clicked repeatedly. Before this fix, it was
possible to use SharedImageRequests across multiple documents. However,
when the document that initiated the request is gone, tasks scheduled
on the event loop remain in the fetching state because the originating
document is no longer active. Furthermore, another reason to prohibit
the sharing of image requests across documents is that the "Origin"
header in an image request is dependent on the document.
Previously, VERIFY et al. were redefined inside tests to not abort and
instead fail the test. This wouldn't apply to non-header code though,
and was not helpful, as it prevented you from easily attaching gdb near
the abort.
After this removal tests can still use the EXPECT family of macros, but
VERIFY will behave like it does in the rest of the codebase (abort
etc.).
With this, all tables from the spec appendixes are in CFF.cpp.
This fixes a crash reading page 2 (and onward) of
2ThestructureoftheCIE1997ColourAppearanceModelCIECAM97s.pdf in
the pdffiles repo.
The encoding offset defaults to 0, i.e. the Standard Encoding.
That means reading the encoding only if the tag is present causes
us to not read it if a font uses the Standard Encoding.
Now, we always read an encoding, even if it's the (implicit) default
one.
The main encoding data maps glyph ID ("GID") to its codepoint.
If a glyph has several codepoints, then a secondary table mapping
codepoint to string ID ("SID") of the glyph's name is present.
(A separate table associates each glyph with its name already.)
I haven't seen this used in the wild, but the structure of the
supplemental data is also going to be needed for built-in encodings.
After d2c7e1ea7d, there is now only one
user of LibPublicSuffix - the URL sanitation utility within LibWebView.
Rather than having an entire library for the small Public Suffix data
accessor, merge it into LibWebView.
Previously, all input elements were given a textbox-like style by
default; this was then undone by another CSS rule in the case of certain
types of input element. This commit makes it so that the first rule
simply ignores those types instead.
Co-authored-by: Sam Atkins <atkinssj@serenityos.org>
If the user runs /usr/lib/Loader.so and passes a second string in argv,
we will try to run that program as if it had been invoked directly,
without invoking the dynamic loader explicitly.
Two bugs (a corrected decode for both is sketched below):
1. We decoded a u32, not an i32 as the spec wants
2. (minor) Our fixed-point divisor was off by one
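A sketch of the corrected decode (illustrative): operator 255 in a
Type 2 charstring is followed by a big-endian signed 16.16 fixed-point
number, so it has to be read as an i32 and divided by 65536, not 65535:
```
#include <cstdint>

static float decode_fixed_operand(uint8_t const* data)
{
    // Bug 1 fix: assemble the bytes, then interpret them as a signed value.
    int32_t raw = static_cast<int32_t>(
        (uint32_t(data[0]) << 24) | (uint32_t(data[1]) << 16)
        | (uint32_t(data[2]) << 8) | uint32_t(data[3]));
    // Bug 2 fix: 16 fractional bits means a divisor of 65536.
    return static_cast<float>(raw) / 65536.0f;
}
```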
Fixes text rendering in Bakke2010a.pdf in pdffiles, and rendering of
other fonts with negative width adjustments from opcode 255.
That PDF was produced by "Apple pstopdf" and uses font SFBX1200,
which is apparently a variant of Computer Modern. So maybe this
helps with lots of PDFs produced from TeX files, but I haven't
checked that.
a396bb0 removed the palette field but did not update the allocation size
in `Bitmap::serialize_to_byte_buffer()`. This led to a few crashes (I
noticed this from a drag/drop crash in the file manager).
Fixes #21434
Previously, the null state of m_root_path was used to (subtly) mark the
parent of the root. The empty path is always replaced with "." so after
aeee98b there was no "parent of root" node. This led to the file
manager crashing when opened.
I haven't seen this being used in the wild (yet), but it's easy
to implement, and with this we support all charset formats.
So we can now mention if we see a format we don't know about.
On my machine, benchmarking 3DFileViewer revealed ~2.5% of CPU time
spent in `Vector<GPU::Vertex>::try_append`. By carefully managing list
capacities, we can remove this method from profiles altogether.
Optimize a very hot function by always performing unchecked appends.
When benchmarking 3DFileViewer on my machine, this takes the time spent
in `gl_vertex` down from ~8% to ~2%.
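A minimal sketch of the pattern, using AK::Vector's ensure_capacity() /
unchecked_append() (the element type here is a stand-in, not GPU::Vertex):
```
#include <AK/Vector.h>

struct Vertex { float x { 0 }, y { 0 }, z { 0 }; };

void append_vertices(Vector<Vertex>& vertices, size_t count)
{
    // One capacity check (and at most one allocation) up front...
    vertices.ensure_capacity(vertices.size() + count);
    for (size_t i = 0; i < count; ++i) {
        // ...then no capacity check per element in the hot loop.
        vertices.unchecked_append({ static_cast<float>(i), 0.0f, 0.0f });
    }
}
```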
LibSoftGPU used to calculate the normal transformation based on the
model view transformation for every primitive, because that's when we
sent over the matrix. By making LibGL a bit smarter and only updating the
matrices when they could have changed, we only need to calculate the
normal transformation once on every matrix update.
When viewing `Tuba.obj` in 3DFileViewer, this brings the percentage of
time spent in `FloatMatrix4x4::inverse()` down from 15% to 0%. :^)
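A sketch of the idea (illustrative; Matrix stands in for a 4x4 matrix
type with inverse() and transpose(), which Gfx::FloatMatrix4x4 is assumed
to provide):
```
template<typename Matrix>
struct NormalTransform {
    Matrix model_view {};
    Matrix normal_transform {};
    bool dirty { true };

    void set_model_view(Matrix const& m)
    {
        model_view = m;
        dirty = true; // recompute lazily, once per matrix change
    }

    Matrix const& normal_matrix()
    {
        if (dirty) {
            // Normals transform with the inverse transpose of the model-view matrix.
            normal_transform = model_view.inverse().transpose();
            dirty = false;
        }
        return normal_transform;
    }
};
```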
We were confusing the time spent rendering with the time spent rendering
_and_ waiting for the next frame, which is important since we render
frames on a timer.
From "10 String INDEX":
"Further space saving is obtained by allocating commonly occurring
strings to predefined SIDs. These strings, known as the standard
strings, describe all the names used in the ISOAdobe and Expert
character sets along with a few other strings common to Type 1 fonts. A
complete list of standard strings is given in Appendix A. The client
program will contain an array of standard strings with nStdStrings
elements. Thus, the standard strings take SIDs in the range 0 to
(nStdStrings-1)."
And "13 Charsets" says that charsets store SIDs.
Fixes all
"Couldn't find string for SID $n, going with space"
messages when going through the encoding pages (page 1010 and
thereabouts) in the PDF 1.7 spec.
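A sketch of the lookup the quote describes (illustrative; CFF defines 391
standard strings, SIDs 0 to 390):
```
#include <cstddef>
#include <string_view>
#include <vector>

static constexpr size_t number_of_standard_strings = 391;

// standard_strings: the Appendix A table compiled into the client program.
// string_index:     strings read from the CFF's own String INDEX.
std::string_view resolve_sid(size_t sid,
    std::vector<std::string_view> const& standard_strings,
    std::vector<std::string_view> const& string_index)
{
    if (sid < number_of_standard_strings)
        return standard_strings.at(sid);
    return string_index.at(sid - number_of_standard_strings);
}
```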
Only really useful for reading SIDs in the Top DICT (copyright
text etc), which we currently don't do.
I haven't seen a difference from looking things up in the string
table. The only real effect from the commit that I need is that
it pulls a local resolve() lambda into a real function
resolve_sid(), which I want to call in a future commit.
But it makes things more spec-compliant, and if we ever want to
read SIDs in metadata in the future, now we can.
The UnicodeData header cannot be included by any file other than .cpp
files within LibUnicode itself. Outside users cannot assume the header
will exist, as it will not be generated if the CMake option to do so is
disabled (ENABLE_UNICODE_DATABASE_DOWNLOAD).
We now produce a `matrix3d()` value when appropriate.
Some sites (such as gsap.com) request the resolved style for `transform`
when there's no viewport paintable, but the element itself does already
have a stacking context. This fixes crashes in that case, because we now
do not access the stacking context at all.
We also do not wrap the result as a StyleValueList any more. The
returned StyleValue is only serialized and exposed to JS, so making it a
StyleValueList has no effect.
As noted, there are two situations where an element will have no layout
node here:
1. The element is invisible in a way that it generates no layout node.
2. We haven't built the layout yet.
This protects against the second case, which would otherwise incorrectly
send us down the path of looking directly at the computed style.
That API came from a mistake in the IDL compiler, where reflected
nullable attributes would try to call set_attribute(name, null).
This commit fixes the mistake in the IDL generator, and removes the
meaningless API.
We currently implement several forms of this method across the Ladybird
chromes. As such, we see commits to add special URL handling that only
affects a single chrome. Instead, let's consolidate all special handling
in a single location for all chromes to make use of.
This method can handle resolving file:// URLs, falling back to a search
engine query, and validation against the Public Suffix List. These cases
were gathered from the various chromes.
This makes the parser more resilient to invalid IMAP messages.
Usages of `Optional` have also been removed where the empty case is
equivalent to an empty object.
This commit removes DeprecatedString's "null" state, and replaces all
its users with one of the following:
- A normal, empty DeprecatedString
- Optional<DeprecatedString>
Note that null states of DeprecatedFlyString/StringView/etc are *not*
affected by this commit. However, DeprecatedString::empty() is now
considered equal to a null StringView.
We'd unconditionally get the int from a Variant<int, float> here,
but PDFs often have a float for defaultWidthX and nominalWidthX.
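A sketch of the fix (illustrative; the real code uses AK::Variant rather
than std::variant):
```
#include <variant>

// defaultWidthX / nominalWidthX can be stored as either alternative,
// so read whichever one is actually there instead of assuming int.
static float to_number(std::variant<int, float> const& value)
{
    if (auto const* as_int = std::get_if<int>(&value))
        return static_cast<float>(*as_int);
    return std::get<float>(value);
}
```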
Fixes crash opening Bakke2010a.pdf from pdffiles (but while the
file loads ok, it looks completely busted).
We already set these variables and call `_init` in the dynamic linker.
As we don't care about static binaries, remove these assignments and the
call to `_init` from `_entry`.
The function referenced by DT_INIT is also not necessarily called
`_init`, so directly calling `_init` is not really correct.
`s_global_initializers_ran` and `__stack_chk_guard` are unused, so
remove them.
We currently don't call any DT_FINI_ARRAY functions, so change that.
The call to `_fini` in `exit` is unnecessary, as we now call the
function referenced by DT_FINI in `__call_fini_functions`.
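A sketch of the teardown order described here (illustrative, not the
actual LibELF/LibC code):
```
#include <cstddef>

using FiniFunction = void (*)();

// DT_FINI_ARRAY entries run in reverse order, then the single function
// referenced by DT_FINI (historically named _fini) runs last.
void call_fini_functions(FiniFunction const* fini_array, size_t fini_array_count, FiniFunction fini)
{
    for (size_t i = fini_array_count; i > 0; --i)
        fini_array[i - 1]();
    if (fini)
        fini();
}
```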
We currently store a StringView into the DeprecatedString provided to
SVGUseElement::attribute_changed. This is a temporary string created by
String::to_deprecated_string, so this StringView is always a dangling
pointer.
Instead, since this string value is an ID and is primarily used as a
FlyString, store it as a FlyString from the get-go.
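A generic illustration of the bug class (std:: types for brevity; in
LibWeb the roles are played by StringView, DeprecatedString, and
FlyString):
```
#include <string>
#include <string_view>

std::string to_deprecated_string_stand_in(); // returns a temporary string

struct UseElementSketch {
    std::string_view dangling_id; // views the temporary: dangles after the statement
    std::string owned_id;         // owns a copy: stays valid (FlyString plays this role)

    void attribute_changed()
    {
        dangling_id = to_deprecated_string_stand_in(); // bug: temporary dies here
        owned_id = to_deprecated_string_stand_in();    // fix: copy/intern the value
    }
};
```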
This was the only remaining codec that produced IndexedN bitmaps.
By removing them, we'll be able to get rid of those formats and simplify
the Bitmap and Painter classes.
Instead of resolving lengths used in the backdrop-filter during
painting, we can do that earlier in apply_style().
This change moves us a bit closer to the point when the stacking
context tree will be completely separated from the layout tree :)
Previously, every time a page switched fonts, we'd completely
re-parse the font.
Now, we cache fonts in Renderer, effectively caching them per page.
It'd be nice to have an LRU cache across pages too, but that's a
bigger change, and this already helps a lot.
Font size is part of the cache key, which means we re-parse the same
font at different font sizes. That could be better too, but again,
it's a big help as-is already.
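A sketch of such a cache, with made-up names and key type (the real cache
lives in Renderer, and its key includes the font size as described):
```
#include <map>
#include <memory>
#include <string>
#include <utility>

struct PDFFont; // stand-in for the parsed font
std::shared_ptr<PDFFont> parse_font(std::string const& font_name, float size); // expensive

struct FontCache {
    std::map<std::pair<std::string, float>, std::shared_ptr<PDFFont>> fonts;

    std::shared_ptr<PDFFont> get(std::string const& font_name, float size)
    {
        auto key = std::make_pair(font_name, size);
        if (auto it = fonts.find(key); it != fonts.end())
            return it->second; // already parsed on this page
        auto font = parse_font(font_name, size);
        fonts.emplace(std::move(key), font);
        return font;
    }
};
```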
Takes rendering the 1310 pages of the PDF 1.7 reference with
```
Build/lagom/bin/pdf --debugging-stats \
    ~/Downloads/pdf_reference_1-7.pdf
```
from 71s to 11s :^)
Going through pages especially in the index is noticeably snappier.
(On the PDF 2.0 spec, ISO_32000-2-2020_sponsored.pdf, it's less
dramatic: From 19s to 16s.)
This allows the decoder to fail gracefully when reading a partial or
malformed TBSCertificate. We also now ensure that the certificate data
is valid before making a copy of it.