CMYK data describes which inks a printer should use to print a color.
A color on screen that's supposed to look similar to what the printer
produces is very different from what Color::from_cmyk() produces.
(It's also printer-dependent.)
There are many ICC profiles describing printing processes. It doesn't
matter too much which one we use -- most of them look somewhat
similar, and they all look dramatically better than Color::from_cmyk().
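For context, the naive conversion that a profile-based conversion replaces can be sketched like this. It is a standalone illustration of the usual "each ink is an ideal filter" formula, assumed to be similar in spirit to Color::from_cmyk(); `Rgb` and `naive_cmyk_to_rgb` are hypothetical names, not LibGfx API.

```cpp
#include <algorithm>
#include <cstdint>

struct Rgb {
    std::uint8_t r, g, b;
};

// Naive CMYK -> RGB: treats each ink as a perfect subtractive filter.
// Inputs are ink coverages in [0, 1].
inline Rgb naive_cmyk_to_rgb(float c, float m, float y, float k)
{
    auto channel = [k](float ink) {
        float v = 255.0f * (1.0f - ink) * (1.0f - k);
        return static_cast<std::uint8_t>(std::clamp(v, 0.0f, 255.0f));
    };
    return { channel(c), channel(m), channel(y) };
}
```

With this formula, 100% cyan ink becomes the fully saturated (0, 255, 255), which is the kind of result a printing profile is expected to tone down.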
This patch adds a function to download a zip file that Adobe offers
on their web site. They even have a page for redistribution:
https://www.adobe.com/support/downloads/iccprofiles/icc_eula_win_dist.html
(That one leads to a broken download though, so this downloads the
end-user version.)
In case we have to move off this download at some point, there are also
a whole bunch of profiles at https://www.color.org/registry/index.xalter
that "may be used, embedded, exchanged, and shared without restriction".
The Adobe zip contains a whole bunch of other useful and fun profiles,
so I went with it.
For now, this only unzips the USWebCoatedSWOP.icc file though, and
installs it in ${CMAKE_BINARY_DIR}/Root/res/icc/Adobe/CMYK/. In
Serenity builds, this will make it to /res/icc/Adobe/CMYK in the
disk image. And in lagom builds, after #23016 this is the
lagom res staging directory that tools can install via
Core::ResourceImplementation. `pdf` and `MacPDF` already do that,
`TestPDF` now does it too.
The final piece is that LibPDF then loads the profile from there
and uses it for DeviceCMYK color conversions.
(Doing file access from the bowels of a library is a bit weird,
especially in a system that has sandboxing built in. But LibGfx does
that in FontDatabase too already, and LibPDF uses that, so it's not a
new problem.)
All ColorSpace subclasses converted to float anyways, and this
allows us to save lots of float->Value->float conversions during
image color space processing.
A bit faster:
```
N Min Max Median Avg Stddev
x 50 0.99054313 1.0412271 0.99933481 1.0052408 0.012931916
+ 50 0.97073889 1.0075941 0.97849107 0.98184034 0.0090329046
Difference at 95.0% confidence
-0.0234004 +/- 0.00442595
-2.32785% +/- 0.440287%
(Student's t, pooled s = 0.0111541)
```
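The numbers in the ministat output above can be reproduced by hand. The sketch below recomputes the pooled standard deviation and the confidence interval half-width from the two Avg/Stddev rows; `Sample`, `pooled_stddev` and `ci_half_width` are hypothetical helper names, and the critical t value of about 1.984 (98 degrees of freedom, 95%) is an assumption, not something taken from ministat's source.

```cpp
#include <cmath>

struct Sample {
    int n;
    double mean;
    double stddev;
};

// Pooled standard deviation of two samples (equal-variance Student's t).
inline double pooled_stddev(Sample a, Sample b)
{
    double sum = (a.n - 1) * a.stddev * a.stddev + (b.n - 1) * b.stddev * b.stddev;
    return std::sqrt(sum / (a.n + b.n - 2));
}

// Half-width of the confidence interval for the difference of the means.
// t_critical is supplied by the caller; ~1.984 is assumed for this data.
inline double ci_half_width(Sample a, Sample b, double t_critical)
{
    return t_critical * pooled_stddev(a, b) * std::sqrt(1.0 / a.n + 1.0 / b.n);
}
```

Feeding in `{50, 1.0052408, 0.012931916}` and `{50, 0.98184034, 0.0090329046}` gives a pooled s of roughly 0.0111541, and the difference of the means divided by the first mean gives the -2.33% shown.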
Microoptimization; according to ministat a bit faster:
```
N Min Max Median Avg Stddev
x 50 1.0179932 1.0561159 1.0315337 1.0333617 0.0094757426
+ 50 1.000875 1.0427601 1.0208509 1.0201902 0.01066116
Difference at 95.0% confidence
-0.0131715 +/- 0.00400208
-1.27463% +/- 0.387287%
(Student's t, pooled s = 0.0100859)
```
Reduces time spent rendering page 3 of 0000849.pdf from 1.32s to 1.13s
on my machine.
Also reduces the time to run Meta/test_pdf.py on 0000.zip
(without 0000849.pdf) from 56s to 54s.
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).
This commit is auto-generated:
$ xs=$(ack -l '\bDeprecatedString\b|deprecated_string' AK Userland \
Meta Ports Ladybird Tests Kernel)
$ perl -pi -e 's/\bDeprecatedString\b/ByteString/g;
s/deprecated_string/byte_string/g' $xs
$ clang-format --style=file -i \
$(git diff --name-only | grep '\.cpp\|\.h')
$ gn format $(git ls-files '*.gn' '*.gni')
Fixes these errors from `Meta/test_pdf.py path/to/0000`, with
0000 being the unzipped 0000.zip from the PDF/A corpus:
Malformed PDF file: Indexed color space lookup table doesn't
match size, in 4 files, on 8 pages, 73 times
path/to/0000/0000206.pdf 2 4 (2x) 5 (3x) 6 (4x)
path/to/0000/0000364.pdf 5 6
path/to/0000/0000918.pdf 5
path/to/0000/0000683.pdf 8
Previously, we were scaling palette indices from 0..(palette_size - 1)
to 0..255 before using them as index into the palette. Instead, do not
scale palette indices before using them as indices.
(Renderer::load_image() uses `component_value_decoders.empend(
.0f, 255.0f, dmin, dmax)`, so to get an identity mapping, we have to
return `0, 255` from IndexedColorSpace::default_decode()).
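The identity mapping described above can be sketched as follows. This is a standalone illustration, not the LibPDF code: `linear_map`, `RgbEntry` and `lookup` are hypothetical names, and the decoder is assumed to be a plain linear interpolation from [0, 255] to [dmin, dmax].

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Linearly maps x from [x_min, x_max] to [y_min, y_max].
inline float linear_map(float x, float x_min, float x_max, float y_min, float y_max)
{
    return y_min + (x - x_min) * (y_max - y_min) / (x_max - x_min);
}

using RgbEntry = std::array<std::uint8_t, 3>;

// Looks up a raw sample in the palette. With a decode range of [0, 255],
// mapping from [0, 255] is the identity, so the sample itself is the index.
// Returns nullptr if the index is past the end of the palette.
inline RgbEntry const* lookup(std::vector<RgbEntry> const& palette, std::uint8_t sample)
{
    float decoded = linear_map(sample, 0.0f, 255.0f, 0.0f, 255.0f);
    auto index = static_cast<std::size_t>(decoded);
    return index < palette.size() ? &palette[index] : nullptr;
}
```

Any decode range other than [0, 255] would stretch or shrink the sample before the lookup, which is how an index can end up past the end of a small palette.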
Fixes rendering of the gradient on page 5 of 0000277.pdf.
Gfx::ICC::Profile's current API takes bytes, so we need to do some
contortions for LAB values to go through.
This will probably become nicer once we implement all the backward
transforms in Gfx::ICC::Profile, but for now let's hack it in
on the LibPDF side.
Makes colors in 0000651.pdf look good, especially on pages 1 and 7-12.
If one profile uses PCSXYZ and the other PCSLAB as connection space,
we now do the necessary XYZ/LAB conversion.
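As a reference for what such a conversion involves, here is a minimal sketch of the standard CIE XYZ/L*a*b* formulas relative to the D50 white point that ICC uses for its connection space. It is not the Gfx::ICC implementation; the type and function names are hypothetical, and encoding details such as the 8-bit or 16-bit PCS encodings are left out.

```cpp
#include <cmath>

struct XYZ { double x, y, z; };
struct Lab { double l, a, b; };

// D50 illuminant used by the ICC profile connection space.
constexpr XYZ d50 { 0.9642, 1.0, 0.8249 };

inline double lab_f(double t)
{
    constexpr double delta = 6.0 / 29.0;
    return t > delta * delta * delta ? std::cbrt(t) : t / (3 * delta * delta) + 4.0 / 29.0;
}

inline double lab_f_inverse(double t)
{
    constexpr double delta = 6.0 / 29.0;
    return t > delta ? t * t * t : 3 * delta * delta * (t - 4.0 / 29.0);
}

inline Lab xyz_to_lab(XYZ c)
{
    double fx = lab_f(c.x / d50.x), fy = lab_f(c.y / d50.y), fz = lab_f(c.z / d50.z);
    return { 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz) };
}

inline XYZ lab_to_xyz(Lab c)
{
    double fy = (c.l + 16) / 116, fx = fy + c.a / 500, fz = fy - c.b / 200;
    return { d50.x * lab_f_inverse(fx), d50.y * lab_f_inverse(fy), d50.z * lab_f_inverse(fz) };
}
```

The white point itself maps to L = 100, a = 0, b = 0, and the two functions are inverses of each other, which makes a round trip a convenient sanity check.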
With this and the previous commits, we can now convert from profiles
that use PCSLAB with mAB, such as stress.jpeg from
https://littlecms.com/blog/2020/09/09/browser-check/ :
% Build/lagom/icc --name sRGB --reencode-to serenity-sRGB.icc
% Build/lagom/bin/image -o out.png \
--convert-to-color-profile serenity-sRGB.icc \
~/src/jpegfiles/stress.jpeg
For valid PDFs, this makes no difference.
For invalid PDFs, we now assert during the cast in resolve_to() instead
of returning a PDFError. However, most PDFs are valid, and even for
invalid PDFs, we'd previously keep the old color space around when
getting the PDF error and then usually assert later when the old
color space got passed a color with an unexpected number of components
(since the components were for the new color space).
Doesn't affect any of the > 2000 PDFs I use for testing locally,
is less code, and should make for less surprising asserts when it
does happen.
Namely, for CalGrayColorSpace, CalRGBColorSpace, LabColorSpace.
Fixes a crash rendering any page of Adobe's 5014.CIDFont_Spec.pdf
(which uses CalRGBColorSpace with an indirect dict: The dict is
object `92 0`, and many color spaces are inline objects referring
to it).
* Compare array size to 3 and 4, not 4 and 5
* Fix literal typo in error message
Fixes crash processing 0000906.pdf from 0000.zip from the pdf/a dataset.
We did convert from the input space to linear space and then
to linear sRGB, but we forgot to re-apply gamma.
This uses the x^2.2 curve instead of the real sRGB curve for now.
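To show how close the approximation is, here is a small sketch of both encodings: the pure power curve and the piecewise sRGB encoding function. The function names are hypothetical and this is not the LibPDF code.

```cpp
#include <cmath>

// Approximate encoding: a pure power curve with exponent 1/2.2.
inline float encode_gamma_2_2(float linear)
{
    return std::pow(linear, 1.0f / 2.2f);
}

// The piecewise sRGB encoding function (IEC 61966-2-1).
inline float encode_srgb(float linear)
{
    if (linear <= 0.0031308f)
        return 12.92f * linear;
    return 1.055f * std::pow(linear, 1.0f / 2.4f) - 0.055f;
}
```

Both curves agree at 0 and 1 and stay within about one percent of each other in the midtones; they differ most near black, where sRGB has a linear segment.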
CalRGBColorSpace::color() converts into a flat xyz space,
which already takes input whitepoint into account.
It shouldn't be taken into account again when converting from
the flat color space to D65.
Documents using them now show render-time diagnostics instead
of hitting an assert because the number of parameters passed to a
color doesn't match the number of channels of the previously-set
color space.
Fixes two asserts on the `-n 500` 0000.zip test set.
Same as other recent color spaces: Enough to make us not assert,
but not enough to actually produce color.
Fixes 2 asserts on the `-n 500` 0000.zip pdfa dataset.
Actually using separation color spaces still doesn't work, but we
now no longer assert on them when they're used.
Fixes 2 crashes on the `-n 500` 0000.zip pdfa dataset.
Per spec:
"If the color space is one that can be specified by a name and no
additional parameters (DeviceGray, DeviceRGB, DeviceCMYK, and certain
cases of Pattern), the name may be specified directly."
We still don't implement /Pattern color spaces, but now we no longer
crash trying to look up the potentially-nonexistent /ColorSpace
dictionary on the page object when /Pattern is used directly as color
space name.
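The lookup order this implies can be sketched as follows. Everything here is a hypothetical stand-in (`is_directly_nameable`, `Resources`, `resolve_color_space_name`), not the LibPDF API: the point is only that names the spec allows directly are handled before touching a /ColorSpace dictionary that may not exist.

```cpp
#include <array>
#include <map>
#include <optional>
#include <string>
#include <string_view>

// Names that the quoted spec text allows without parameters.
// (Pattern only in "certain cases"; this sketch does not model which.)
inline bool is_directly_nameable(std::string_view name)
{
    constexpr std::array<std::string_view, 4> names { "DeviceGray", "DeviceRGB", "DeviceCMYK", "Pattern" };
    for (auto candidate : names) {
        if (candidate == name)
            return true;
    }
    return false;
}

// Hypothetical stand-in for a /ColorSpace resource dictionary:
// resource name -> color space family name.
using Resources = std::map<std::string, std::string>;

// Returns the family to use, or nullopt if the name is neither directly
// nameable nor present in the (possibly missing) resource dictionary.
inline std::optional<std::string> resolve_color_space_name(std::string_view name, Resources const* color_space_resources)
{
    if (is_directly_nameable(name))
        return std::string(name);
    if (!color_space_resources)
        return std::nullopt;
    auto it = color_space_resources->find(std::string(name));
    if (it == color_space_resources->end())
        return std::nullopt;
    return it->second;
}
```

With this order, `resolve_color_space_name("Pattern", nullptr)` succeeds even though no dictionary is available.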
On top of #21514, reduces number of crashes on 300 random PDFs from the
web (the first 300 from 0000.zip from
https://pdfa.org/new-large-scale-pdf-corpus-now-publicly-available/)
from 42 (14%) to 34 (11%).
It used to be called ColorSpaceFamily::never_needs_parameters().
But in the cpp file, the macro arg was called ever_needs_parameters,
and the spec says
"If the color space is one that can be specified by a name and no
additional parameters (DeviceGray, DeviceRGB, DeviceCMYK, and certain
cases of Pattern), the name may be specified directly."
so let's use that language here.
No behavior change.
We now no longer crash on images that use an ICC-based color space.
Reduces number of crashes on 300 random PDFs from the web (the first 300
from 0000.zip from
https://pdfa.org/new-large-scale-pdf-corpus-now-publicly-available/)
from 81 (27%) to 64 (21%).
Also fixes all remaining crashes in
411_getting_started_with_instruments.pdf and
513_high_efficiency_image_file_format.pdf.
None of the methods actually do anything, but we now create an
actual SeparationColorSpace object for /Separation color spaces.
This fixes a crash on page 810 of pdf_reference_1-7.pdf.
Previously, we'd log a "separation color space not supported" error,
which would lead to Renderer not updating its current color space.
It'd stay a DeviceCMYK color space, which would then later assert
when it got a 1-argument array as color (which now the
SeparationColorSpace gets instead, which logs an "unimplemented"
error for that instead of asserting).
That way, we render an incomplete page and log a message instead of
crashing the viewer application.
Lets us survive e.g. page 489 of pdf_reference_1-7.pdf.
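The "log instead of assert" pattern can be sketched like this. The types are hypothetical stand-ins (LibPDF has its own error type), and the messages are made up for illustration: the placeholder accepts the right number of components and reports an error that the caller can log while it keeps rendering.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

struct Rgba { unsigned char r, g, b, a; };

// Either a color or a diagnostic message.
struct ColorResult {
    std::optional<Rgba> color;
    std::string error;
};

// Hypothetical placeholder color space: it knows how many components it
// takes, but converting a color only yields a diagnostic.
struct PlaceholderSeparationColorSpace {
    int number_of_components() const { return 1; }

    ColorResult color(std::vector<float> const& arguments) const
    {
        if (arguments.size() != static_cast<std::size_t>(number_of_components()))
            return { std::nullopt, "wrong number of color components" };
        return { std::nullopt, "separation color space not implemented" };
    }
};
```

A renderer using this would skip the affected drawing operation and print `error`, so the rest of the page still renders.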
DeprecatedFlyString relies heavily on DeprecatedString's StringImpl, so
let's rename it to A) match the name of DeprecatedString and B) make
room for a new FlyString class that is tied to String.
ColorSpaces now can tell users how many components they expect, and the
default decode array that should be used when converting unit bit
sequences into color space component input values during image
rendering.
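For the device color spaces, the two pieces of information look roughly like this. The sketch assumes the PDF defaults (a `[0 1]` pair per component) and the usual linear decode formula; the function names are hypothetical, not the methods added here.

```cpp
#include <vector>

// Default decode array for a device color space with the given number of
// components: the pair [0 1] repeated once per component.
inline std::vector<float> default_decode_for_device_space(int number_of_components)
{
    std::vector<float> decode;
    for (int i = 0; i < number_of_components; ++i) {
        decode.push_back(0.0f);
        decode.push_back(1.0f);
    }
    return decode;
}

// Maps a raw sample of the given bit depth into [d_min, d_max].
inline float decode_sample(unsigned sample, unsigned bits_per_component, float d_min, float d_max)
{
    float max_sample = static_cast<float>((1u << bits_per_component) - 1);
    return d_min + static_cast<float>(sample) * (d_max - d_min) / max_sample;
}
```

For example, DeviceRGB has three components, so its default decode array has six entries, and an 8-bit sample of 255 decodes to 1.0.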
ColorSpaces can be specified in two ways: in a stream, as operands of
the color space operations (CS/cs), or as a separate PDF object, which
is then referred to by other means (e.g., from Image XObjects and other
entities). These two modes of addressing a ColorSpace are slightly
different and need to be addressed separately. However, the current
implementation embedded the full logic of the first case in the routine
that created ColorSpace objects.
This commit refactors the creation of ColorSpace to support both cases.
First, a new ColorSpaceFamily class encapsulates the static aspects of a
family, like its name or whether color space construction never requires
parameters. Then we define the supported ColorSpaceFamily objects.
On top of this also sits a split in how ColorSpaces are created. Two
methods are now offered: one only providing construction of no-argument
color spaces (and thus taking a simple name), and another taking an
ArrayObject, hence used to create ColorSpaces requiring arguments.
Finally, on top of *that* two ways to get a color space in the Renderer
are made available: the first creates a ColorSpace with a name and a
Resources dictionary, and another takes an Object. These model the two
addressing modes described above.