Commit graph

61 commits

Author SHA1 Message Date
Nico Weber
3b616b6af8 LibPDF: Use original error for failing ICC load 2024-02-21 13:37:08 +01:00
Nico Weber
9c762b9650 LibPDF+Meta: Use a CMYK ICC profile to convert CMYK to RGB
CMYK data describes which inks a printer should use to print a color.
If a screen should display a color that's supposed to look similar
to what the printer produces, it results in a color very different
to what Color::from_cmyk() produces. (It's also printer-dependent.)

There are many ICC profiles describing printing processes. It doesn't
matter too much which one we use -- most of them look somewhat
similar, and they all look dramatically better than Color::from_cmyk().

This patch adds a function to download a zip file that Adobe offers
on their web site. They even have a page for redistribution:
https://www.adobe.com/support/downloads/iccprofiles/icc_eula_win_dist.html

(That one leads to a broken download though, so this downloads the
end-user version.)

In case we have to move off this download at some point, there are also
a whole bunch of profiles at https://www.color.org/registry/index.xalter
that "may be used, embedded, exchanged, and shared without restriction".

The adobe zip contains a whole bunch of other useful and fun profiles,
so I went with it.

For now, this only unzips the USWebCoatedSWOP.icc file though, and
installs it in ${CMAKE_BINARY_DIR}/Root/res/icc/Adobe/CMYK/. In
Serenity builds, this will make it to /res/icc/Adobe/CMYK in the
disk image. And in lagom build, after #23016 this is the
lagom res staging directory that tools can install via
Core::ResourceImplementation. `pdf` and `MacPDF` already do that,
`TestPDF` now does it too.

The final piece is that LibPDF then loads the profile from there
and uses it for DeviceCMYK color conversions.

(Doing file access from the bowels of a library is a bit weird,
especially in a system that has sandboxing built in. But LibGfx does
that in FontDatabase too already, and LibPDF uses that, so it's not a
new problem.)
2024-02-01 13:42:04 -07:00
Nico Weber
f840fb6b4e LibPDF: Make DeviceCMYKColorSpace::the() fallible
No behavior change.
2024-02-01 13:42:04 -07:00
Nico Weber
5f85aff036 LibPDF: Move ColorSpace::style() to take ReadonlySpan<float>
All ColorSpace subclasses converted to float anyways, and this
allows us to save lots of float->Value->float conversions during
image color space processing.

A bit faster:

```
    N           Min           Max        Median         Avg       Stddev
x  50    0.99054313     1.0412271    0.99933481   1.0052408  0.012931916
+  50    0.97073889     1.0075941    0.97849107  0.98184034 0.0090329046
Difference at 95.0% confidence
	-0.0234004 +/- 0.00442595
	-2.32785% +/- 0.440287%
	(Student's t, pooled s = 0.0111541)
```
2024-01-12 12:37:56 +00:00
Nico Weber
56a4af8d03 LibPDF: Don't reallocate Vectors in ICCBasedColorSpace all the time
Microoptimization; according to ministat a bit faster:

```
    N           Min           Max        Median         Avg       Stddev
x  50     1.0179932     1.0561159     1.0315337   1.0333617 0.0094757426
+  50      1.000875     1.0427601     1.0208509   1.0201902   0.01066116
Difference at 95.0% confidence
	-0.0131715 +/- 0.00400208
	-1.27463% +/- 0.387287%
	(Student's t, pooled s = 0.0100859)
```
2024-01-12 12:37:56 +00:00
Nico Weber
cfd05b1a55 LibPDF: Use MatrixMatrixConversion when possible
Reduces time spent rendering page 3 of 0000849.pdf from 1.32s to 1.13s
on my machine.

Also reduces the time to run Meta/test_pdf.py on 0000.zip
(without 0000849.pdf) from 56s to 54s.
2024-01-12 09:09:56 +01:00
Nico Weber
c161b2d2f9 LibPDF: Extract ICCBasedColorSpace::sRGB() helper 2024-01-12 09:09:56 +01:00
Nico Weber
6032c06f6b Revert "LibPDF: Add basic tiled, coloured pattern rendering"
This reverts commit 8ff87911a3.
2023-12-21 19:24:56 +01:00
Nico Weber
d577d181e3 LibPDF: Clamp linear_srgb values in convert_to_srgb()
This is very crude gamut mapping, but it's better than producing
NaNs when passing negative values to powf(x, 1/2.2).
2023-12-20 12:45:07 +01:00
Ali Mohammad Pur
5e1499d104 Everywhere: Rename {Deprecated => Byte}String
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).

This commit is auto-generated:
  $ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
    Meta Ports Ladybird Tests Kernel)
  $ perl -pie 's/\bDeprecatedString\b/ByteString/g;
    s/deprecated_string/byte_string/g' $xs
  $ clang-format --style=file -i \
    $(git diff --name-only | grep \.cpp\|\.h)
  $ gn format $(git ls-files '*.gn' '*.gni')
2023-12-17 18:25:10 +03:30
Kyle Pereira
8ff87911a3 LibPDF: Add basic tiled, coloured pattern rendering 2023-12-10 16:44:24 +01:00
Kyle Pereira
60c4803dd3 LibPDF: Pass Renderer to ColorSpace 2023-12-10 16:44:24 +01:00
Kyle Pereira
082a4197b6 LibPDF: Use Variant<Color, PaintStyle> instead of Color for ColorSpaces
This is in anticipation of Pattern color space support which does not
yield a simple color.
2023-12-10 16:44:24 +01:00
Nico Weber
8b50b689f9 LibPDF: Reject invalid "hival" values
Doesn't fire on any of the PDFs I have, and seems like a good thing
to check.
2023-12-07 08:10:40 +00:00
Nico Weber
43cd3d7dbd LibPDF: Tolerate palettes that are one byte too long
Fixes these errors from `Meta/test_pdf.py path/to/0000`, with
0000 being 0000.zip from the PDF/A corpus in unzipped:

    Malformed PDF file: Indexed color space lookup table doesn't
                        match size, in 4 files, on 8 pages, 73 times
      path/to/0000/0000206.pdf 2 4 (2x) 5 (3x) 6 (4x)
      path/to/0000/0000364.pdf 5 6
      path/to/0000/0000918.pdf 5
      path/to/0000/0000683.pdf 8
2023-12-07 08:10:40 +00:00
Nico Weber
8733ba2734 LibPDF: Fix decoding of IndexedColorSpace for palette sizes != 255
Previously, we were scaling palette indices from 0..(palette_size - 1)
to 0..255 before using them as index into the palette. Instead, do not
scale palette indices before using them as indices.

(Renderer::load_image() uses `component_value_decoders.empend(
.0f, 255.0f, dmin, dmax)`, so to get an identity mapping, we have to
return `0, 255` from IndexedColorSpace::default_decode()).

Fixes rendering of the gradient on page 5 of 0000277.pdf.
2023-12-06 15:32:13 +01:00
Nico Weber
4cb0593daf LibPDF: Convert LAB values to bytes differently
Gfx::ICC::Profile's current API takes bytes, so we need to do some
contortions for LAB values to go through.

This will probably become nicer once we implement all the backward
transforms in Gfx::ICC::Profile, but for now let's hack it in
on the LibPDF side.

Makes colors in 0000651.pdf looks good, especially on pages 1 and 7-12.
2023-12-05 11:36:44 -05:00
Nico Weber
b2a1130556 LibGfx/ICC: Implement conversion between different connection spaces
If one profile uses PCSXYZ and the other PCSLAB as connection space,
we now do the necessary XYZ/LAB conversion.

With this and the previous commits, we can now convert from profiles
that use PCSLAB with mAB, such as stress.jpeg from
https://littlecms.com/blog/2020/09/09/browser-check/ :

    % Build/lagom/icc --name sRGB --reencode-to serenity-sRGB.icc
    % Build/lagom/bin/image -o out.png \
        --convert-to-color-profile serenity-sRGB.icc \
        ~/src/jpegfiles/stress.jpeg
2023-12-04 08:02:36 +00:00
Nico Weber
f882a3ae37 LibPDF: In ColorSpace creation code, use resolve_to() more
For valid PDFs, this makes no difference.

For invalid PDFs, we now assert during the cast in resolve_to() instead
of returning a PDFError. However, most PDFs are valid, and even for
invalid PDFs, we'd previously keep the old color space around when
getting the PDF error and then usually assert later when the old
color space got passed a color with an unexpected number of components
(since the components were for the new color space).

Doesn't affect any of the > 2000 PDFs I use for testing locally,
is less code, and should make for less surprising asserts when it
does happen.
2023-11-13 10:29:26 -05:00
Nico Weber
bbde3cbc90 LibPDF: Tolerate an indirect object as dict for CIE-based color spaces
Namely, for CalGrayColorSpace, CalRGBColorSpace, LabColorSpace.

Fixes a crash rendering any page of Adobe's 5014.CIDFont_Spec.pdf
(which uses CalRGBColorSpace with an indirect dict: The dict is
object `92 0`, and many color spaces are inline objects referring
to it).
2023-11-13 07:12:05 -05:00
Nico Weber
5af6e1c042 LibPDF: Implement DeviceNColorSpace 2023-11-09 23:33:49 +01:00
Nico Weber
0f07049935 LibPDF: Add ColorSpaceFamily::operator==
No behavior change.
2023-11-09 23:33:49 +01:00
Nico Weber
b78ea81de5 LibPDF: Implement SeparationColorSpace
Requires PDF::Function, which isn't implemented yet, so this has
no visual effect yet.
2023-11-06 10:01:05 +01:00
Nico Weber
21894f1cde LibPDF: Fix typos in DeviceN colorspace scaffolding
* Compare array size to 3 and 4, not 4 and 5
* Fix literal typo in error message

Fixes crash processing 0000906.pdf from 0000.zip from the pdf/a dataset.
2023-11-06 09:54:01 +01:00
Nico Weber
30ea218e35 LibPDF: Implement IndexedColorSpace 2023-11-05 14:27:22 -07:00
Nico Weber
3dca11c4e2 LibPDF: Move color space creation from name or array into ColorSpace
No behavior change.
2023-11-05 14:27:22 -07:00
Nico Weber
1dfd49ef99 LibPDF: Implement LabColorSpace 2023-11-05 14:27:22 -07:00
Nico Weber
4a5136fc8c LibPDF: Implement CalGrayColorSpace
I haven't seen this being used in the wild, but it's used in
Tests/LibPDF/colorspaces.pdf.
2023-11-04 17:02:37 -04:00
Nico Weber
a207ab709a LibPDF: In convert_to_srgb(), also apply sRGB curve (ish)
We did convert from the input space to linear space and then
to linear sRGB, but we forgot to re-apply gamma.

This uses the x^2.2 curve instead of the real sRGB curve for now.
2023-11-04 17:02:37 -04:00
Nico Weber
641365b235 LibPDF: Move colorspace conversion functions up a bit
No code change, no behavior change. Pure code move.
2023-11-04 17:02:37 -04:00
Nico Weber
f8799885de LibPDF: Clamp sRGB channels before converting to u8 in CalRGB code
Sometimes the numbers end up just slightly above 1.0f, which previously
caused an overflow.
2023-11-01 11:45:13 -04:00
Nico Weber
bdd2404453 LibPDF: Ignore input whitepoint in convert_to_d65()
CalRGBColorSpace::color() converts into a flat xyz space,
which already takes input whitepoint into account.

It shouldn't be taken into account again when converting from
the flat color space to D65.
2023-11-01 11:45:13 -04:00
Nico Weber
e35a5da2fb LibPDF: Update dead link in a comment 2023-11-01 11:45:13 -04:00
Nico Weber
1fcf0142d2 LibPDF: Fix unfortunate typo in CalRGBColorSpace::create()
We always ignored the /Matrix key in /CalRGB dicts.
2023-11-01 11:45:13 -04:00
Tim Ledbetter
b4296e1c9b LibPDF: Don't use unsanitized values in error messages
Previously, constructing error messages with unsanitized input could
fail because error message strings must be UTF-8.
2023-10-26 11:05:32 +02:00
Nico Weber
f8bf9c6506 LibPDF: Sketch out DeviceN color spaces a bit
Documents using them now show render-time diagnostics instead
of asserting that number of parameters passed to a color don't
match whatever number of channels the previously-set color space
had.

Fixes two asserts on the `-n 500` 0000.zip test set.
2023-10-26 11:05:00 +02:00
Nico Weber
2878af5968 LibPDF: Sketch out Lab color space
Same as other recent color spaces: Enough to make us not assert,
but not enough to actually produce color.

Fixes 2 asserts on the `-n 500` 0000.zip pdfa dataset.
2023-10-26 10:59:45 +02:00
Nico Weber
311cc7d9b9 LibPDF: Implement two SeparationColorSpace methods
Actually using separation color spaces still doesn't work, but we
now no longer assert on them when they're used.

Fixes 2 crashes on the `-n 500` 0000.zip pdfa dataset.
2023-10-25 05:52:47 +02:00
Nico Weber
78dea9500f LibPDF: Make operator parsing use ReadonlySpan instead of Vector
No behavior change.
2023-10-20 14:24:31 -04:00
Nico Weber
e0268dcc87 LibPDF: Allow /Pattern to be used directly as a color space name
Per spec:

"If the color space is one that can be specified by a name and no
additional parameters (DeviceGray, DeviceRGB, DeviceCMYK, and certain
cases of Pattern), the name may be specified directly."

We still don't implement /Pattern color spaces, but now we no longer
crash trying to look up the potentially-nonexistent /ColorSpace
dictionary on the page object when /Pattern is used directly as color
space name.

On top of #21514, reduces number of crashes on 300 random PDFs from the
web (the first 300 from 0000.zip from
https://pdfa.org/new-large-scale-pdf-corpus-now-publicly-available/)
from 42 (14%) to 34 (11%).
2023-10-20 10:35:54 -06:00
Nico Weber
aea0e2f313 LibPDF: Rename ColorSpaceFamily function to may_be_specified_directly()
It used to be called ColorSpaceFamily::never_needs_parameters().

But in the cpp file, the macro arg was called ever_needs_parameters,
and the spec says

"If the color space is one that can be specified by a name and no
additional parameters (DeviceGray, DeviceRGB, DeviceCMYK, and certain
cases of Pattern), the name may be specified directly."

so let's use that language here.

No behavior change.
2023-10-20 10:35:54 -06:00
Nico Weber
33443f7991 LibPDF: Implement ICCBasedColorSpace::number_of_components()
We now no longer crash on images that use an ICC-based color space.
Reduces number of crashes on 300 random PDFs from the web (the first 300
from 0000.zip from
https://pdfa.org/new-large-scale-pdf-corpus-now-publicly-available/)
from 81 (27%) to 64 (21%).

Also fixes all remaining crashes in
411_getting_started_with_instruments.pdf and
513_high_efficiency_image_file_format.pdf.
2023-10-20 08:58:52 +02:00
Nico Weber
5aab31dc40 LibPDF: Dedicated messages for Indexed and Pattern spaces
Makes them easier to interpret in `pdf --debugging-stats` output.
2023-07-24 11:01:25 -04:00
Nico Weber
fad834a21c LibPDF: Add smoke-and-mirror implementation of SeparationColorSpace
None of the methods actually do anything, but we now create an
actual SeparationColorSpace object for /Separation color spaces.

This fixes a crash on page 810 of pdf_reference_1-7.pdf.
Previously, we'd log a "separation color space not supported" error,
which would lead to Renderer not updating its current color space.
It'd stay a DeviceCYMK color space, which would then later assert
when it got a 1-argument array as color (which now the
SeparationColorSpace gets instead, which logs an "unimplemented"
error for that instead of asserting).
2023-07-24 09:52:01 -04:00
Nico Weber
7b825fb44b LibPDF: Replace two TODO()s with Error returns
That way, we render an incomplete page and log a message instead of
crashing the viewer application.

Lets us survive e.g. page 489 of pdf_reference_1-7.pdf.
2023-07-23 11:42:44 -04:00
Matthew Olsson
9a0e1dde42 LibPDF: Propogate errors from ColorSpace::color() 2023-07-20 06:56:41 +01:00
Matthew Olsson
e989008471 LibPDF: Use proper ICC profiles for ICCBasedColorSpace 2023-07-20 06:56:41 +01:00
Timothy Flynn
f3db548a3d AK+Everywhere: Rename FlyString to DeprecatedFlyString
DeprecatedFlyString relies heavily on DeprecatedString's StringImpl, so
let's rename it to A) match the name of DeprecatedString, B) write a new
FlyString class that is tied to String.
2023-01-09 23:00:24 +00:00
Rodrigo Tobar
26f8c0b76c LibPDF: Add more knowledge to ColorSpaces classes
ColorSpaces now can tell users how many components they expect, and the
default decode array that should be used when converting unit bit
sequences into color space component input values during image
rendering.
2022-12-10 10:49:03 +01:00
Rodrigo Tobar
ba16310739 LibPDF: Refactor parsing of ColorSpaces
ColorSpaces can be specified in two ways: with a stream as operands of
the color space operations (CS/cs), or as a separate PDF object, which
is then referred to by other means (e.g., from Image XObjects and other
entities). These two modes of addressing a ColorSpace are slightly
different and need to be addressed separately. However, the current
implementation embedded the full logic of the first case in the routine
that created ColorSpace objects.

This commit refactors the creation of ColorSpace to support both cases.
First, a new ColorSpaceFamily class encapsulates the static aspects of a
family, like its name or whether color space construction never requires
parameters. Then we define the supported ColorSpaceFamily objects.

On top of this also sit a breakage on how ColorSpaces are created. Two
methods are now offered: one only providing construction of no-argument
color spaces (and thus taking a simple name), and another taking an
ArrayObject, hence used to create ColorSpaces requiring arguments.

Finally, on top of *that* two ways to get a color space in the Renderer
are made available: the first creates a ColorSpace with a name and a
Resources dictionary, and another takes an Object. These model the two
addressing modes described above.
2022-12-10 10:49:03 +01:00