Commit graph

89 commits

Author SHA1 Message Date
Nico Weber
730876fda9 LibGfx/JPEG: Add a comment to inverse_dct_8x8()
See here:
https://github.com/SerenityOS/serenity/issues/22739#issuecomment-1890599116

No behavior change.
2024-03-23 09:40:29 +01:00
Lucas CHOLLET
fb81668d8f LibGfx/JPEGLoader: Check earlier for quantization tables presence
This patch brings few small QoL improvements:
 - We don't need to read the Huffman stream before returning an error
   due to a missing quantization table.
 - We check the table presence only once per scan instead of once per
   MCU.
 - `dequantize()` is now infallible.
2024-02-26 20:13:25 +00:00
Nico Weber
607880cbd3 LibGfx/JPEGLoader: Add dbgln_if() when hitting unsupported marker 2024-02-21 17:54:53 +01:00
Nico Weber
95391fafcb LibGfx/JPEGLoader: Print offset in an error dbgln() in hex 2024-02-21 17:54:53 +01:00
Nico Weber
24a469f521 Everywhere: Prefer {:#x} over 0x{:x} in format strings
The former automatically adapts the prefix to binary and octal
output, and is what we already use in the majority of cases.

Patch generated by:

    rg -l '0x\{' | xargs sed -i '' -e 's/0x{:/{:#/'

I ran it 4 times (until it stopped changing things) since each
invocation only converted one instance per line.

No behavior change.
2024-02-21 17:54:38 +01:00
Nico Weber
d7f04c9aa1 LibGfx/JPEGLoader: Make byte_offset() return offset from start of stream
JPEGStream::byte_offset() now returns an offset relative to the start
of the stream, instead of relative to the buffered part.

No behavior change except if JPEG_DEBUG is set.
2024-02-08 07:45:34 -07:00
Nico Weber
69964e10f4 LibGfx+Tests: Improve calculation of restart interval
JPEGs can store a `restart_interval`, which controls how many
minimum coded units (MCUs) apart the stream state resets.
This can be used for error correction, decoding parts of a jpeg
in parallel, etc.

We tried to use

    u32 i = vcursor * context.mblock_meta.hpadded_count + hcursor;
    i % (context.dc_restart_interval *
         context.sampling_factors.vertical *
         context.sampling_factors.horizontal) == 0

to check if we hit a multiple of an MCU.

`hcursor` is the horizontal offset into 8x8 blocks, vcursor the
vertical offset, and hpadded_count stores how many 8x8 blocks
we have per row, padded to a multiple of the sampling factor.

This isn't quite right if hcursor isn't divisible by both
the vertical and horizontal sampling factor. Tweak things so
that they work.

Also rename `i` to `number_of_mcus_decoded_so_far` since that
what it is, at least now.

For the test case, I converted an existing image to a ppm:

    Build/lagom/bin/image -o out.ppm \
        Tests/LibGfx/test-inputs/jpg/12-bit.jpg

Then I resized it to 102x77px in Photoshop and saved it again.
Then I turned it into a jpeg like so:

    path/to/cjpeg \
        -outfile Tests/LibGfx/test-inputs/jpg/odd-restart.jpg \
        -sample 2x2,1x1,1x1 -quality 5 -restart 3B out.ppm

The trick here is to:

a) Pick a size that's not divisible by the data size width (8),
   and that when rounded to a block size (13) still isn't divisible
   by the subsample factor -- done by picking a width of 102.
b) Pick a huffman table that doesn't happen to contain the bit
   pattern for a restart marker, so that reading a restart marker
   from the bitstream as data causes a failure (-quality 5 happens
   to do this)
c) Pick a restart interval where we fail to skip it if our calculation
   is off (-restart 3B)

Together with #22987, fixes #22780.
2024-01-30 14:50:43 +01:00
Nico Weber
1ed9e597a9 LibGfx/JPEGLoader: Use 1 / sqrt(8) instead of rsqrt(8)
At least on arm64, rsqrt(8) has noticeably worse precision:
https://github.com/SerenityOS/serenity/issues/22739#issuecomment-1912909835
2024-01-30 10:02:33 +01:00
Nico Weber
6971ba35d5 LibGfx+Tests: Support grayscale jpegs with 2x2 sampling and MCU reset
Non-interleaved files always have an MCU of one data unit.

(A "data unit" is an 8x8 tile of pixels, and an "MCU" is a
"minium coded unit", e.g. 2x2 data units for luminance and
1 data unit each for Cr and Cb for a YCrCb image with
4:2:0 subsampling.)

For the test case, I converted an existing image to a ppm:

    Build/lagom/bin/image -o out.ppm \
        Tests/LibGfx/test-inputs/jpg/12-bit.jpg

Then I converted it to grayscale and saved it as a pgm in Photoshop.
Then I turned it into a weird jpeg like so:

    path/to/cjpeg \
        -outfile Tests/LibGfx/test-inputs/jpg/grayscale_mcu.jpg \
        -sample 2x2 -restart 3 out.pgm

Makes 3 of the 5 jpegs failing to decode at #22780 go.
2024-01-30 05:35:22 +01:00
Nico Weber
002eb0ad03 LibGfx/JPEGLoader: Use ceil_div
No behavior change.
2024-01-29 13:33:10 -05:00
Nico Weber
393fb1f7b9 LibGfx/JPEGLoader: Pass SamplingFactors instead of Component to function
That's all this function reads from Component.

Also rename from validate_luma_and_modify_context() to
validate_sampling_factors_and_modify_context().

No behavior change.
2024-01-29 13:29:44 -05:00
Nico Weber
d89e42902e LibGfx/JPEGLoader: Extract inverse_dct_8x8() function
No behavior change.
2024-01-27 10:21:33 +00:00
Lucas CHOLLET
09b2b3539b LibGfx/JPEG+CMYKBitmap: Extract the CMYK to RGB conversion code
Right now, the JPEG decoder is the only one supporting CMYK, but that
won't last for long. So, let's move the conversion to CMYKBitmap.
2024-01-24 22:16:22 -07:00
Lucas CHOLLET
0462858247 LibGfx/JPEG: Keep the original CMYK data in memory
When decoding a CMYK image and asked for a normal `frame()`, the decoder
would convert the CMYK bitmap into an RGB bitmap. Calling `cmyk_frame()`
after that point will provoke a null-dereference.
2024-01-24 22:16:22 -07:00
Lucas CHOLLET
6eb574a2b6 LibGfx/JPEG: Expose the Exif metadata
The Exif metadata is contained in the APP1 segment. We only need to
call the TIFF decoder to get the metadata back :^).
2024-01-22 20:16:32 -07:00
Nico Weber
cf95910ae2 LibGfx/JPEG: Simplify loops walking all pixels in all macroblocks
When we want to walk everything, we can just do a linear walk.

No behavior change.
2024-01-15 23:04:56 -07:00
Nico Weber
5efe38ccd7 LibGfx/JPEG: Remove a silly initializer
SamplingFactors already has default initializers for its field,
so no need to have an explicit one for the first of the two fields.

No behavior change.
2024-01-15 23:04:56 -07:00
Nico Weber
fbde901614 LibGfx: Use read_effective_chunk_size() in skip_segment()
We missed this one in d184e6014ccd8.

No behavior change in valid JPEGs. No silent underflow in invalid ones.
2024-01-15 19:46:03 +00:00
Nico Weber
3616d14c80 LibGfx/JPEG: Allow decoding more subsampling factors
We now allow all subsampling factors where the subsampling factors
of follow-on components evenly decode the ones of the first component.

In practice, this allows YCCK 2111, CMYK 2112, and CMYK 2111.
2024-01-15 11:20:11 -07:00
Nico Weber
d99d086da3 LibGfx/JPEG: Move subsample-undoing to a separate function
Previously, we handled sampling factors as part of ycbcr_to_rgb().
That meant it worked ok for code paths that used YCbCr ("normal"
jpegs, and the YCC part of YCCK jpegs), but it didn't work for
example for the K channel in YCCK jpegs, nor for CMYK.

By making this a separate pass, it should now work for all cases.
It also makes it easier to support more subsampling arrangements
in the future, and to use something better than nearest neighbor
for upsampling subsampled blocks.
2024-01-15 11:20:11 -07:00
Nico Weber
d8ada20bae LibGfx: Allow images to report that they are originally grayscale
...and implement it in JPEGLoader.

Since it's easy to get the grayscale data off a Bitmap, don't
add a grayscale_frame() accessor.
2024-01-10 09:39:00 +01:00
Nico Weber
239da5132d LibGfx/JPEG: Make it possible to obtain raw CMYK data from JPEGs
frame() still returns a regular RGB Bitmap (now lazily converted
from internal CMYK data), but JPEGImageDecoderPlugin now also
implements cmyk_frame().
2024-01-10 09:39:00 +01:00
Lucas CHOLLET
52ce887b80 LibGfx/JPEG: Introduce a struct to hold sampling factors 2024-01-06 09:08:59 +00:00
Lucas CHOLLET
6c85f937ef LibGfx/JPEG: Print debug information on subsampling for each component 2024-01-06 09:08:59 +00:00
Lucas CHOLLET
145672a8b4 LibGfx/JPEG: Print some debug information about restart markers 2024-01-06 09:08:59 +00:00
Nico Weber
9c5a75067f LibGfx/JPEG: Reject ycck or cmyk jpegs with k subsampled for now
The decoder assumes that k's sampling factor matches y's at the moment.
Better to error out than to silently render something broken.

For ycck, covered by ycck-2111.jpg in the tests.
2023-12-29 18:55:57 +01:00
Nico Weber
a2bd19fdac LibGfx/JPEG: Make ycck jpegs with just cc subsampled decode correctly
We currently assume that the K (black) channel uses the same sampling
as the Y channel already, so this already works as long as we don't
error out on it.
2023-12-29 09:45:31 -05:00
Nico Weber
3c39c18440 LibGfx/JPEG: Add spec comment to read_start_of_frame() 2023-12-27 13:38:25 +00:00
Nico Weber
bfe27228a3 LibPDF+LibGfx: Don't invert CMYK channels in JPEG data in PDFs
This is a hack: Ideally we'd have a CMYK Bitmap pixel format,
and we'd convert to rgb at blit time. Then we could also apply color
profiles (which for CMYK images are CMYK-based).

Also, the colors for our CMYK->RGB conversion are off for PDFs,
and we have distinct codepaths for this in Gfx::Color (for paths)
and JPEGs. So when we fix that, we'll have to fix it in two places.

But this doesn't require a lot of code and it's a huge visual
progression, so let's go with it for now.
2023-11-17 22:32:40 +00:00
Tim Ledbetter
438e9e146c LibGfx/JPEG: Refill reservoir if necessary when discarding bits
This condition was hit 157 times out of the 109,233 JPEG images in the
Govdocs1 corpus. This change allows all of these
images to load correctly.
2023-11-05 09:01:15 +01:00
Tim Ledbetter
9ed8c0b183 LibGfx/JPEG: Propagate errors when creating JPEGLoadingContext
This allows the JPEG fuzzer to make progress.
2023-10-29 21:39:29 +01:00
Tim Ledbetter
023309fdc4 LibGfx/JPEGLoader: Check array access bounds when building lookup table 2023-10-20 07:17:27 +02:00
Tim Ledbetter
dd81bea9ef LibGfx: Don't read past EOF in JPEGLoader
Previously, it was possible to pass JPEGLoader a crafted input which
would read past the end of the stream. We now return an error in such
cases.
2023-10-02 20:09:25 +02:00
MacDue
bbf66ea055 LibGfx: Remove maximum size limit for decoded images
It is unlikely this is needed anymore, and as pointed out things should
now safely return OOM if the bitmap is too large to allocate.

Also, no recently added decoders respected this limit anyway.

Fixes #20872
2023-09-03 14:36:54 +02:00
Lucas CHOLLET
338d64abd9 LibGfx/JPEG: Don't fail to decode images with non-compliant ICC profile
Some images (like https://www.w3.org/Press/Stock/Berners-Lee/2001-europaeum-eighth.jpg)
embed a non-compliant ICC profile. Instead of rejecting the image, we
can simply discard the color profile and resume the decoding of the
bitmap.
2023-07-30 05:10:08 +02:00
Lucas CHOLLET
806808f406 LibGfx: Provide a default implementation for animation-related methods
Most image decoders that we have only support non-animated images,
providing a default implementation for them allows to remove quite some
code.
2023-07-18 14:34:35 +01:00
Lucas CHOLLET
1ae772c7d3 LibGfx/JPEG: Decode the header in create()
This is done as a part of #19893.
2023-07-14 06:16:56 +02:00
Lucas CHOLLET
2f125b86aa LibGfx/JPEG: Remove the unused method initialize() 2023-07-14 06:16:56 +02:00
Lucas CHOLLET
e5b70837de LibGfx: Remove ImageDecoder::set_[non]volatile()
These methods are unused so let's remove them.
2023-07-08 01:45:46 +01:00
Lucas CHOLLET
243e91f5de LibGfx/JPEG: Don't return an error on empty icc profiles
An empty icc profile shouldn't stop our brave decoder.
2023-07-05 17:41:17 +01:00
Lucas CHOLLET
0ef61ab895 LibGfx/JPEG: Convert some west-consts to east-consts 2023-07-05 12:10:56 +01:00
Lucas CHOLLET
af14ed6b2e LibGfx/JPEG: Accept grayscale images with an App14 segment
The adobe specification doesn't even consider JPEG images with a single
component. So let's not consider the content of the App14 segment for
grayscale images.
2023-07-05 12:10:56 +01:00
MacDue
e7cddda7e1 LibGfx: Allow passing an ideal size to image decoders
The ideal size is the size the user will display the image. Raster
formats should ignore this parameter, but vector formats can use
it to generate a bitmap of the ideal size.
2023-07-03 23:54:51 +02:00
Lucas CHOLLET
f52e3e540f LibGfx/JPEG: Add a fast path for sequential JPEGs
Decoding progressive JPEGs involves a much more complicated logic than
sequential JPEGs. Thanks to template specialization, this patch allow us
to skip the additional cost of progressive images when it's not needed.

It gives a nice 10% improvements on sequential JPEGs :^)
2023-07-03 14:26:15 +02:00
Lucas CHOLLET
503720b574 LibGfx/JPEG: Put generic definitions in a shared header
That file holds information that are used by both a decoder and an
encoder.
2023-06-22 21:13:04 +02:00
Lucas CHOLLET
e252b6e258 LibGfx/JPEG: Use a more aggressive inline policy
Some of these functions can be called millions of times even for images
of moderate size. Inlining these functions really helps the compiler and
gives performance improvements up to 10%.
2023-06-10 07:30:02 +02:00
Lucas CHOLLET
5a0d702f21 LibGfx/JPEG: Avoid the copy of each scan
Inside each scan, raw data is read with the following rules:
- Each `0x00` that is preceded by `0xFF` should be discarded
- If multiple `0xFF` are following, only consider one.

That, plus the fact that we don't know the size of the scan beforehand
made us put a prepared stream in a vector for an easy, later on, usage.

This patch remove this duplication by realizing these operations in a
stream-friendly way.
2023-06-10 07:30:02 +02:00
Lucas CHOLLET
9a317267de LibGfx/JPEG: Make JPEGLoadingContext::current_scan optional
This allows us to construct a context without calling `Scan`'s
constructor.
2023-06-10 07:30:02 +02:00
Lucas CHOLLET
df63e14da7 LibGfx/JPEG: Introduce JPEGStream
This class is similar to some extent to an implementation of `Stream`,
but specialized for JPEG reading.

Technically, it is composed of a `Vector` to bufferize the input and
provides a simple interface to read bytes.

This patch aims for a future better and specialized control over the
input data. The encapsulation already allows us to get rid of the last
`seek` in the decoder, thus the ability to use the decoder with a raw
`Stream`.

As it provides faster `read` routines, this patch already reduces time
spend on reading.
2023-06-10 07:30:02 +02:00
Lucas CHOLLET
9e6d91032e LibGfx/JPEG: Use Error to propagate errors
This patch removes a long time remnant of the pre-`AK::Error` era.
2023-06-02 20:07:27 +02:00