Commit graph

295 commits

Author SHA1 Message Date
Tim Ledbetter
b3d5f9748a LibVideo/VP9: Ensure range decoder size is within expected range 2023-11-08 09:13:59 +01:00
Tim Ledbetter
9788576936 LibVideo/VP9: Ensure color space is not set to reserved value 2023-10-11 14:35:47 -04:00
Tim Ledbetter
068f6771ad LibVideo/VP9: Check for invalid subsampled block sizes
Previously, a corrupted block could cause
`Parser::get_subsampled_block_size()` to return an invalid value. We
now return an error in this case.
2023-10-10 23:47:13 +01:00
Tim Ledbetter
569e7173cc LibVideo/VP9: Avoid integer overflow during in place butterfly rotation 2023-10-10 23:47:13 +01:00
Tim Ledbetter
fd3837c63b LibVideo/VP9: Return error for frames with invalid subsampling format
Previously, the program would crash if this condition was encountered.
We now return a decoder error allowing for graceful failure.
2023-10-10 23:47:13 +01:00
kleines Filmröllchen
062e0db46c LibCore: Make MappedFile OwnPtr-based
Since it will become a stream in a little bit, it should behave like all
non-trivial stream classes, who are not primarily intended to have
shared ownership to make closing behavior more predictable. Across all
uses of MappedFile, there is only one use case of shared mapped files in
LibVideo, which now uses the thin SharedMappedFile wrapper.
2023-09-27 03:22:56 +02:00
Timothy Flynn
7c4b0b0389 LibGfx+Userland: Rename Size::scaled_by to Size::scaled
Ignoring Size for a second, we currently have:

    Rect::scale_by
    Rect::scaled

    Point::scale_by
    Point::scaled

In Size, before this patch, we have:

    Size::scale_by
    Size::scaled_by

This aligns Size to use the same method name as Rect and Point. While
subjectively providing API symmetry, this is mostly to allow using this
method in templated helpers without caring what the exact underlying
type is.
2023-08-17 09:57:30 -04:00
Andreas Kling
ddbe6bd7b4 Userland: Rename Core::Object to Core::EventReceiver
This is a more precise description of what this class actually does.
2023-08-06 20:39:51 +02:00
Sam Atkins
3f7d97f098 AK+Libraries: Remove FixedMemoryStream::[readonly_]bytes()
These methods are slightly more convenient than storing the Bytes
separately. However, it it feels unsanitary to reach in and access this
data directly. Both of the users of these already have the
[Readonly]Bytes available in their constructors, and can easily avoid
using these methods, so let's remove them entirely.
2023-07-30 19:32:52 +01:00
Lucas CHOLLET
18b7ddd0b5 AK: Rename the const overload of FixedMemoryStream::bytes()
Due to overload resolutions rules, this simple code provokes a crash:

ReadonlyBytes readonly_bytes{};
FixedMemoryStream stream{readonly_bytes};
ReadonlyBytes give_them_back{stream.bytes()};
    // -> Panics on VERIFY(m_writing_enabled);
    // but this is fine:
auto bytes = static_cast<FixedMemoryStream const&>(*stream).bytes()

If we need to be explicit about it, let's rename the overload instead of
adding that `static_cast`.
2023-07-27 14:40:00 +01:00
Timothy Flynn
c911781c21 Everywhere: Remove needless trailing semi-colons after functions
This is a new option in clang-format-16.
2023-07-08 10:32:56 +01:00
Zaggy1024
888af08638 LibVideo/VP9: Implement lossless residual decoding 2023-07-07 06:44:04 -04:00
Zaggy1024
1789905d4a LibVideo/PlaybackManager: Don't crash when demuxer seek throws an error
`seek_demuxer_to_most_recent_keyframe()` wasn't correctly returning in
cases where an error was thrown by the demuxer. To avoid this, the
function now returns the error, and the playback state handler must act
on it instead, allowing it to exit the seeking state early.
2023-06-25 20:35:37 -04:00
Zaggy1024
cf1cb04af0 LibVideo/Matroska: Don't choke on files containing CRC32 elements
The EBML specification allows for CRC32 elements to be placed as the
first child element of a master element. However, our parsing of master
elements didn't take that into account, so an error would be thrown.

Instead of erroring out, the `parse_master_element()` function will now
skip CRC32 elements that are found as the first child of a master
element. If it is found after the first child, that will be considered
an error.

Void elements will also be skipped by `parse_master_element()`.

Since the `parse_cluster()` function has to seek the stream back to the
cluster's first child in order to allow cues' positions to be used
correctly, `parse_master_element()` had to be changed to return the
first element position, since the callback is not invoked for CRC32
elements. This means that the parameter used to communicate the element
position to the child element parsing function is unused, so that is
removed.
2023-06-25 20:27:02 -04:00
Zaggy1024
24ae35086d LibGfx/LibVideo: Check for overreads only at end of a VPX range decode
Errors are now deferred until `finish_decode()` is finished, meaning
branches to return errors only need to occur at the end of a ranged
decode. If VPX_DEBUG is enabled, a debug message will be printed
immediately when an overread occurs.

Average decoding times for `Tests/LibGfx/test-inputs/4.webp` improve
by about 4.7% with this change, absolute decode times changing from
27.4ms±1.1ms down to 26.1ms±1.0ms.
2023-06-10 07:17:12 +02:00
Zaggy1024
873b0e9470 LibGfx/LibVideo: Read batches of multiple bytes in VPX BooleanDecoder
This does a few things:

- The decoder uses a 32- or 64-bit integer as a reservoir of the data
  being decoded, rather than one single byte as it was previously.
- `read_bool()` only refills the reservoir (value) when the size drops
  below one byte. Previously, it would read out a bit-sized range from
  the data to completely refill the 8-bit value, doing much more work
  than necessary for each individual read.
- VP9-specific code for reading the marker bit was moved to its own
  function in Context.h.
- A debug flag `VPX_DEBUG` was added to optionally enable checking of
  the final bits in a VPX ranged arithmetic decode and ensure that it
  contains all zeroes. These zeroes are a bitstream requirement for
  VP9, and are also present for all our lossy WebP test inputs
  currently. This can be useful to test whether all the data present in
  the range has been consumed.

A lot of the size of this diff comes from the removal of error handling
from all the range decoder reads in LibVideo/VP9 and LibGfx/WebP (VP8),
since it is now checked only at the end of the range.

In a benchmark decoding `Tests/LibGfx/test-inputs/4.webp`, decode times
are improved by about 22.8%, reducing average runtime from 35.5ms±1.1ms
down to 27.4±1.1ms.

This should cause no behavioral changes.
2023-06-10 07:17:12 +02:00
Ben Wiederhake
6a351376aa Everywhere: Only use local includes where appropriate
If a local include does not point to a file in the repository, it should
be a system include instead.
2023-06-06 23:19:50 +02:00
Zaggy1024
ce9f4c3215 LibVideo/PlaybackManager: Return duration zero when an error occurs
Previously, we would unwrap the duration value even when it contained
an error, causing an assertion failure.

Later, we should change it to return the last known sample timestamp
in the media data, allowing for example live-streamed video to have
a semi-useful duration. In general, though, this is not used by
players in the wild, so we can leave it for now.
2023-05-31 13:08:48 +02:00
Nico Weber
1dfb065a9c LibGfx+LibVideo: Make BooleanDecoder usable for both VP8 and VP9
The marker bit is VP9-only, so move that into a new initialize_vp9()
function.

finish_decode() is VP9-only, so rename that to finish_decode_vp9().
2023-05-27 05:47:42 +02:00
Nico Weber
fbc53c1ec3 LibGfx+LibVideo: Move VP9/BooleanDecoder to LibGfx/ImageFormats
...and keep a forwarding header around in VP9, so we don't have to
update all references to the class there.

In time, we probably want to merge LibGfx/ImageDecoders and LibVideo
into LibMedia, but for now I need just this class for the lossy
webp decoder. So move just it over.

No behvior change.
2023-05-27 05:47:42 +02:00
kleines Filmröllchen
fc5cab5c21 Everywhere: Use MonotonicTime instead of Duration
This is easily identifiable by anyone who uses Duration::now_monotonic,
and any downstream users of that data.
2023-05-24 23:18:07 +02:00
kleines Filmröllchen
213025f210 AK: Rename Time to Duration
That's what this class really is; in fact that's what the first line of
the comment says it is.

This commit does not rename the main files, since those will contain
other time-related classes in a little bit.
2023-05-24 23:18:07 +02:00
Nico Weber
da00dadfe7 LibVideo: Rename local variable from "max_bits" to "bits_left"
That's how the member variable it gets copied into is called.

No behavior change.
2023-05-24 04:11:05 -07:00
Nico Weber
e43c21f4d7 LibVideo: Rename BooleanDecoder::initialize param
...from "bytes" to "size_in_bytes".

I thought at first that `bytes` meant the variable contained
bytes that the decoder would read from.

Also, this variable is called `sz` (for `size`) in the spec.

No behavior change.
2023-05-24 04:11:05 -07:00
Nico Weber
35883c337f LibVideo: Remove unused BooleanDecoder::bits_remaining() 2023-05-24 04:11:05 -07:00
Zaggy1024
e31f696ee6 LibVideo/PlaybackManager: Use a function to start the internal timer
In addition, this renames the internal timer to better suit its new
purpose since the playback state handlers were added. Not only is it
used to time frame presentations, but also to poll the queue when
seeking or buffering.
2023-05-23 05:17:48 -06:00
Zaggy1024
71d70df34f LibVideo/PlaybackManager: Decode frames off the main thread
Running decoding on the main thread can cause frames to be delayed due
to the presentation timer waiting for a decode to finish. If a frame
takes too long to decode, this delay can be quite visible. Therefore,
decoding has been moved to a separate thread, with frames sent to the
main thread through a thread-safe queue.

This results in frame times going from being late by up to 16ms to a
very consistent ~1-2ms.
2023-05-23 05:17:48 -06:00
Zaggy1024
eae7422ebc LibVideo: Fallibly construct playback manager fields 2023-05-23 05:17:48 -06:00
Caoimhe
465fa3460f LibVideo: Add PlaybackManager::from_mapped_file
This allows us to create a PlaybackManager from a file which has already
been mapped, instead of passing a file name.

This means that anyone who uses `PlaybackManager` can now use LibFSAC :)
2023-05-14 15:56:49 -06:00
Ben Wiederhake
ee47c0275e Everywhere: Run spellcheck on all documentation 2023-05-07 01:05:09 +02:00
Zaggy1024
1422f7f904 LibVideo/VP9: Revert framebuffer size reduction to allow OOB blocks
The framebuffer size was reduced in f2c0cee, but this caused some niche
block layouts to write outside of the frame.

This could be fixed by adding checks to see if a block being predicted/
reconstructed is within the frame, but the branches introduced by that
reduce performance slightly. Therefore, it's better to keep the
framebuffer sized according to the decoded frame size in 8x8 blocks so
that any block can be decoded without bounds checking.

A test was added to ensure that this continues to work.
2023-05-02 07:00:46 -04:00
Zaggy1024
2ec043c4db LibVideo/VP9: Make inter-prediction fast path accumulators 32-bit
Some occasional cases could cause the accumulator to overflow and have
an incorrect result. It would be nice to use a smaller accumulator, but
it seems not to be correct. :^(

We now cast to i16 to allow 128-bit vectorization to make use of one
whole register instead of having to split the loop into multiple.

This results in about a 5% reduction in performance in my testing.
2023-04-30 05:58:27 +02:00
Zaggy1024
b10da81c7c LibVideo: Fast-path converting colors by only matrix coefficients
We don't need to run through the whole floating-point color converter
for videos that use sRGB transfer characteristics and BT.709 color
primaries. This commit adds a new templated inlining function to
ColorConverter to do a very fast fixed-point YCbCr to RGB conversion.

With the fast path, frame conversion times go from ~7.8ms down to
~3.7ms. The fast path can benefit a lot more from extra SIMD vector
width, as well.
2023-04-25 17:44:36 -04:00
Zaggy1024
d6b867ba89 LibVideo/VP9: Force inlining of inverse_transform_2d() and the IDCT
Clang was reluctant to inline these for some reason. However, inlining
them seems to be quite beneficial, reducing decoding time in an intra-
heavy video by about 21% (~12.7s -> ~10.0s).
2023-04-25 17:44:36 -04:00
Zaggy1024
90c0e1ad8f LibVideo/VP9: Pre-calculate the quantizers at the start of each frame
Quantizers are a constant for the whole frame, except when segment
features override them, in which case they are a constant per segment
ID. We take advantage of this by pre-calculating those after reading
the quantization parameters and segmentation features for a frame.
This results in a small 1.5% improvement (~12.9s -> ~12.7s).
2023-04-25 17:44:36 -04:00
Zaggy1024
094b0d8a78 LibVideo/VP9: Use an enum to select segment features
This throws out some ugly `#define`s we had that were taking the role
of an enum anyway. We now have some nice getters in the contexts that
take the place of the combo of `seg_feature_active()` and then doing a
lookup in `FrameContext::m_segmentation_features` directly.
2023-04-25 17:44:36 -04:00
Zaggy1024
6e6cc1ddb2 LibVideo/VP9: Make a lookup table for bit reversals
Bit reversals are used very often in intra-predicted frames. Turning
these into a constexpr lookup table reduces the branching needed for
block transforms significantly. This reduces the times spent decoding
an intra-heavy 1080p video by about 9% (~14.3s -> ~12.9s).
2023-04-25 17:44:36 -04:00
Zaggy1024
f6764beead LibVideo/VP9: Specialize transforms on their block size
Previously, the block sizes would be checked at runtime to
determine the transform size to apply for residuals. Making the block
sizes into constant expressions allows all the loops to be unrolled
and reduces branching significantly.

This results in about a 26% improvement (~18s -> ~13.2s) in speed in an
intra-heavy test video.
2023-04-25 17:44:36 -04:00
Zaggy1024
5b4c1056f1 LibVideo/Color: Always inline convert_yuv_to_full_range_rgb()
Inlining the color conversion reduces time spent for frame conversions
in a 1080p video from ~12ms down to ~9ms.
2023-04-25 17:44:36 -04:00
Zaggy1024
8ad0dff5c2 LibVideo/VP9: Implement unscaled fast paths in inter prediction
Inter-prediction convolution filters are selected based on the
subpixel position determined for the motion vector relative to the
block being predicted. The subpixel position 0 only uses one single
sample in the center of the convolution, not averaging any other
samples. Let's call this a copy.

Reference frames can also be a different size relative to the frame
being predicted, but in almost every case, that scale will be 1:1
for every single frame in a video.

Taking into account these facts, we can create multiple fast paths for
inter prediction. These fast paths are only active when scaling is 1:1.

If we are doing a copy in both dimensions, then we can do a straight
memcpy from the reference frame to the output block buffer. In videos
where there is no motion, this is a dramatic speedup.

If we are doing a copy in one dimension, we can just do one convolution
and average directly into the output block buffer.

If we aren't doing a copy in either dimension, we can still cut out a
few operations from the convolution loops, since we only need to
advance our samples by whole pixels instead of subpixels.

These fast paths result in about a 34% improvement (~31.2s -> ~20.6s)
in a video which relies heavily on intra-predicted blocks due to high
motion. In videos with less motion, the improvement will be even
greater.

Also, note that the accumulators in these faster loops are only 16-bit.
High bit-depth videos will overflow those, so for now the fast path is
only used for 8-bit videos.
2023-04-25 17:44:36 -04:00
Zaggy1024
8cd72ad1ed LibVideo/VP9: Use the Y scale value in predict_inter_block()
A typo caused the Y scale value to never be used, so if a reference
frame's aspect ratio didn't match up with the current frame's, it would
decode incorrectly.

Some comments have been added to clarify the frame-constants used in
the function as well.
2023-04-25 17:44:36 -04:00
Zaggy1024
7a58577fee LibVideo: Convert subsampled frames in a vectorization-friendly way
This change reduces the time spent converting frames for display in
a 1080p video from ~19ms per frame to ~12ms per frame.
2023-04-25 17:44:36 -04:00
Zaggy1024
f2c0cee522 LibVideo/VP9: Consolidate frame size calculations
This moves all the frame size calculation to `FrameContext`, where the
subsampling is easily accessible to determine the size for each plane.
The internal framebuffer size has also been reduced to the exact frame
size that is output.
2023-04-25 17:44:36 -04:00
Zaggy1024
57c7389200 LibVideo/VP9: Fix rounding of components in the motion vector selection
The division in the `round_mv_...()` functions contained in the motion
vector selection process was done by bit shifting right. However, since
bit shifting negative values will truncate towards the negative end, it
was flooring instead of rounding.

This changes it to match the spec and rely on the compiler to simplify
down to a bit shift.
2023-04-25 17:44:36 -04:00
Timothy Flynn
374a24d84c LibVideo: Remove hook to override LibVideo's playback timers
This was added to allow Ladybird to override the timers with Qt timers.
2023-04-25 18:02:22 +02:00
Zaggy1024
eba72fa3a7 LibVideo/VP9: Wait for workers to finish when there are decoding errors
Previously, the `Parser::decode_tiles()` function wouldn't wait for the
tile-decoding workers to finish before exiting the function, which
could mean that the data the threads are working with could become
invalid if the decoder is deleted after an error is encountered.
2023-04-25 06:35:13 -04:00
Timothy Flynn
f9d8e42636 LibVideo: Allocate Vector2D underlying storage with new, not malloc
Using malloc does not invoke T's constructor, nor were were invoking T's
constructor ourselves. Accessing T without invoking its constructor is
undefined behavior.
2023-04-25 02:11:11 -06:00
Timothy Flynn
8b7c5db186 LibVideo: Do not invoke Vector2D's destructor in order to resize it
This leads to all kinds of undefined behavior, especially when callers
have a view into the Vector2D.
2023-04-24 18:21:01 +02:00
Zaggy1024
036eb82aca LibVideo/VP9: Implement threaded tile column decoding
This adds a new WorkerThread class to run one task asynchronously,
and allow waiting for that thread to finish its work.

TileContexts are placed into multiple tile column vectors with their
streams to read from pre-created. Once those are ready, the threads can
start their work on each vector separately. The main thread waits for
those tasks to finish, then sums up the syntax element counts for each
tile that was decoded.
2023-04-23 23:14:30 +02:00
Zaggy1024
1fcac52e77 LibVideo/VP9: Count syntax elements in TileContext, and sum at the end
Syntax element counters were previously accessed across tiles, which
would cause a race condition updating the counts in a tile-threaded
mode.
2023-04-23 23:14:30 +02:00