This tries to optimize the refill code by making it easier to digest for
the branch predictor. This includes not looping as much across function
calls and marking our EOF case to be unlikely.
Co-Authored-By: Lucas Chollet <lucas.chollet@free.fr>
See identical code in LittleEndianBitStream; even in the bytewise
reading BigEndianBitStream an offset of 8 is not inconsistent state and
handled just fine by read_bits.
Rather than tracking our position in the bit buffer, we can simply shift
away the bits that we read. This is mostly for simplicity, but also does
help performance a bit.
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), compression time
decreases from:
3.96s to 3.79s on Serenity (cold)
1.08s to 1.04s on Serenity (warm)
0.83s to 0.82s on Linux
Our current `peek_bits` function allows retrieving more bits than we can
actually provide, so whenever someone discards the requested bit count
afterwards we were underflowing the value instead.
This is very similar to the LittleEndianInputBitStream bit buffer change
from 8e834d4bb2.
We currently buffer one byte of data for the underlying stream. And when
we put bits onto that buffer, we do so 1 bit at a time.
This replaces the u8 buffer with a u64. And instead of looping at all,
we perform bitwise operations to write the desired number of bits.
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), compression time
decreases from:
13.62s to 10.9s on Serenity (cold)
13.62s to 9.22s on Serenity (warm)
2.93s to 2.32s on Linux
One caveat is that this requires explicitly flushing any leftover bits
when the caller is done with the stream. The byte buffer implementation
implicitly flushed its data every time the buffer was byte-aligned, as
doing so would always fill the byte. This is no longer the case. But for
now, this should be fine as the one user of this class, DEFLATE, already
has a "flush everything now that we're done" finalizer.
Else:
AK/BitStream.h:218:24:
error: inline function '...::lsb_mask<unsigned char>' is not
defined [-Werror,-Wundefined-inline]
static constexpr T lsb_mask(T bits)
^
We current buffer one byte of data from the underlying stream. And when
we pull bits off that buffer, we do so 1 or 8 bits at a time (depending
on whether the buffer is byte aligned). The 1-bit-at-a-time loop is by
far the most common during e.g. GZIP decompression.
This replaces the u8 buffer with a u64. And instead of looping at all,
we perform bitwise operations to extract the desired number of bits.
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), decompression
time decreases from:
242s to 35s on Serenity
11.125s to 3.527s on Linux
Note that BigEndianInputBitStream can also use the same techniques,
and some of the methods here may make sense to live in an endianness-
agnostic base class. The focus is GZIP right now though, which only
uses the little endian stream.
Similar to POSIX read, the basic read and write functions of AK::Stream
do not have a lower limit of how much data they read or write (apart
from "none at all").
Rename the functions to "read some [data]" and "write some [data]" (with
"data" being omitted, since everything here is reading and writing data)
to make them sufficiently distinct from the functions that ensure to
use the entire buffer (which should be the go-to function for most
usages).
No functional changes, just a lot of new FIXMEs.
This patch adds the `USING_AK_GLOBALLY` macro which is enabled by
default, but can be overridden by build flags.
This is a step towards integrating Jakt and AK types.
BigEndianInputBitStream is the Core::Stream API's bitwise input stream
for big endian input data. The functionality and bitwise read API is
almost unchanged from AK::BitStream, except that this bit stream only
supports big endian operations.
As the behavior for mixing big endian and little endian reads on
AK::BitStream is unknown (and untested), it was never done anyways. So
this was a good opportunity to split up big endian and little endian
reading.
Another API improvement from AK::BitStream is the ability to specify
the return type of the bit read function. Always needing to static_cast
the result of BitStream::read_bits_big_endian into the desired type is
adding a lot of avoidable noise to the users (primarily FlacLoader).
The existing InputBitStream methods only read in little endian, as this
is what the rest of the system requires. Two new methods allow the input
bitstream to read bits in big endian as well, while using the existing
state infrastructure.
Note that it can lead to issues if little endian and big endian reads
are used out of order without aligning to a byte boundary first.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
In the case that both the stream and the wrapped substream had errors
to be handled only one of the two would be resolved due to boolean
short circuiting. this commit ensures both are handled irregardless
of one another.
This ensures that when a DeflateCompressor stream is cleared of any
errors its underlying wrapped streams (InputBitStream/InputMemoryStream)
will be cleared as well and wont fail a VERIFY on destruction.
Consider the following snippet:
void foo(InputStream& stream) {
if(!stream.eof()) {
u8 byte;
stream >> byte;
}
}
There is a very subtle bug in this snippet, for some input streams eof()
might return false even if no more data can be read. In this case an
error flag would be set on the stream.
Until now I've always ensured that this is not the case, but this made
the implementation of eof() unnecessarily complicated.
InputFileStream::eof had to keep a ByteBuffer around just to make this
possible. That meant a ton of unnecessary copies just to get a reliable
eof().
In most cases it isn't actually necessary to have a reliable eof()
implementation.
In most other cases a reliable eof() is avaliable anyways because in
some cases like InputMemoryStream it is very easy to implement.
The streaming operator doesn't short-circuit, consider the following
snippet:
void foo(InputStream& stream) {
int a, b;
stream >> a >> b;
}
If the first read fails, the second is called regardless. It should be
well defined what happens in this case: nothing.