GzipCompressor is currently written assuming that it's write_some method
is only called once. When we use this class for LibWeb, we may very well
receive data to compress in small chunks. So this patch makes us write
the gzip header and footer only once, which now resembles the zlib and
deflate compressors.
OutputMemoryStream was originally a proxy for DuplexMemoryStream that
did not expose any reading API.
Now I need to add another class that is like OutputMemoryStream but only
for static buffers. My first idea was to make OutputMemoryStream do that
too, but I think it's much better to have a distinct class for that.
I originally wanted to call that class FixedOutputMemoryStream but that
name is really cumbersome and it's a bit unintuitive because
InputMemoryStream is already reading from a fixed buffer.
So let's just use DuplexMemoryStream instead of OutputMemoryStream for
any dynamic stuff and create a new OutputMemoryStream for static
buffers.
Consider the following snippet:
void foo(InputStream& stream) {
if(!stream.eof()) {
u8 byte;
stream >> byte;
}
}
There is a very subtle bug in this snippet, for some input streams eof()
might return false even if no more data can be read. In this case an
error flag would be set on the stream.
Until now I've always ensured that this is not the case, but this made
the implementation of eof() unnecessarily complicated.
InputFileStream::eof had to keep a ByteBuffer around just to make this
possible. That meant a ton of unnecessary copies just to get a reliable
eof().
In most cases it isn't actually necessary to have a reliable eof()
implementation.
In most other cases a reliable eof() is avaliable anyways because in
some cases like InputMemoryStream it is very easy to implement.
The streaming operator doesn't short-circuit, consider the following
snippet:
void foo(InputStream& stream) {
int a, b;
stream >> a >> b;
}
If the first read fails, the second is called regardless. It should be
well defined what happens in this case: nothing.
I suspected an error in CircularDuplexStream::read(Bytes, size_t). This
does not appear to be the case, this test case is useful regardless.
The following script was used to generate the test:
import gzip
uncompressed = []
for _ in range(0x100):
uncompressed.append(1)
for _ in range(0x7e00):
uncompressed.append(0)
for _ in range(0x100):
uncompressed.append(1)
compressed = gzip.compress(bytes(uncompressed))
compressed = ", ".join(f"0x{byte:02x}" for byte in compressed)
print(f"""\
TEST_CASE(gzip_decompress_repeat_around_buffer)
{{
const u8 compressed[] = {{
{compressed}
}};
u8 uncompressed[0x8011];
Bytes{{ uncompressed, sizeof(uncompressed) }}.fill(0);
uncompressed[0x8000] = 1;
const auto decompressed = Compress::GzipDecompressor::decompress_all({{ compressed, sizeof(compressed) }});
EXPECT(compare({{ uncompressed, sizeof(uncompressed) }}, decompressed.bytes()));
}}
""", end="")