I suspected an error in CircularDuplexStream::read(Bytes, size_t). That
does not appear to be the case, but this test case is useful regardless.
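For context, the suspect overload is the seekback read that resolves
DEFLATE back-references out of the 32KiB history window. Conceptually it
behaves like the following self-contained sketch, which uses plain
standard types instead of AK's Bytes and CircularDuplexStream; it is an
illustration of the wrap-around case, not the actual implementation:

#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t window_size = 32 * 1024;

// A circular history buffer with a seekback read, which is what a
// read(Bytes, size_t) style overload does when resolving a back-reference.
struct CircularHistory {
    uint8_t buffer[window_size] = {};
    size_t head = 0; // index one past the most recently written byte

    void write(uint8_t byte)
    {
        buffer[head] = byte;
        head = (head + 1) % window_size; // wraps at the buffer boundary
    }

    // Copy `length` bytes starting `distance` bytes back into `out`. The
    // copied bytes are appended to the history again, as DEFLATE requires,
    // and the source index may wrap past the physical end of the buffer.
    void read_with_seekback(uint8_t* out, size_t length, size_t distance)
    {
        size_t index = (head + window_size - distance) % window_size;
        for (size_t i = 0; i < length; ++i) {
            out[i] = buffer[index];
            index = (index + 1) % window_size;
            write(out[i]);
        }
    }
};

int main()
{
    CircularHistory history;

    // Write a bit more than one full window so the cursor wraps; only the
    // last 0x100 bytes written are ones.
    const size_t total = window_size + 0x80;
    for (size_t i = 0; i < total; ++i)
        history.write(i >= total - 0x100 ? 1 : 0);

    // This back-reference starts at physical index 0x7f80 and has to read
    // across the end of the buffer, continuing at index 0.
    uint8_t out[0x100];
    history.read_with_seekback(out, sizeof(out), 0x100);
    for (auto byte : out)
        assert(byte == 1);
    return 0;
}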
The following script was used to generate the test:
import gzip
uncompressed = []
for _ in range(0x100):
    uncompressed.append(1)
for _ in range(0x7e00):
    uncompressed.append(0)
for _ in range(0x100):
    uncompressed.append(1)
compressed = gzip.compress(bytes(uncompressed))
compressed = ", ".join(f"0x{byte:02x}" for byte in compressed)
print(f"""\
TEST_CASE(gzip_decompress_repeat_around_buffer)
{{
const u8 compressed[] = {{
{compressed}
}};
u8 uncompressed[0x8011];
Bytes{{ uncompressed, sizeof(uncompressed) }}.fill(0);
uncompressed[0x8000] = 1;
const auto decompressed = Compress::GzipDecompressor::decompress_all({{ compressed, sizeof(compressed) }});
EXPECT(compare({{ uncompressed, sizeof(uncompressed) }}, decompressed.bytes()));
}}
""", end="")
Now we have an actual stream implementation that can read arbitrary
DEFLATE-encoded data (dynamic Huffman codes aren't supported yet), even
if the blocks are really large.
And all of that happens with a single 32KiB buffer: DEFLATE limits
back-references to a distance of 32KiB, so that one window is all the
history a decompressor ever needs. DEFLATE is amazing!
Previously, the implementation would produce one Vector<u8> containing
the whole decompressed data. That can be a lot of data and can even
exhaust memory.
With these changes it is still necessary to store the whole input data
in one piece (I am working on this next), but the output can be read
block by block. (That's not optimal either, because blocks can be
arbitrarily large, but it's good enough for now.)
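To illustrate the difference on the consumer side, block-by-block
reading looks roughly like this. This is a sketch written against the
AK::InputStream-style interface, so treat the exact includes, names and
signatures as assumptions rather than the definitive API:

#include <AK/MemoryStream.h>
#include <LibCompress/Deflate.h>

// Instead of decompress_all() materializing everything at once, the
// output is pulled through a fixed-size scratch buffer, so memory use
// stays constant no matter how large the decompressed data is.
void decompress_to_sink(ReadonlyBytes compressed)
{
    InputMemoryStream input { compressed };
    Compress::DeflateDecompressor deflate { input };

    u8 chunk[4096];
    while (!deflate.eof()) {
        const auto nread = deflate.read({ chunk, sizeof(chunk) });
        // ... hand { chunk, nread } to whatever consumes the data ...
    }
}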