Commit graph

20 commits

Author SHA1 Message Date
Tim Schumacher
60ac254df6 AK: Use hashing to accelerate searching a CircularBuffer 2023-07-06 15:06:20 +01:00
Tim Schumacher
4a10cf1506 AK: Make CircularBuffer::read_with_seekback const
Compared to the other read and write functions, this doesn't modify the
internal state of the circular buffer.
2023-07-06 15:06:20 +01:00
Tim Schumacher
42d01b21d8 AK: Rewrite the hint-based CircularBuffer::find_copy_in_seekback
This now searches the memory in blocks, which should be slightly more
efficient. However, it doesn't make much difference (e.g. ~1% in LZMA
compression) in most real-world applications, as the non-hint function
is more expensive by orders of magnitude.
2023-07-06 15:06:20 +01:00
Tim Schumacher
2109f61b0d AK: Add search-related helpers to CircularBuffer
This factors out a lot of complicated math into somewhat understandable
functions.

While at it, rename `next_read_span_with_seekback` to
`next_seekback_span` to keep the naming consistent and to avoid making
function names any longer.
2023-07-06 15:06:20 +01:00
Tim Schumacher
d12036132e AK: Allow passing an offset to CircularBuffer::next_read_span() 2023-07-06 15:06:20 +01:00
Tim Schumacher
046a9faeb3 AK: Split up CircularBuffer::find_copy_in_seekback
The "operation modes" of this function have very different focuses, and
trying to combine both in a way where we share the most amount of code
probably results in the worst performance.

Instead, split up the function into "existing distances" and "no
existing distances" so that we can optimize either case separately.
2023-07-06 15:06:20 +01:00
Tim Schumacher
9e82ad758e AK: Move parts for searching CircularBuffer into a new class
We will be adding extra logic to the CircularBuffer to optimize
searching, but this would negatively impact the performance of
CircularBuffer users that don't need that functionality.
2023-07-06 15:06:20 +01:00
Tim Schumacher
221b91ff61 AK: Add CircularBuffer::find_copy_in_seekback()
This is useful for compressors, which quite frequently need to find a
matching span of data within the seekback.
2023-05-17 09:08:53 +02:00
Lucas CHOLLET
48b000a36c AK: Add CircularBuffer::flush_to_stream
In a similar fashion to what have been done with `fill_from_stream`,
this new method allows to write CircularBuffer's data to a Stream
without additional copies.
2023-05-09 11:18:46 +02:00
Tim Schumacher
b1136ba357 AK: Efficiently resize CircularBuffer seekback copy distance
Previously, if we copied the last byte for a length of 100, we'd
recalculate the read span 100 times and memmove one byte 100 times,
which resulted in a lot of overhead.

Now, if we know that we have two consecutive copies of the data, we just
extend the distance to cover both copies, which halves the number of
times that we recalculate the span and actually call memmove.

This takes the running time of the attached benchmark case from 150ms
down to 15ms.
2023-04-14 10:03:42 +02:00
Tim Schumacher
b88c58b94c AK+LibCompress: Break when seekback copying to a full CircularBuffer
Otherwise, we just end up infinitely looping while waiting for more
space in the destination.
2023-04-05 07:30:38 -04:00
Tim Schumacher
767fb01a8c AK: Report copied bytes when seekback copying from CircularBuffer
Otherwise, we have no way of determining whether our copy was truncated
by accident.
2023-04-05 07:30:38 -04:00
Tim Schumacher
997e745e87 AK: Properly limit the internal seekback span for CircularBuffer
I was originally thinking in the wrong direction when adding this limit,
we can at most read from the buffer until we reach the current write
head. Since that write head is the reference point for the distance,
we need to limit ourselves to that instead of the seekback limit (which
is the maximum of how far back the distance can be).
2023-04-05 07:30:38 -04:00
Timothy Flynn
8b56d82865 AK+LibCompress: Remove the Deflate back-reference intermediate buffer
Instead of reading bytes from the output stream into a buffer, just to
immediately write them back out, we can skip the middle-man and copy the
bytes directly into the output buffer.
2023-03-31 06:56:11 +02:00
Timothy Flynn
5c38b14045 AK: Remove arbitrary 1 KB limit when filling a BufferedStream's buffer
When reading, we currently only fill a BufferedStream's buffer when it
is empty, and only with 1 KB of data. This means that while the buffer
defaults to a size of 16 KB, at least 15 KB is always unused.
2023-03-30 08:47:22 +02:00
Tim Schumacher
52d9fc92f1 AK: Expose the seekback limit of CircularBuffer 2023-03-21 10:25:13 +01:00
Lucas CHOLLET
3b824ec8c9 AK+Test: Fix a logic error in CircularBuffer::offset_of()
This error was introduced by 9a7accdd and had a significant impact on
`BufferedFile` behavior. Hence, we started seeing crash in test262.

By itself, the issue was a wrong calculation of the internal reading
spans when using the `read` and `until` parameters. Which can lead to
at worse crash in VERIFY and at least weird behaviors as missed needles
or detections out of bounds.

It was also accompanied by an erroneous test.

This patch fixes the bug, the test and also provides more tests.
2023-01-15 23:23:24 +00:00
Lucas CHOLLET
9a7accddb7 AK: Add an optional starting offset to CircularBuffer::offset_of
This parameter allows to start searching after an offset. For example,
to resume a search.

It is unfortunately a breaking change in API so this patch also modifies
one user and one test.
2023-01-14 16:20:30 -07:00
Tim Schumacher
d717a08003 AK: Add CircularBuffer::read_with_seekback 2023-01-13 17:34:45 -07:00
Lucas CHOLLET
f12e81b74a AK: Add CircularBuffer
The class is very similar to `CircularDuplexStream` in its behavior.
Main differences are that `CircularBuffer`:
 - does not inherit from `AK::Stream`
 - uses `ErrorOr` for its API
 - is heap allocated (and OOM-Safe)

 This patch also add some tests.
2022-12-31 04:44:17 -07:00