Usually the values of the previous and next pointers of deleted buckets
are never used, as they're not part of the main ordered bucket chain,
but if an in-place rehashing is done, which results in the bucket being
turned into a free bucket, the stale pointers will remain, at which
point any item that is inserted into said free-bucket will have either
a stale previous pointer if the HashTable was empty on insertion, or a
stale next pointer, resulting in undefined behaviour.
This commit also includes a new HashMap test that reproduces this issue
As seen on TV, HashTable can get "thrashed", i.e. it has a bunch of
deleted buckets that count towards the load factor. This means that hash
tables which are large enough for their contents need to be resized.
This was fixed in 9d8da16 with a workaround that shrinks the HashTable
back down in these cases, as after the resize and re-hash the load
factor is very low again. However, that's not a good solution. If you
insert and remove repeatedly around a size boundary, you might get
frequent resizes, which involve frequent re-allocations.
The new solution is an in-place rehashing algorithm that I came up with.
(Do complain to me, I'm at fault.) Basically, it iterates the buckets
and re-hashes the used buckets while marking the deleted slots empty.
The issue arises with collisions in the re-hash. For this reason, there
are two kinds of used buckets during the re-hashing: the normal "used"
buckets, which are old and are treated as free space, and the
"re-hashed" buckets, which are new and treated as used space, i.e. they
trigger probing. Therefore, the procedure for relocating a bucket's
contents is as follows:
- Locate the "real" bucket of the contents with the hash. That bucket is
the starting point for the target bucket, and the current (old) bucket
is the bucket we want to move.
- While we still need to move the bucket:
- If we're the target, something strange happened last iteration or we
just re-hashed to the same location. We're done.
- If the target is empty or deleted, just move the bucket. We're done.
- If the target is a re-hashed full bucket, we probe by double-hashing
our hash as usual. Henceforth, we move our target for the next
iteration.
- If the target is an old full bucket, we swap the target and to-move
buckets. Therefore, the bucket to move is a the correct location and the
former target, which still needs to find a new place, is now in the
bucket to move. So we can just continue with the loop; the target is
re-obtained from the bucket to move. This happens for each and every
bucket, though some buckets are "coincidentally" moved before their
point of iteration is reached. Either way, this guarantees full in-place
movement (even without stack storage) and therefore space complexity of
O(1). Time complexity is amortized O(2n) asssuming a good hashing
function.
This leads to a performance improvement of ~30% on the benchmark
introduced with the last commit.
Co-authored-by: Hendiadyoin1 <leon.a@serenityos.org>
The hash table buckets had three different state booleans that are in
fact exclusive. In preparation for further states, this commit
consolidates them into one enum. This has the added benefit on not
relying on the compiler's boolean packing anymore; we definitely now
only need one byte for the bucket state.
Since the allocated memory is going to be zeroed immediately anyway,
let's avoid redundantly scrubbing it with MALLOC_SCRUB_BYTE just before
that.
The latest versions of gcc and Clang can automatically do this malloc +
memset -> calloc optimization, but I've seen a couple of places where it
failed to be done.
This commit also adds a naive kcalloc function to the kernel that
doesn't (yet) eliminate the redundancy like the userland does.
If the utilization of a HashTable (size vs capacity) goes below 20%,
we'll now shrink the table down to capacity = (size * 2).
This fixes an issue where tables would grow infinitely when inserting
and removing keys repeatedly. Basically, we would accumulate deleted
buckets with nothing reclaiming them, and eventually deciding that we
needed to grow the table (because we grow if used+deleted > limit!)
I found this because HashTable iteration was taking a suspicious amount
of time in Core::EventLoop::get_next_timer_expiration(). Turns out the
timer table kept growing in capacity over time. That made iteration
slower and slower since HashTable iterators visit every bucket.
Just walk the table from start to finish, deleting buckets as we go.
This removes the need for remove() to return an iterator, which is
preventing me from implementing hash table auto-shrinking.
This was used in `HashMap::try_ensure_capacity`, but was missing from
`HashTable`s implementation. No one had used
`HashMap::try_ensure_capacity` before so it went unnoticed!
This will allow us to avoid some potentially expensive type conversion
during lookup, like form String to StringView, which would allocate
memory otherwise.
Example failure:
IDAllocator.h only pulls in AK/Hashtable.h, so any compilation unit that
includes AK/IDAllocator.h without including AK/Traits.h before it used
to be doomed to fail with the cryptic error message "In instantiation of
'AK::HashTable<T, TraitsForT, IsOrdered>::Iterator AK::HashTable<T,
TraitsForT, IsOrdered>::find(const T&) [with T = int; TraitsForT =
AK::Traits: incomplete type 'AK::Traits<int>' used in nested name
specifier".
This implements the macOS API malloc_good_size() which returns the
true allocation size for a given requested allocation size. This
allows us to make use of all the available memory in a malloc chunk.
For example, for a malloc request of 35 bytes our malloc would
internally use a chunk of size 64, however the remaining 29 bytes
would be unused.
Knowing the true allocation size allows us to request more usable
memory that would otherwise be wasted and make that available for
Vector, HashTable and potentially other callers in the future.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
The old approach was more complex and also had a very bad edge case
with lots of collisions. This approach eliminates that possiblility.
It also makes both reading and writing lookups a little bit faster.
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)
Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.
We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
Problem:
- Using regular functions rather than function templates results in
the arguments not being deduced. This then requires the same
function to be written multiple times and for `move` to be used
rather than `forward`.
Solution:
- Collapse multiple function overloads to a single function template
with a deduced argument. This allows the argument to be a forwarding
reference and bind to either an l-value or r-value and forward the
value.
Problem:
- Many constructors are defined as `{}` rather than using the ` =
default` compiler-provided constructor.
- Some types provide an implicit conversion operator from `nullptr_t`
instead of requiring the caller to default construct. This violates
the C++ Core Guidelines suggestion to declare single-argument
constructors explicit
(https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c46-by-default-declare-single-argument-constructors-explicit).
Solution:
- Change default constructors to use the compiler-provided default
constructor.
- Remove implicit conversion operators from `nullptr_t` and change
usage to enforce type consistency without conversion.
Double the capacity when used+deleted buckets crosses 60% of capacity.
This appears to be a sweet spot for performance based on some ad-hoc
testing with test-js. :^)
Instead of each hash bucket being a SinglyLinkedList, switch to using
closed hashing (open addressing). Buckets are chained together via
double hashing (hashing the hash until we find an unused bucket.)
This greatly reduces malloc traffic, since each added element no longer
allocates a new linked list node.
Appears performance neutral on test-js. Can definitely be tuned and
could use proper management of load factor, etc.
This allows performing an action based on whether something
was actually added or removed without having to look it up
prior to calling set() or remove().
This commit replaces SinglyLinkedListIterator::universal_end() with an
empty SinglyLinkedListIterator(). Piano needs this in order to
initialize a member array of iterators without 84 lines of
universal_end().