beenull/ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2024-11-25 09:00:22 +00:00

Author	SHA1	Message	Date
Hendiadyoin1	5bf84a5b0e	AK: Zero previous pointer after fixing the insertion list in HashTable	2022-06-23 20:25:12 +03:00
Idan Horowitz	eb02425ef9	AK: Clear the previous and next pointers of deleted HashTable buckets Usually the values of the previous and next pointers of deleted buckets are never used, as they're not part of the main ordered bucket chain, but if an in-place rehashing is done, which results in the bucket being turned into a free bucket, the stale pointers will remain, at which point any item that is inserted into said free-bucket will have either a stale previous pointer if the HashTable was empty on insertion, or a stale next pointer, resulting in undefined behaviour. This commit also includes a new HashMap test that reproduces this issue	2022-06-22 21:53:13 +02:00
Vitaly Dyachkov	a0a4d169f4	AK+LibGUI: Pass predicate to *_matching() methods by const reference	2022-05-08 17:02:00 +02:00
Idan Horowitz	086969277e	Everywhere: Run clang-format	2022-04-01 21:24:45 +01:00
kleines Filmröllchen	09a12247fb	AK: Use bucket states with special bit patterns in HashTable This simplifies some of the bucket state handling code, as there's now an easy way of checking the basic category of bucket state.	2022-03-31 12:06:13 +02:00
kleines Filmröllchen	49d29c8298	AK: Rehash HashTable in-place instead of shrinking As seen on TV, HashTable can get "thrashed", i.e. it has a bunch of deleted buckets that count towards the load factor. This means that hash tables which are large enough for their contents need to be resized. This was fixed in `9d8da16` with a workaround that shrinks the HashTable back down in these cases, as after the resize and re-hash the load factor is very low again. However, that's not a good solution. If you insert and remove repeatedly around a size boundary, you might get frequent resizes, which involve frequent re-allocations. The new solution is an in-place rehashing algorithm that I came up with. (Do complain to me, I'm at fault.) Basically, it iterates the buckets and re-hashes the used buckets while marking the deleted slots empty. The issue arises with collisions in the re-hash. For this reason, there are two kinds of used buckets during the re-hashing: the normal "used" buckets, which are old and are treated as free space, and the "re-hashed" buckets, which are new and treated as used space, i.e. they trigger probing. Therefore, the procedure for relocating a bucket's contents is as follows: - Locate the "real" bucket of the contents with the hash. That bucket is the starting point for the target bucket, and the current (old) bucket is the bucket we want to move. - While we still need to move the bucket: - If we're the target, something strange happened last iteration or we just re-hashed to the same location. We're done. - If the target is empty or deleted, just move the bucket. We're done. - If the target is a re-hashed full bucket, we probe by double-hashing our hash as usual. Henceforth, we move our target for the next iteration. - If the target is an old full bucket, we swap the target and to-move buckets. Therefore, the bucket to move is at the correct location and the former target, which still needs to find a new place, is now in the bucket to move. So we can just continue with the loop; the target is re-obtained from the bucket to move. This happens for each and every bucket, though some buckets are "coincidentally" moved before their point of iteration is reached. Either way, this guarantees full in-place movement (even without stack storage) and therefore space complexity of O(1). Time complexity is amortized O(2n) assuming a good hashing function. This leads to a performance improvement of ~30% on the benchmark introduced with the last commit. Co-authored-by: Hendiadyoin1 <leon.a@serenityos.org>	2022-03-31 12:06:13 +02:00
kleines Filmröllchen	bcb8937898	AK: Merge HashTable bucket state into one enum The hash table buckets had three different state booleans that are in fact exclusive. In preparation for further states, this commit consolidates them into one enum. This has the added benefit of not relying on the compiler's boolean packing anymore; we definitely now only need one byte for the bucket state.	2022-03-31 12:06:13 +02:00
Daniel Bertalan	e3eb68dd58	AK+Kernel: Avoid double memory clearing of HashTable buckets Since the allocated memory is going to be zeroed immediately anyway, let's avoid redundantly scrubbing it with MALLOC_SCRUB_BYTE just before that. The latest versions of gcc and Clang can automatically do this malloc + memset -> calloc optimization, but I've seen a couple of places where it failed to be done. This commit also adds a naive kcalloc function to the kernel that doesn't (yet) eliminate the redundancy like the userland does.	2022-03-15 11:56:46 +01:00
Andreas Kling	9d8da1697e	AK: Automatically shrink HashTable when removing entries If the utilization of a HashTable (size vs capacity) goes below 20%, we'll now shrink the table down to capacity = (size * 2). This fixes an issue where tables would grow infinitely when inserting and removing keys repeatedly. Basically, we would accumulate deleted buckets with nothing reclaiming them, and eventually decide that we needed to grow the table (because we grow if used+deleted > limit!) I found this because HashTable iteration was taking a suspicious amount of time in Core::EventLoop::get_next_timer_expiration(). Turns out the timer table kept growing in capacity over time. That made iteration slower and slower since HashTable iterators visit every bucket.	2022-03-07 00:08:22 +01:00
Andreas Kling	eb829924da	AK: Remove return value from HashTable::remove() and HashMap::remove() This was only used by remove_all_matching(), where it's no longer used.	2022-03-07 00:08:22 +01:00
Andreas Kling	623bdd8b6a	AK: Simplify HashTable::remove_all_matching() Just walk the table from start to finish, deleting buckets as we go. This removes the need for remove() to return an iterator, which is preventing me from implementing hash table auto-shrinking.	2022-03-07 00:08:22 +01:00
Idan Horowitz	9b0d90a71d	AK: Support using custom comparison operations for hash compatible keys	2022-01-29 23:01:23 +02:00
James Puleo	10b25d2a57	AK: Implement `HashTable::try_ensure_capacity`, as used in `HashMap` This was used in `HashMap::try_ensure_capacity`, but was missing from `HashTable`s implementation. No one had used `HashMap::try_ensure_capacity` before so it went unnoticed!	2022-01-25 09:17:22 +01:00
Andreas Kling	5279a04c78	AK: Make Hash{Map,Table}::remove_all_matching() return removal success These functions now return whether one or more entries were removed.	2022-01-05 18:57:14 +01:00
Andreas Kling	54cf42fac1	AK: Add HashTable::remove_all_matching(predicate) This removes all matching entries from a table in a single pass.	2022-01-05 18:57:14 +01:00
Hendiadyoin1	c673b7220a	AK: Enable fast path for removal by hash-compatible key in HashMap/Table	2021-12-15 23:35:14 -08:00
Hendiadyoin1	d50360f5dd	AK: Allow hash-compatible key types in Hash[Table\|Map] lookup This will allow us to avoid some potentially expensive type conversion during lookup, like from String to StringView, which would allocate memory otherwise.	2021-12-15 13:09:49 +03:30
Andrew Kaster	762b92c650	AK: Resolve clang-tidy readability-qualified-auto warnings ... In files included by Kernel/Process.cpp and Kernel/Thread.cpp	2021-11-14 22:52:35 +01:00
Andrew Kaster	22feb9d47b	AK: Resolve clang-tidy readability-bool-conversion warnings ... In files included by Kernel/Process.cpp and Kernel/Thread.cpp	2021-11-14 22:52:35 +01:00
Hendiadyoin1	f76241914c	AK: Allow to clear HashTables/Maps with capacity	2021-11-11 09:19:17 +01:00
Andreas Kling	9d1f238450	AK: Make HashTable and HashMap try_* functions return ErrorOr<T> This allows us to use TRY() and MUST() with them.	2021-11-11 01:27:46 +01:00
Ben Wiederhake	f8d7b4daea	AK: Add missing headers Example failure: IDAllocator.h only pulls in AK/Hashtable.h, so any compilation unit that includes AK/IDAllocator.h without including AK/Traits.h before it used to be doomed to fail with the cryptic error message "In instantiation of 'AK::HashTable<T, TraitsForT, IsOrdered>::Iterator AK::HashTable<T, TraitsForT, IsOrdered>::find(const T&) [with T = int; TraitsForT = AK::Traits: incomplete type 'AK::Traits<int>' used in nested name specifier".	2021-10-06 23:52:40 +01:00
Hendiadyoin1	93cf01ad7d	AK: Mark HashTable::size_in_bytes() as constexpr	2021-09-10 14:33:53 +00:00
Hediadyoin1	1aa527f5b6	AK: Add OOM safe interface to HashTable/Map This adds a new HashSetResult only returned by try_set, to signal allocation failure during setting.	2021-09-10 14:33:53 +00:00
Andreas Kling	6ad427993a	Everywhere: Behaviour => Behavior	2021-09-07 13:53:14 +02:00
Andreas Kling	a940a8bf37	AK: Remove unused private HashTable::lookup_for_reading()	2021-07-21 18:18:51 +02:00
Andreas Kling	f65b039c44	AK: Sprinkle [[nodiscard]] on HashMap and HashTable	2021-07-21 18:18:29 +02:00
ngc6302h	213e2af281	HashTable: Rename finders with a more accurate and self-describing name	2021-07-13 17:31:00 +02:00
Andreas Kling	3aabace9f5	AK: Use kfree_sized() in AK::HashTable	2021-07-11 14:14:51 +02:00
Hediadyoin1	4a81c79909	AK: Add Ordering support to HashTable and HashMap Adds an IsOrdered flag to HashTable and HashMap, which allows iteration in insertion order	2021-06-15 22:16:55 +02:00
Idan Horowitz	71c54198fa	AK: Allow changing the HashTable behaviour for sets on existing entries Specifically, replacing the existing entry or just keeping it and canceling the set.	2021-06-09 11:48:04 +01:00
Andreas Kling	c584421592	AK: Make HashTable::operator=(HashTable&&) clear the moved-from table This is consistent with how other AK containers behave when moved from.	2021-05-30 14:34:32 +02:00
Gunnar Beutner	f89e8fb71a	AK+LibC: Implement malloc_good_size() and use it for Vector/HashTable This implements the macOS API malloc_good_size() which returns the true allocation size for a given requested allocation size. This allows us to make use of all the available memory in a malloc chunk. For example, for a malloc request of 35 bytes our malloc would internally use a chunk of size 64, however the remaining 29 bytes would be unused. Knowing the true allocation size allows us to request more usable memory that would otherwise be wasted and make that available for Vector, HashTable and potentially other callers in the future.	2021-05-15 16:30:14 +02:00
Brian Gianforcaro	1682f0b760	Everything: Move to SPDX license identifiers in all files. SPDX License Identifiers are a more compact / standardized way of representing file license information. See: https://spdx.dev/resources/use/#identifiers This was done with the `ambr` search and replace tool. ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *	2021-04-22 11:22:27 +02:00
Brian Gianforcaro	0593fa4dcb	AK: Annotate HashTable functions as [[nodiscard]]	2021-04-11 12:50:33 +02:00
Brian Gianforcaro	15b94ec1fd	AK: Make HashTable with capacity constructor explicit	2021-04-11 12:50:33 +02:00
thislooksfun	e55b8712d4	AK: Inline HashTable writing bucket lookup The old approach was more complex and also had a very bad edge case with lots of collisions. This approach eliminates that possibility. It also makes both reading and writing lookups a little bit faster.	2021-04-02 12:54:54 +02:00
thislooksfun	509eb10df4	AK: Inline the bucket index calculation The result of the modulo is only used in the array index, so why make the code more complex by calculating it in two different places?	2021-04-02 12:54:54 +02:00
Andreas Kling	ef1e5db1d0	Everywhere: Remove klog(), dbg() and purge all LogStream usage :^) Good-bye LogStream. Long live AK::Format!	2021-03-12 17:29:37 +01:00
Andreas Kling	5d180d1f99	Everywhere: Rename ASSERT => VERIFY (...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED) Since all of these checks are done in release builds as well, let's rename them to VERIFY to prevent confusion, as everyone is used to assertions being compiled out in release. We can introduce a new ASSERT macro that is specifically for debug checks, but I'm doing this wholesale conversion first since we've accumulated thousands of these already, and it's not immediately obvious which ones are suitable for ASSERT.	2021-02-23 20:56:54 +01:00
Lenny Maiorani	537bedbf38	HashTable: Correctly pass args to set Problem: - Using regular functions rather than function templates results in the arguments not being deduced. This then requires the same function to be written multiple times and for `move` to be used rather than `forward`. Solution: - Collapse multiple function overloads to a single function template with a deduced argument. This allows the argument to be a forwarding reference and bind to either an l-value or r-value and forward the value.	2021-01-31 10:48:12 +01:00
Lenny Maiorani	e6f907a155	AK: Simplify constructors and conversions from nullptr_t Problem: - Many constructors are defined as `{}` rather than using the ` = default` compiler-provided constructor. - Some types provide an implicit conversion operator from `nullptr_t` instead of requiring the caller to default construct. This violates the C++ Core Guidelines suggestion to declare single-argument constructors explicit (https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c46-by-default-declare-single-argument-constructors-explicit). Solution: - Change default constructors to use the compiler-provided default constructor. - Remove implicit conversion operators from `nullptr_t` and change usage to enforce type consistency without conversion.	2021-01-12 09:11:45 +01:00
Dano Perniš	3efd4c105f	AK: Reduce memory writes in HashTable destructor	2020-10-18 14:44:23 +02:00
Dano Perniš	d30c559774	AK: Implement HashTable assignment in terms of swap	2020-10-18 14:44:23 +02:00
Dano Perniš	7f3f63dd92	AK: Provide swap() for HashTable	2020-10-18 14:44:23 +02:00
Andreas Kling	7ad8bb5be6	AK: Tune HashTable load factor Double the capacity when used+deleted buckets crosses 60% of capacity. This appears to be a sweet spot for performance based on some ad-hoc testing with test-js. :^)	2020-10-16 08:47:10 +02:00
Andreas Kling	4e50079f36	AK: Redesign HashTable to use closed hashing Instead of each hash bucket being a SinglyLinkedList, switch to using closed hashing (open addressing). Buckets are chained together via double hashing (hashing the hash until we find an unused bucket.) This greatly reduces malloc traffic, since each added element no longer allocates a new linked list node. Appears performance neutral on test-js. Can definitely be tuned and could use proper management of load factor, etc.	2020-10-15 23:49:53 +02:00
Muhammad Zahalqa	a68650a7b4	AK: HashTable add a constructor that allows preallocation of capacity + Use in CppLexer. (#3147) 1. Add general utility to get array number of elements. 2. Add Needed API to AK::HashTable 3. Refactor CppLexer initialization	2020-08-16 11:04:00 +02:00
Tom	dadd53e4f2	AK: HashTable/HashMap return whether action was performed for set/remove This allows performing an action based on whether something was actually added or removed without having to look it up prior to calling set() or remove().	2020-07-09 21:58:07 +02:00
William McPherson	121e7306c3	AK: Expose SinglyLinkedListIterator constructor This commit replaces SinglyLinkedListIterator::universal_end() with an empty SinglyLinkedListIterator(). Piano needs this in order to initialize a member array of iterators without 84 lines of universal_end().	2020-02-27 10:21:13 +01:00
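
The in-place rehashing procedure described in commit 49d29c8298 can be illustrated with a small standalone sketch. This is not the AK::HashTable code: every name here (`Bucket`, `State`, `hash_of`, `insert_value`, `erase_value`, `contains`, `rehash_in_place`) is hypothetical, the table stores plain `uint32_t` values, and linear probing is swapped in for the double hashing the commit describes, because it makes termination easy to see in a toy example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical bucket states; Rehashed only exists while a rehash is running.
enum class State : uint8_t { Free, Used, Deleted, Rehashed };

struct Bucket {
    State state = State::Free;
    uint32_t value = 0;
};

// Assumed toy hash (multiplicative); not the hash AK uses.
static uint32_t hash_of(uint32_t v) { return v * 2654435761u; }

// Insert with linear probing. Assumes v is not already present.
bool insert_value(std::vector<Bucket>& t, uint32_t v) {
    size_t n = t.size(), idx = hash_of(v) % n;
    for (size_t probes = 0; probes < n; ++probes, idx = (idx + 1) % n) {
        if (t[idx].state != State::Used) {
            t[idx].state = State::Used;
            t[idx].value = v;
            return true;
        }
    }
    return false; // table is full
}

bool contains(std::vector<Bucket> const& t, uint32_t v) {
    size_t n = t.size(), idx = hash_of(v) % n;
    for (size_t probes = 0; probes < n; ++probes, idx = (idx + 1) % n) {
        if (t[idx].state == State::Free)
            return false; // end of the probe chain
        if (t[idx].state == State::Used && t[idx].value == v)
            return true;
    }
    return false;
}

// Removal leaves a Deleted marker so later probe chains stay intact.
bool erase_value(std::vector<Bucket>& t, uint32_t v) {
    size_t n = t.size(), idx = hash_of(v) % n;
    for (size_t probes = 0; probes < n; ++probes, idx = (idx + 1) % n) {
        if (t[idx].state == State::Free)
            return false;
        if (t[idx].state == State::Used && t[idx].value == v) {
            t[idx].state = State::Deleted;
            return true;
        }
    }
    return false;
}

// Rebuild probe chains without allocating: Deleted buckets become Free,
// old Used buckets count as free space, Rehashed buckets trigger probing.
void rehash_in_place(std::vector<Bucket>& t) {
    size_t n = t.size();
    for (size_t i = 0; i < n; ++i) {
        if (t[i].state == State::Deleted)
            t[i].state = State::Free;
        if (t[i].state != State::Used)
            continue; // Free or already Rehashed
        size_t target = hash_of(t[i].value) % n;
        while (true) {
            if (target == i) { // we are our own target: stay put
                t[i].state = State::Rehashed;
                break;
            }
            Bucket& dst = t[target];
            if (dst.state == State::Free || dst.state == State::Deleted) {
                dst.value = t[i].value; // plain move into empty space
                dst.state = State::Rehashed;
                t[i].state = State::Free;
                break;
            }
            if (dst.state == State::Rehashed) {
                target = (target + 1) % n; // collision with new data: probe on
                continue;
            }
            // Old Used bucket: swap, then keep relocating the displaced value.
            std::swap(dst.value, t[i].value);
            dst.state = State::Rehashed;
            target = hash_of(t[i].value) % n;
        }
    }
    for (Bucket& b : t)
        if (b.state == State::Rehashed)
            b.state = State::Used;
}
```

After `rehash_in_place` returns, no bucket is left in the Deleted state and every remaining value is still reachable from its home bucket, without allocating a second table.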


81 commits