12 commits

Author SHA1 Message Date
Mikkel Denker
1d821ef4db rustup update and fix clippy warnings 2024-11-27 17:15:24 +01:00
Mikkel Denker
4915160449 harmonic centrality nearest neighbor calculation that uses the harmonic centrality of the highest neighbors node as a seed node proxy for the centrality of that node (with a discount factor) 2024-10-02 15:57:30 +02:00
Mikkel Denker
4e8c165a1c cleanup temporary directories automatically in tests (#228) 2024-10-01 09:42:14 +02:00
Mikkel Denker
308388262f store node ids as big endian in webgraph so sort is correct during merge 2024-08-13 13:42:35 +02:00
Mikkel Denker
817bda9738 optionally merge all webgraph segments into a single segment for improved read performance 2024-06-09 14:48:11 +02:00
Mikkel Denker
3c4a0c480e use remote webgraph in crawl planner 2024-06-05 09:41:03 +02:00
Mikkel Denker
38416c6070 web-spell cleanup old dicts after merge 2024-05-19 14:01:15 +02:00
Mikkel Denker
7e8781fe5b use binary heap for less cmp when merging speedy-kv segments 2024-05-13 10:44:17 +02:00
Mikkel Denker
19dab37daa update segment paths to new folder during move 2024-05-11 11:29:26 +02:00
Mikkel Denker
9c983e5f96 Top k webgraph edges (#197)
* implement random access index in file_store where keys are u64 and values are serialised to a constant size

* cleanup: move all webgraph store writes into store_writer

* add a 'ConstIterableStore' that can store items on disk without needing to interleave headers in the case that all items can be serialized to a constant number of bytes known up front

* change edges file format to make edges for a given node iterable.
this allows us to only load a subset of the edges for a node in the future

* compress webgraph labels in blocks of 128

* ability to limit number of edges returned by webgraph

* sort edges in webgraph store by the host rank of the opposite node
2024-05-03 09:33:57 +02:00
Mikkel Denker
da4f930b03 cratify file-store 2024-04-22 21:30:59 +02:00
Oliver Bøving
18d9d279fb Cratify bloom and speedy-kv (#193)
* Move bloom into separate crate

* Move speedy_kv into a separate crate

* add licenses

---------

Co-authored-by: Mikkel Denker <mikkel@stract.com>
2024-04-22 21:18:44 +02:00
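
Note on 308388262f (big endian node ids): a minimal sketch of why the byte order could matter for sort order during a merge. It assumes the merge compares keys as raw bytes, which the commit message does not state. The helpers `encode_be`, `encode_le` and `sort_by_encoded_key` are hypothetical names for illustration and are not taken from the webgraph code.

```rust
// Illustrative only: these helpers are hypothetical and not part of the
// webgraph crate. Assumption: keys are compared as raw byte strings.

/// Encode a node id as 8 big-endian bytes (most significant byte first).
fn encode_be(id: u64) -> [u8; 8] {
    id.to_be_bytes()
}

/// Encode a node id as 8 little-endian bytes (least significant byte first).
fn encode_le(id: u64) -> [u8; 8] {
    id.to_le_bytes()
}

/// Sort ids by comparing their encoded bytes lexicographically, the way a
/// byte-keyed store would, and return the ids in the resulting order.
fn sort_by_encoded_key(ids: &[u64], encode: fn(u64) -> [u8; 8]) -> Vec<u64> {
    let mut out = ids.to_vec();
    out.sort_by_key(|&id| encode(id));
    out
}

fn main() {
    let ids = [256u64, 1, 2];
    // Big-endian keys sort in numeric order.
    println!("{:?}", sort_by_encoded_key(&ids, encode_be));
    // Little-endian keys do not: 256 encodes as [0, 1, 0, ...] and sorts first.
    println!("{:?}", sort_by_encoded_key(&ids, encode_le));
}
```

With fixed-width big-endian keys, lexicographic byte order and numeric order agree, so a merge that walks byte-sorted segments would also see node ids in numeric order.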