Mikkel Denker
|
5dfeafcb0f
|
faster optics
|
2023-03-21 21:48:43 +01:00 |
|
Mikkel Denker
|
bdd6bc0674
|
actually load highlightjs when needed
|
2023-03-21 15:06:40 +01:00 |
|
Mikkel Denker
|
ea231fc780
|
webgraph didn't properly merge segments. The new segment paths didn't line up with where the segment should actually be stored. This should now be fixed
|
2023-03-21 11:57:44 +01:00 |
|
Mikkel Denker
|
1f64c14a22
|
forgot to remove dbg
|
2023-03-20 16:18:43 +01:00 |
|
Mikkel Denker
|
687db2d8b0
|
only merge webgraph segments in the end
|
2023-03-20 15:36:48 +01:00 |
|
Mikkel Denker
|
2aa6b458c1
|
improve webgraph segment merge speed
|
2023-03-20 15:20:06 +01:00 |
|
Mikkel Denker
|
8277a0521c
|
sometimes you just gotta say fuck async
|
2023-03-20 14:31:44 +01:00 |
|
Mikkel Denker
|
3a5b573cc4
|
use futures executor for single job instead of tokio since we got an error when trying to build the webgraph. It looked like rayon tried to spawn multiple tokio runtimes on the same thread due to the threadpool
|
2023-03-20 13:49:03 +01:00 |
|
Mikkel Denker
|
b003d21c4f
|
hopefully improve webgraph merges by merging in parallel
|
2023-03-20 12:58:47 +01:00 |
|
Mikkel Denker
|
816b9660f1
|
better responsiveness mobile
|
2023-03-20 11:58:19 +01:00 |
|
Mikkel Denker
|
bec41d4f8b
|
fixed new search button borders
|
2023-03-20 09:53:45 +01:00 |
|
Mikkel Denker
|
ad984251b4
|
new ui design
|
2023-03-20 09:47:36 +01:00 |
|
Mikkel Denker
|
913c5502c3
|
let number of webgraph segments depend on number of cores
|
2023-03-17 17:40:42 +01:00 |
|
Mikkel Denker
|
20f822dde4
|
Stack Overflow snippet box text cutoff
|
2023-03-17 17:26:23 +01:00 |
|
Mikkel Denker
|
2cd5e6568b
|
a bunch of frontend quirks. Also made frontend lighter by loading less static files unless they are needed
|
2023-03-17 16:17:10 +01:00 |
|
Mikkel Denker
|
d6cb9cb316
|
simplify webgraph merge logic a little
|
2023-03-15 17:14:34 +01:00 |
|
Mikkel Denker
|
29ee8edfeb
|
forgot to remove old index after merge
|
2023-03-15 15:29:25 +01:00 |
|
Mikkel Denker
|
2b36327fa1
|
way faster similarity index creation. Also improves inverted index merging by dividing the merge into num_cpu merges where each merge merges into a single segment
|
2023-03-15 14:20:57 +01:00 |
|
Mikkel Denker
|
0cc4cac100
|
memory mapped graph store
|
2023-03-14 15:46:53 +01:00 |
|
Mikkel Denker
|
80ac0bd4bc
|
return ranking signals from api
|
2023-03-12 17:38:24 +01:00 |
|
Mikkel Denker
|
ba6cdd4da1
|
boost final score using optics, not just tantivy bm25
|
2023-03-12 16:25:17 +01:00 |
|
Mikkel Denker
|
44f61b98ce
|
prepare for ltr models
|
2023-03-11 18:50:46 +01:00 |
|
Mikkel Denker
|
f74fe6934c
|
Rename to stract
|
2023-03-06 09:43:54 +01:00 |
|
Mikkel Denker
|
2151fcfca8
|
chitchat dynamic cluster membership
|
2023-03-01 10:24:25 +01:00 |
|
Mikkel Denker
|
9ae23fc7a6
|
Log number of failed searches separately from number of successful searches
|
2023-02-28 11:38:53 +01:00 |
|
Mikkel Denker
|
a8dfdf5df8
|
Hours are rounded, not stripped
|
2023-02-28 10:39:39 +01:00 |
|
Mikkel Denker
|
a24715d4d3
|
Update stored query link
|
2023-02-28 10:38:12 +01:00 |
|
Mikkel Denker
|
8f021da1f2
|
Update stored query link
|
2023-02-28 10:37:41 +01:00 |
|
Mikkel Denker
|
2714d9fcc3
|
Usage statistics
|
2023-02-28 10:35:52 +01:00 |
|
Mikkel Denker
|
b91d655b2f
|
change configure command to use new object store
|
2023-02-26 20:25:53 +01:00 |
|
Mikkel Denker
|
53d72eb2f2
|
Export number of search requests as a prometheus metric
|
2023-02-21 16:22:24 +01:00 |
|
Mikkel Denker
|
a2fcf02218
|
Make index smaller by not storing positions for unnecessary fields
|
2023-02-21 10:34:17 +01:00 |
|
Mikkel Denker
|
74c9d1a133
|
Use site rankings for discussion widget
|
2023-02-20 16:42:43 +01:00 |
|
Mikkel Denker
|
cdc2cd2a6f
|
Make next page arrow inactive when there are no more results
|
2023-02-20 16:35:13 +01:00 |
|
Mikkel Denker
|
6abf22abad
|
Fix bug where '’' was not shown in summarizer
|
2023-02-20 16:22:03 +01:00 |
|
Mikkel Denker
|
650e4b6201
|
Ability to turn off QA model in config
|
2023-02-20 15:45:49 +01:00 |
|
Mikkel Denker
|
6d44ec0556
|
Prefer snippets that don't break sentences
|
2023-02-20 15:31:21 +01:00 |
|
Mikkel Denker
|
e04b5069b9
|
removed fastfield cache and introduced a fastfield reader instead
|
2023-02-18 15:45:03 +01:00 |
|
Mikkel Denker
|
a4c142f5f3
|
Merge index segments into num_segments/2 to only merge segments once every num_segments/2 index merges
|
2023-02-17 14:28:07 +01:00 |
|
Mikkel Denker
|
4947fccdfe
|
Adjust just-text default parameters since we now have the option to only index webpages with clean text
|
2023-02-17 14:12:30 +01:00 |
|
Mikkel Denker
|
20d95f5104
|
Summarization feature somewhat finished
|
2023-02-14 14:29:28 +01:00 |
|
Mikkel Denker
|
e6b9cdce5f
|
use beginning of text as extractive summary if query-specific summary fails
|
2023-02-14 13:51:07 +01:00 |
|
Mikkel Denker
|
963052417c
|
cleanup overlapping passages from extractive summary
|
2023-02-14 13:33:44 +01:00 |
|
Mikkel Denker
|
38fa0762a9
|
summarization button and stream summary to frontend
|
2023-02-14 11:06:26 +01:00 |
|
Mikkel Denker
|
0309386dfa
|
Stream output from summarization in iterator
|
2023-02-11 11:44:17 +01:00 |
|
Mikkel Denker
|
7d78b2ecde
|
summarization model length norm
|
2023-02-07 20:16:52 +01:00 |
|
Mikkel Denker
|
2a1fa6109a
|
abstractive summarization model with beam search
|
2023-02-07 15:11:23 +01:00 |
|
Mikkel Denker
|
785e610db8
|
Unravel recursive functions to avoid stack overflow. Summarizer is also WIP (there might be some 'never used' warnings when compiling)
|
2023-02-03 15:57:39 +01:00 |
|
Mikkel Denker
|
a53fd68085
|
Tokenizer index out of bounds
|
2023-02-01 14:55:03 +01:00 |
|
Mikkel Denker
|
8b9753d4a0
|
Optionally skip some warc files when indexing
|
2023-02-01 13:06:26 +01:00 |
|