0ct0pu5/search-engine-stract

Author	SHA1	Message	Date
Mikkel Denker	54fe19ddf6	trystract.com -> stract.com	2023-12-16 14:43:00 +01:00
Mikkel Denker	13f58064d0	fix inline code formatting in stackoverflow widget	2023-12-15 13:00:48 +01:00
Mikkel Denker	5bcad4d681	stackoverflow inline code blocks	2023-12-13 15:59:15 +01:00
Mikkel Denker	b30310cc3a	change host graph back to better be able to rank 'popular.domain.com' differently from 'not-as-popular.domain.com'	2023-12-12 22:02:47 +01:00
Mikkel Denker	4eb8ef3892	Precalculate slow inbound similarities (#114 ) * pre-calculate inbound similarities * only pre-calculate similarities for nodes that have many inbound links * tweak dark theme * parallel inbound pre-calculation	2023-12-12 18:26:25 +01:00
Mikkel Denker	03a3dd6333	[frontend] smaller font-size some places	2023-12-10 14:46:06 +01:00
Mikkel Denker	4929f123e3	only show one definition in thesaurus widget with option to expand	2023-12-09 18:00:29 +01:00
Mikkel Denker	30de42675d	rename similar_sites -> similar_hosts	2023-12-09 14:50:48 +01:00
Mikkel Denker	c2a82eaa14	fediverse optic	2023-12-08 16:14:15 +01:00
Mikkel Denker	9dce1a3931	[api-searcher] refactor functionality into helper structs to keep the searcher a bit more clean	2023-12-08 15:54:04 +01:00
Mikkel Denker	40e02d7df3	Fix logo height=0 in safari	2023-12-05 10:32:33 +01:00
Mikkel Denker	3d56e3120a	Make icon that warns about high likelihood of ads on page a bit smaller	2023-12-04 13:31:22 +01:00
Mikkel Denker	4c8973eb1a	update robots.txt	2023-12-04 08:14:39 +01:00
Mikkel Denker	00107b23e2	add discord and matrix link to about page	2023-12-03 16:21:59 +01:00
Mikkel Denker	a73f100c78	re-enable discussions widget	2023-12-03 12:05:30 +01:00
Mikkel Denker	f30e9131f9	New spell corrector training. I have just now gotten to the point where we can test this, so it definitely doesn't work yet but most of the parts should be there.	2023-11-30 15:47:56 +01:00
Mikkel Denker	97f2f37eb7	new logo text	2023-11-23 10:08:51 +01:00
Mikkel Denker	03178a4185	tone down dark theme a bit	2023-11-23 09:37:26 +01:00
Mikkel Denker	ff2c0ada9f	mark pages which likely contain ads or paywall in the serp	2023-11-18 22:19:41 +01:00
Mikkel Denker	63118f1a22	way faster bitvec similarity calculations by approximating intersection for sites with many inbound links with hyperloglog	2023-11-16 18:45:38 +01:00
Mikkel Denker	3054fe5c6a	Fix crawl plan. Too many spam urls were being prioritized	2023-11-14 14:07:17 +01:00
Mikkel Denker	0b1ee0ec52	host graph is now based on root domain	2023-11-08 15:56:58 +01:00
Mikkel Denker	a6ea0d734c	make live index searchable	2023-11-05 14:18:26 +01:00
Mikkel Denker	34572311d4	Make result likes host-level. liking en.wikipedia.org is not the same as liking wikipedia.org	2023-10-26 10:48:18 +02:00
Mikkel Denker	a41160124e	feed scheduler to prepare for live index	2023-10-24 20:35:49 +02:00
Mikkel Denker	6b1a59f740	matrix server	2023-10-22 18:29:04 +02:00
Mikkel Denker	3b52c67ec7	prevent double search during hydration	2023-10-22 13:42:34 +02:00
Mikkel Denker	7dd81c131b	tilde optic	2023-10-22 11:35:43 +02:00
Mikkel Denker	14543fd91c	fix screen width on mobile	2023-10-16 15:19:44 +02:00
Mikkel Denker	38342798e1	A bunch of performance optimizations during search. * use image link instead of base64 embedded to lower serp size. * embed 'bm25' signal into all the 'bm25_' signals. The math should be equivalent and prevents us calculating the same signals twice. remove proximity ranking signals as they were disabled. Need to figure out how to make them faster. * faster snippet generation by first trying to generate snippet without stemming. This will succeed in most results and is vastly faster.	2023-10-15 11:49:44 +02:00
Mikkel Denker	276165da49	move libtorch behind feature flag	2023-10-14 14:17:54 +02:00
Mikkel Denker	a1f5a18628	fix /webmasters formatting	2023-10-11 16:53:00 +02:00
Mikkel Denker	035339109b	Make result counts optional. This allows us to short circuit the query when we have reached max_docs_considered at the query level instead of at the collector level. This should increase search performance since the searcher then doesn't have to iterate all the results in order to return a count. In practice we probably always want to count the results when calling the api from the frontend, but it's a nice performance parameter we can flip if we need to squeeze every drop of performance from the backend at some point. We also only count the results for api requests that need the count.	2023-10-10 15:48:15 +02:00
Oliver Bøving	a2a3d7d97d	Generate frontend API using abeye (#105 ) * Generate frontend API using abeye * Update CONTRIBUTING.md to include links for deps and section for abeye	2023-10-02 10:35:19 +00:00
Mikkel Denker	1eea731d26	Thesaurus widget and minor bugfixes to calculator. The thesaurus widget is powered by an openwordnet RDF format. This should make it very easy to add more languages when we want to. Phrase searches triggered a string function in fend which caused the calculator to show up. The y-combinator function caused a stack overflow due to the recursion. Both of these bugs have now been fixed.	2023-10-01 18:07:48 +02:00
Oliver Bøving	661aae3c11	Do '' -> '•' on the frontend (#108 ) The old version of `EntityIndex::best_info` manipulated the text of spans without altering the associated link offsets. To do this would be convoluted when having to consider '•', so instead we do not replace '' with '•' any longer, but only strip the prefix of '' and whitespace and then subtract the removed prefix length from all link offsets. When rendering the snippets on the frontend, we perform the '' replacement since the index has been made into snippets and the text is thus free to change.	2023-09-29 10:44:55 +00:00
Mikkel Denker	cb49f611fd	currency conversion in calculator	2023-09-28 10:12:36 +02:00
Oliver Bøving	2d990093eb	Frontend chores (#104 ) * Run `npm run format` * Remove two instances of @ts-ignore by defining the locals in app.d.ts * Align icons on Manage Optics page The eye was centered vertically, while the minus was aligned to the top. This puts them both at the top.	2023-09-25 12:33:28 +00:00
Oliver Bøving	383fc82640	Add theming specific styling to prose (#103 )	2023-09-25 12:32:33 +00:00
Mikkel Denker	23a6fc7cd2	toggle visibility of optics	2023-09-23 12:45:20 +02:00
Mikkel Denker	96ea5530f8	Setting to send search requests with POST instead of GET	2023-09-21 14:59:18 +02:00
Mikkel Denker	2b88af096e	added webgraph endpoints to api	2023-09-18 14:54:52 +02:00
Mikkel Denker	f7512a5595	make discussions widget optional in api	2023-09-15 15:44:59 +02:00
Mikkel Denker	9f10c70712	make searchbar border more visible on low res displays	2023-09-15 13:00:55 +02:00
Oliver Bøving	08fffff1a0	Change many icons for their _mini_ variant (#101 ) A lot of places where we use icons they are quite small, so it is preferable to use the mini aka 20x20 solid variants for readability. Most of the changed icons were already solid, but the following were changed from outline to solid: - Callout icons - Discussion chat bubble	2023-09-14 11:09:57 +00:00
Oliver Bøving	dfa4171c53	Refactor entity code and add start of Wiki template rendering (#75 ) * Refactor entity code This commit is an amalgamation of many changes of varying size (that should have been multiple commits, sorry): - `WikiNodeExt` for convenience methods on `Node`s - Make `Link` offsets byte-oriented rather than char-oriented - Make `Span::add_link` compute the link offsets internally - Add `Span::add_node` shared by the `From<[Node]>` and `EntityBuilder` methods - Refactor `EntityIterator` by using the `Deref` impl on `quick_xml::Event` - Move `EntityBuilder` extraction methods out to take a shared `&[Node]` instead of re-parsing each. - These extraction methods have been refactored to use iterators more heavily and wild-cards in matching - Add an, early-stage, `render_template` for rendering certain Wiki specific templates such as pronunciations - Refactor `maybe_prettify_entity_date` and `entity_link_to_html` - Add expect testing to `EntityBuilder` with a debug Wiki renderer * Use `quick_xml` `name()` in `EntityIterator` Turns out the `Deref` impl on `Event` gives all bytes and not just the name of the event. Instead we now extract the name so we can use it in the match arms. * Handle `respell` templates * Handle `Node::ParagraphBreak` during entity rendering When a `ParagraphBreak` is encountered a `\n` is inserted. This may or may not be appropriate for the place it's rendered. In HTML I think this will result in a single space, which would be good, but I haven't tested just yet. * Handle potentially empty `Node::Template` name * Document and move `render_template` to mod scope * Document `check_abstract` * Use a snippet structure for entity rendering instead of HTML This is an in-progress commit, that currently has bincode issues * Build 'DisplayedEntity' on the api server. This avoids the issues we have with tagged enums in bincode, while wanting camelCase naming for the public API. 
It also has the added side-benefit of keeping the internal API more clean as all 'Displayed' now gets constructed right before they hit the public facing endpoints. Fix entity info spacing between links * Change entity snippet links to underline on hover instead of color * Include fr-IPA wiki template --------- Co-authored-by: Mikkel Denker <mikkel@trystract.com>	2023-09-14 11:07:35 +00:00
Oliver Bøving	253bd252b1	Add multiple themes (#97 ) * Add multiple themes * Remove some unused tailwind colors * Theming for explore "Show more" chevron * Make `Select` component generic * Run format * Optimize frontpage and header logo * Unnest styling from `.code-sample` * Add very basic fallback error page * Refactor Select component It now uses an options array to better track the real values of the options instead of relying on stringed values. The new API is both more ergonomic and more correct! Closes #100 * Handle `undefined` in `writableLocalStorage` better When the inserted value is `undefined`, remove the value rather than inserting `undefined`, which caused subsequent `JSON.parse` to crash * Tweak the light themes * Move theme select into settings	2023-09-13 10:23:06 +00:00
Oliver Bøving	666cc00f5d	Factor out `TextSnippet` rendering to a component (#99 )	2023-09-12 17:11:30 +00:00
Oliver Bøving	7ad41aef22	Don't animate search results when the query changes (#98 ) Only animate when SR changes, such as when a site is blocked	2023-09-12 17:07:41 +00:00
Oliver Bøving	5bd6dc74b2	Add dark background to mobile drop down in header (#96 ) Closes #93	2023-09-11 13:13:50 +00:00

277 commits