0ct0pu5/search-engine-stract

Author	SHA1	Message	Date
Mikkel Denker	84f56053a1	rename all 'type' to '_type' in api, as 'type' might be reserved in some languages. also optionally return structured data from api	2024-05-03 17:03:56 +02:00
Mikkel Denker	ec8b9a0786	Use remote webgraph instead of local (#196) * [WIP] remote webgraph client * [WIP] use remote webgraph for backlinks during indexing. still need to properly batch the requests * support batch requests in sonic * [WIP] use remote webgraph in explore and make sure ranking pipeline always sets updated score * use remote webgraph for inbound similarity * return correct type from explore api	2024-05-01 09:04:04 +02:00
Mikkel Denker	a6598b8169	add nlnet funding	2024-04-26 09:17:03 +02:00
Mikkel Denker	acf9d11d1d	hide optics selector when js is disabled	2024-04-22 19:44:16 +02:00
Mikkel Denker	4951b72417	add truncated body in api	2024-04-18 15:21:28 +02:00
Wesley Appler	cc612c5b8d	[WIP] Implemented `keybind` module to handle keyboard shortcuts (#186) * Implemented `keybind` module to handle keyboard shortcuts * Removal of direct DOM querying and the addition of searchbar keybindings * Remove generics from 'keybind' It's always used with 'Refs' as context, so there is no need to have it generic * Revert 'Searchbar' to use simple keydown match instead of 'Keybind' The functionality didn't work (for instance enter didn't trigger a search). It would require a lot of additional complexity in 'Keybind' to also support the use case from searchbar. It's okay to have some code duplication if this results in a simpler solution that will therefore be more readable and maintainable long term * Remove need to know about keyboard event in keybind callbacks This forces us to not rely on direct manipulation of the event, but instead implement the necessary functionality in helper methods in the different components * forgot to remove a console.log... --------- Co-authored-by: Mikkel Denker <mikkel@stract.com>	2024-04-02 13:27:13 +02:00
Mikkel Denker	110f4cdffd	disallow bots from /search in robots.txt	2024-03-22 11:05:44 +01:00
Mikkel Denker	065063861d	only perform server side search when ssr parameter is set, otherwise search client side. we now redirect clients with js disabled using a <noscript><meta ...</noscript> tag. this setup allows us to show an empty serp before the search is finished if one has enabled js. it also makes it possible to add more customization options later like applying optics directly from the search string. otherwise, the server would have no way of knowing what the optic rules for a particular optic name are.	2024-03-19 12:34:44 +01:00
Mikkel Denker	3ec66c134c	fix CI. remove unused imports	2024-03-13 09:55:21 +01:00
Mikkel Denker	5ce97abf46	Run frontend lint in CI (#180) Adds `npm run lint` to CI and fixes all the previous lint errors.	2024-03-13 09:48:07 +01:00
Wesley Appler	7bca42f068	Created ResultLink to handle the opening of result links in new tabs (#178) * Created ResultLink to handle the opening of result links in new tabs * Removed period * Removed ResultLink from thesaurus & removed noreferrer	2024-03-13 09:18:02 +01:00
Mikkel Denker	c8447c9ef0	make explore work without javascript and add browser features section to privacy statement	2024-03-10 17:24:04 +01:00
Mikkel Denker	26eb164482	refactor snippet into a normal snippet and a rich snippet. a normal snippet should always be returned from the api. an application can then choose to show the rich snippet instead if one is present. this gives more flexibility when building applications on top of stract's api	2024-03-01 14:28:49 +01:00
Mikkel Denker	c9f48ca3b3	gracefully handle summarization errors	2024-03-01 13:26:18 +01:00
Mikkel Denker	7d870b2702	build wasm during 'just setup' and make sure pkg has a package.json file. see https://github.com/rustwasm/wasm-pack/issues/965	2024-02-29 14:05:02 +01:00
Mikkel Denker	9e45aa95fd	make sure sites aren't removed when importing an optic	2024-02-29 12:35:31 +01:00
Mikkel Denker	d9b7328a67	like/dislike/block button tooltips	2024-02-28 20:12:17 +01:00
Wesley Appler	25c0344578	[WIP] Implement the importing of optics (#167) * Initial implementation of importing sites from an optic * Removed unused import * Updated button text * Implemented client-side WASM to allow for parsing of imported .optic files * Removed unneeded deps & updated `CONTRIBUTING.md` to reflect wasm-pack needs * CI updates * Added vite-plugin-wasm-pack to ensure wasm modules get copied over * CI fix >:( * More CI attempts * agony - CSP fix & further wasm-pack fixes * CSP updates * Package update to prevent an unnecessary build of wasm * reduce bloat in ci build log from wasm * fix another non-deterministically failing test * only install wasm-pack as part of setup steps in CONTRIBUTING.md ./scripts/ci/check seems to fail if it tries to install wasm-pack while it is already installed (at least on my machine). as it is already added as a step in CONTRIBUTING.md we can assume it has been installed on the system * add vite plugin to ensure changes to 'crates/client-wasm' get reflected in the frontend. adapted from https://github.com/StractOrg/stract/pull/109 * run 'npm run format' * propagate errors from wasm crate	2024-02-28 17:01:32 +01:00
Mikkel Denker	6b9d514a5b	temporarily disable frontend type check in CI	2024-02-28 11:34:19 +01:00
Mikkel Denker	d30cb51e3d	Update nodejs to v20.10 in CI (#166) * update nodejs to v20.10 in github action * loosen restriction on npm version frontend should work as long as we have correct nodejs version. don't think the npm version is necessary	2024-02-22 19:07:41 +01:00
Wesley Appler	1260ba0969	Add node versioning to `package.json` (#165) * Updated package.json with node versions & added nvmrc * Small edit to update git email * Small edit to update git email --------- Co-authored-by: Wes Appler <wes@lamemakes>	2024-02-22 18:40:26 +01:00
Oliver Bøving	f5384a4537	Run prettier and fix some frontend lint errors (#161) * Run `npm run format` * Fix some of the eslint errors	2024-02-19 10:05:09 +01:00
Abdurrahman Rajab	20211d8e25	fix: update scrolls to auto #159 (#160)	2024-02-19 09:29:28 +01:00
Oliver Bøving	2d8973bcf7	Add basic CI (#156) * Add basic CI * Add liburing installation step to CI workflow * Run `npm install` as part of ci/check * Add `@types/node` package * Add `submodules: 'recursive'` to CI * Skip test if test data is not available * Install `cargo-about` in CI	2024-02-17 20:09:58 +01:00
Mikkel Denker	3fa73e42e4	dynamically import highlight-js. makes sure we don't send the somewhat big library to the frontend unless we actually need to render code.	2024-02-12 14:30:01 +01:00
Mikkel Denker	aa4e59cb6e	Fix CORS issues when adding an optic. When adding an optic, we first make sure that we can actually fetch the optic. This check was performed client-side before, which would cause some CORS errors if it was against the CORS policy of the server hosting the optic. This commit introduces a simple endpoint on our frontend server, so the request to the optic server now doesn't come directly from the client.	2024-02-11 13:06:53 +01:00
Mikkel Denker	fd33eb4b66	fixed bug where search suggestions could not be closed in safari	2024-02-05 15:27:05 +01:00
Mikkel Denker	f61f1f6b0f	fix bug where query suggestions couldn't be selected in safari	2024-02-05 14:57:26 +01:00
Mikkel Denker	099282cefa	don't need to send all ranking signals to frontend for discussions widget. sending final score is enough	2024-02-03 16:48:43 +01:00
Mikkel Denker	ea3b7a4099	implement some layers in ggml: linear, embedding and multihead attention	2024-01-31 17:51:02 +01:00
Crispy	ecdc4f89cf	fix outdated link in settings/privacy (#123)	2024-01-30 13:33:28 +01:00
Mikkel Denker	e8732d7877	annotate summary box with aria-busy to indicate that screen readers might want to wait until summary is done generating	2024-01-29 15:46:11 +01:00
Mikkel Denker	c34849cae9	move spellcheck into separate api endpoint and only correct non-special terms. we don't want to correct "site:...", "inurl:..." etc. parts of the query.	2024-01-29 14:07:34 +01:00
Mikkel Denker	c91cf3d3d3	Modal with long domains moved “do you like results from ...” too far to the right	2024-01-28 15:01:24 +01:00
Mikkel Denker	a4a3a8d1ed	don't show discussions widget if there is a user applied optic	2024-01-28 13:52:55 +01:00
Mikkel Denker	dca9f039df	consistent discussions drop down arrow size	2024-01-28 13:42:38 +01:00
Mikkel Denker	cea9884395	make sure each result uses full width of result div	2024-01-28 13:31:49 +01:00
Mikkel Denker	14e5466fb8	Move widget, discussions etc out from api searcher and into separate api endpoints. This creates a better separation between the frontend and backend, and avoids e.g. 'fetch_discussions' abstraction leak. It also allows for these endpoints to be used in different contexts separately without having to perform a full search.	2024-01-23 12:21:55 +01:00
Mikkel Denker	eab46122cb	[serp] if there aren't many results for a search on mobile, the results would be shown far down on the page. this fixes the css-grid so the first result is placed at the top	2024-01-19 13:55:54 +01:00
Mikkel Denker	fbc01ad865	summarization using mistral and 'chain-of-density' approach. the summarization becomes much better if we allow the model to first generate a candidate summarization and then improve on it. doing the improvement step just once seems to significantly improve the summary. we also now use an llm (mistral 7b) for the summarisations, as we can then use the same model for multiple tasks and serve it using gpus, thus significantly decreasing the latency.	2024-01-19 11:08:17 +01:00
Mikkel Denker	6010375425	add a 'fetchSidebar' api parameter that defaults to false. this will speed up api searches that do not need sidebar response	2024-01-15 11:59:34 +01:00
Mikkel Denker	ee1db747f6	[ranking] dampen scores of ngram fields if a larger ngram already matched for the document. for instance, if a document matches on 'example text' in the title, it should count 'example' and 'text' bm25 scores less for that particular document. this allows us to have better control over the scoring and not have text signals dominate the scores when a trigram matches a document	2024-01-15 10:53:38 +01:00
Mikkel Denker	45bec0f942	change paywall icon to text and add settings page to choose when results should be marked	2024-01-02 15:45:09 +01:00
Mikkel Denker	e71570e7f7	minor frontend improvements mobile * [ios] discussion widget had two icons as <summary> list-none was not respected on safari. * [ios] added a viewport width to remove horizontal scrolling. * [ios] increased font size of searchbar to prevent ios from zooming when searchbar gets selected. * 'found x results' was ugly on mobile. now only shown on desktop * decreased font size of optic and language selector to not remove attention from results. hopefully they are still noticed. * added og:title and og:image to improve experience when stract is linked from somewhere.	2024-01-02 13:15:43 +01:00
Mikkel Denker	9fe747c9c9	Zim parser for entity sidebar (#120) * read raw .zim file format * iterate articles and images from zim file * construct entities for entity index using zimba * parse images from .zim file and store them in the entity index * add whitespace in entity info to separate <li> elements * zimba readme	2024-01-02 10:39:35 +01:00
Mikkel Denker	9e1bb23a96	use static env variables in frontend https://kit.svelte.dev/docs/migrating-to-sveltekit-2#dynamic-environment-variables-cannot-be-used-during-prerendering	2023-12-20 07:37:26 +01:00
Mikkel Denker	6af6085ebb	make sure doubling happens after min in crawl delay. this increases the politeness of the crawler even more, as we now e.g. wait minimum 10 seconds after first 429 response.	2023-12-19 14:29:00 +01:00
Mikkel Denker	92f933e5fe	possible fix to be able to open indexes that don't contain all the necessary fields. this is useful to be able to update the code continuously in production and not having to wait for next indexing (#118)	2023-12-19 13:30:31 +01:00
Oliver Bøving	e01654662c	Upgrade to SvelteKit 2 (#116) * Remove `.svelte-kit/tsconfig.json` This is part of .gitignore but somehow got through. * Migrate to SvelteKit 2 * Update frontend dependencies * fix lost divider and dynamic title on search page due to missing await --------- Co-authored-by: Mikkel Denker <mikkel@trystract.com>	2023-12-18 20:54:37 +01:00
Mikkel Denker	3bb113f87c	style color of select elements so they look good on windows	2023-12-16 17:29:11 +01:00


277 commits