25 commits

Author SHA1 Message Date
Daoud Clarke
28b326aedf Fix broken JS 2023-11-07 18:59:38 +00:00
Daoud Clarke
8293a7afa4 Update query string 2023-11-05 21:45:13 +00:00
Daoud Clarke
36ec3ae4e5 Add database config 2023-10-26 17:32:46 +01:00
Daoud Clarke
bd017079d5 Add login using allauth 2023-10-24 10:32:06 +01:00
Daoud Clarke
1227ae33c8 Run poetry lock 2023-10-10 20:21:37 +01:00
Daoud Clarke
a55a027107 Store stats in redis 2023-09-29 13:37:54 +01:00
Daoud Clarke
019095a4c1 Exclude blacklisted domains 2023-09-22 21:53:53 +01:00
Daoud Clarke
8d64af4f1b Keep track of curated documents 2023-04-30 18:25:48 +01:00
Rishabh Singh Ahluwalia
30aff3b920 Add pytest, unit tests for completer,gh actions ci 2023-02-22 21:37:10 -08:00
Daoud Clarke
d400950689 Add script to process historical data 2022-06-18 15:31:35 +01:00
Daoud Clarke
a003914e91 Fix boto3 dependency 2022-06-17 22:14:55 +01:00
Daoud Clarke
e2eb405083 Combine crawler and search servers 2022-06-16 22:49:41 +01:00
Daoud Clarke
aaca8b2b6e Record historical batches via the API 2022-06-05 09:15:04 +01:00
Daoud Clarke
af6a28fac3 Implement learning to rank feature extraction and thresholding 2022-03-20 22:01:45 +00:00
Daoud Clarke
e6273c7f76 WIP: include metadata in index - using struct approach 2022-02-18 22:12:22 +00:00
Daoud Clarke
7d829bc319 Use python 3.10; complete terms 2022-01-30 23:24:00 +00:00
nitred
a72a08a7d9 added config and binary/entrypoint for mwmbl.tinysearchengine
- using pydantic to validate the config
- added a default bootstrap config at config/tinysearchengine.yaml
- refactored app.py to include parsing CLI argument using argparse
- refactored app.py to use fewer global variables
- added "mwmbl-tinysearchengine" binary/entrypoint in pyproject.toml
- updated Dockerfile to work with these changes and added comments to it
2021-12-29 15:26:33 +01:00
nitred
c02c052281 Fixes #12, Added dependencies for indexer as extra or extra_requires
- dependencies for indexer can be installed using "pip install .[indexer]" or "poetry install -E indexer"
2021-12-27 15:46:24 +01:00
Daoud Clarke
9c65bf3c8f WIP: implement docker image. TODO: copy index and set the correct index path using env var 2021-12-22 23:21:23 +00:00
Daoud Clarke
23eb341832 Add search page 2021-12-14 22:01:59 +00:00
Daoud Clarke
2844c1df75 Index common crawl data 2021-12-13 11:23:01 +00:00
Daoud Clarke
65b366d30d Add spacy 2021-12-12 20:58:44 +00:00
Daoud Clarke
c46257c6d1 Use our own filesystem-based queue 2021-12-11 16:57:17 +00:00
Daoud Clarke
14817d7657 Optimise imports 2021-12-05 20:38:05 +00:00
Daoud Clarke
312f32bf61 Add common crawl extract script and dependency management with poetry 2021-12-05 20:31:49 +00:00