Commit graph

499 commits

Author SHA1 Message Date
Daoud Clarke
c01129cdb9 Merge branch 'master' of github.com:mwmbl/mwmbl 2022-12-27 10:25:41 +00:00
Daoud Clarke
26351a1072 Use the correct storage location in prod 2022-12-27 10:24:48 +00:00
Daoud Clarke
f3f3831a97 Merge pull request #83 from omasanori/spacy-deps-rework
Rework installation of spaCy models for clarity
2022-12-27 10:20:52 +00:00
Masanori Ogino
71187a3938 Rework installation of spaCy models for clarity
- Install the wheel package for compatibility with future pip
- Use `spacy download` for installing model(s)
- Use `spacy validate` for checking model compatibility explicitly

Signed-off-by: Masanori Ogino <167209+omasanori@users.noreply.github.com>
2022-12-27 11:33:52 +09:00
Daoud Clarke
d85067ec09 Remove apt command 2022-12-24 20:20:53 +00:00
Daoud Clarke
1ef60e8d5d Put install in correct place 2022-12-24 20:18:02 +00:00
Daoud Clarke
8e613dd368 Install psql client 2022-12-24 20:13:53 +00:00
Daoud Clarke
80282cfc7a Exclude a domain 2022-12-24 19:59:56 +00:00
Daoud Clarke
8676abbc63 Format fetched url 2022-12-24 19:59:15 +00:00
Daoud Clarke
57295846cb Update README.md 2022-12-21 21:49:56 +00:00
Daoud Clarke
0a4e1e4aee Add endpoint to fetch a URL and return title and extract 2022-12-21 21:15:34 +00:00
Daoud Clarke
c7571120cc Implement validation 2022-12-21 15:32:30 +00:00
Daoud Clarke
061462460b Separate out the curation to make it easier to store in a comment 2022-12-20 19:11:01 +00:00
Daoud Clarke
6cf27fa47f Fix serialisation issue 2022-12-19 23:19:32 +00:00
Daoud Clarke
b559a50506 Require the whole result 2022-12-19 22:18:28 +00:00
Daoud Clarke
5eab543f3b Merge branch 'master' into user-registration 2022-12-19 21:53:11 +00:00
Daoud Clarke
a88a1a3e95 Rename some parameters; return curation ID 2022-12-19 21:51:26 +00:00
Daoud Clarke
efc8e8e383 Merge pull request #78 from mwmbl/make-dev-easier
Make it easier to run mwmbl locally
2022-12-19 21:50:54 +00:00
Daoud Clarke
31c27daca4 Add curations 2022-12-11 18:48:25 +00:00
Daoud Clarke
f89e1d6043 Create a post when beginning curation 2022-12-10 23:45:10 +00:00
Daoud Clarke
eadb7f3e28 Follow a begin curate/update curation workflow 2022-12-10 22:49:06 +00:00
Daoud Clarke
f8ab6092b0 Suggest using dokku instead of docker directly 2022-12-08 22:33:58 +00:00
Daoud Clarke
8aa51e548b Allow login 2022-12-08 22:23:48 +00:00
Daoud Clarke
cf6ceedfd5 Actually allow registration 2022-12-07 22:56:20 +00:00
Daoud Clarke
a50bc28436 Make it easier to rum mwmbl locally 2022-12-07 20:01:31 +00:00
Daoud Clarke
d8d7149f4a Start to implement user registration using Lemmy as a back end 2022-12-06 22:36:38 +00:00
Daoud Clarke
c0f89ba6c3 Update matrix badge 2022-12-05 18:47:26 +00:00
Daoud Clarke
dd4dd8a752 Exclude an annoying web site 2022-12-02 21:29:06 +00:00
Daoud Clarke
40f9eade9a Update index name 2022-08-27 09:38:39 +01:00
Daoud Clarke
b6183e00ea Merge pull request #74 from mwmbl/evaluate-indexing
Evaluate indexing
2022-08-27 09:37:22 +01:00
Daoud Clarke
cf253ae524 Split out URL updating from indexing 2022-08-26 22:20:35 +01:00
Daoud Clarke
f4fb9f831a Use terms and bigrams from the beginning of the string only 2022-08-26 17:20:11 +01:00
Daoud Clarke
619b6c3a93 Don't remove stopwords 2022-08-24 21:08:33 +01:00
Daoud Clarke
578b705609 Don't replace full stops and commas 2022-08-23 22:06:43 +01:00
Daoud Clarke
4779371cf3 Use a custom tokenizer 2022-08-23 21:57:38 +01:00
Daoud Clarke
b1eea2457f Script to index local batch for evaluation 2022-08-22 22:47:42 +01:00
Daoud Clarke
480be85cfd Fix bug in completions with duplicated terms 2022-08-14 22:03:50 +01:00
Daoud Clarke
f7660bcd27 Merge pull request #73 from mwmbl/completion
Completion
2022-08-13 23:55:22 +01:00
Daoud Clarke
627f82d19f Suggest searching Google if there are no search results 2022-08-13 23:54:57 +01:00
Daoud Clarke
f1c77d1389 Search google if there are no results 2022-08-13 23:47:48 +01:00
Daoud Clarke
fe5eff7b64 Exclude web.archive.org as we're only crawling that right now 2022-08-13 10:52:31 +01:00
Daoud Clarke
00705703f3 Require matching at least half the terms 2022-08-11 23:27:30 +01:00
Daoud Clarke
eda7870788 Restrict to https and strip the prefix and / on the end 2022-08-11 22:23:14 +01:00
Daoud Clarke
23e47e963b Simplify completions 2022-08-11 17:34:52 +01:00
Daoud Clarke
c6773b46c4 Merge pull request #72 from mwmbl/improve-ranking-with-multi-term-search
Improve ranking with multi term search
2022-08-10 21:43:51 +01:00
Daoud Clarke
74107667b4 Improve printing of search results in script 2022-08-10 21:43:13 +01:00
Daoud Clarke
3bcb7f42c1 Use heuristic ranker 2022-08-09 22:56:12 +01:00
Daoud Clarke
c1b9e70743 Add new LTR model 2022-08-09 22:47:59 +01:00
Daoud Clarke
57476ed2c8 Tweak features 2022-08-09 22:23:36 +01:00
Daoud Clarke
c99e813398 Get best-performing configuration 2022-08-09 20:56:15 +01:00