Daoud Clarke
c4e86ce313
Update readme for recent changes
2022-02-04 22:07:09 +00:00
Daoud Clarke
51f2dd2690
Merge branch 'master' of github.com:mwmbl/mwmbl
2022-02-04 21:49:40 +00:00
Daoud Clarke
9f78d19c8c
Merge pull request #41 from ColinEspinas/add-branding
Add branding to readme
2022-02-04 21:28:41 +00:00
ColinEspinas
b2e01d33e8
docs: better title display on readme
2022-02-04 20:53:55 +01:00
Colin Espinas
95c9bcfe3b
Merge branch 'mwmbl:master' into add-branding
2022-02-04 20:51:38 +01:00
ColinEspinas
cd57372a84
docs: added branding to readme and required assets files
2022-02-04 20:50:43 +01:00
Daoud Clarke
6e5e56f99a
New index; more pages
2022-02-04 18:08:23 +00:00
Daoud Clarke
bdf0fd1797
Merge pull request #39 from mwmbl/analyse-links
Analyse links
2022-02-03 19:33:52 +00:00
Daoud Clarke
2fc999b402
Count unique domains instead of links
2022-02-02 20:09:59 +00:00
Daoud Clarke
26e90c6e57
Merge branch 'master' into analyse-links
2022-02-02 19:48:47 +00:00
Daoud Clarke
07d4b36052
Merge pull request #38 from mwmbl/stop-indexing-partial-words
Improve handling of partial words
2022-02-02 19:48:31 +00:00
Daoud Clarke
d77b72d7df
Analyse links to find most popular ones
2022-02-02 19:47:38 +00:00
Daoud Clarke
fe6ace93e6
Improve handling of incomplete words:
- Correctly generate regex for incomplete vs complete words
- Return more than one top word from completer
- Correctly handle no terms
2022-01-31 21:20:59 +00:00
Daoud Clarke
7d829bc319
Use python 3.10; complete terms
2022-01-30 23:24:00 +00:00
Daoud Clarke
3c75dd1a74
WIP: implement term completer
2022-01-30 22:20:28 +00:00
Daoud Clarke
01a21337a9
Don't index partial words
2022-01-30 14:30:02 +00:00
Daoud Clarke
2ef8304919
Remove some debug print statements
2022-01-30 13:16:24 +00:00
Daoud Clarke
66696ad76b
Merge pull request #37 from mwmbl/index-mwmbl-crawl
Index mwmbl crawl
2022-01-30 13:12:06 +00:00
Daoud Clarke
5b89bbf05d
Index Mwmbl crawled data
2022-01-29 08:26:42 +00:00
Daoud Clarke
ef36513f64
Analyse the pages that are crawled most often
2022-01-29 07:06:53 +00:00
Daoud Clarke
70254ae160
Analyse crawled URLs and domains
2022-01-26 18:51:58 +00:00
Daoud Clarke
171fa645d2
Add script to export top domains
2022-01-23 22:04:30 +00:00
Daoud Clarke
908a9cf0b6
Merge pull request #36 from ColinEspinas/remove-old-frontend
Remove old front-end files and routes
2022-01-20 18:06:54 +00:00
ColinEspinas
3481ad372b
Removed old front-end files and routes
2022-01-19 23:33:37 +01:00
Daoud Clarke
a41088ca9a
Add CORS; revert back to previous index as it timed out deploying
2022-01-03 18:31:03 +00:00
Daoud Clarke
25918e42ef
Export URLs to sqlite for evaluation purposes
2022-01-02 20:06:13 +00:00
Daoud Clarke
ae7312c32a
Merge pull request #31 from nitred/fix-python-m-run
Using the app object to start uvicorn, instead of using a reference like "mwmbl.tinysearchengine.app:app"
2021-12-31 22:11:15 +00:00
nitred
fbdb93c86a
Using the app object to start uvicorn, instead of using a reference like "mwmbl.tinysearchengine.app:app"
- fixes the issue when running the server using python -m mwmbl.tinysearchengine.app
When running the server using python -m, uvicorn seems to spawn a new process or interpreter session.
At least it appears that way since already initialized & imported modules and variables appear to be uninitialized.
2021-12-31 02:15:16 +01:00
Daoud Clarke
e6655101ef
Add a component of the HN domain score when ranking
2021-12-30 22:20:10 +00:00
Daoud Clarke
f347fe29ac
Add .gcloudignore file to fix gcloud run deploy
2021-12-30 21:17:18 +00:00
Daoud Clarke
3f74229ae9
Explain pronunciation
2021-12-30 20:35:11 +00:00
Daoud Clarke
02bcef640c
Merge pull request #25 from ColinEspinas/search-debounce
Added debounce on search input
2021-12-29 20:59:29 +00:00
Daoud Clarke
3d7e655ebc
Merge pull request #24 from nitred/config-and-entrypoint
added config and binary/entrypoint for mwmbl.tinysearchengine
2021-12-29 20:54:23 +00:00
ColinEspinas
c636be9089
Added debounce on search input (#8)
2021-12-29 21:03:47 +01:00
nitred
a72a08a7d9
added config and binary/entrypoint for mwmbl.tinysearchengine
- using pydantic to validate the config
- added a default bootstrap config at config/tinysearchengine.yaml
- refactored app.py to include parsing CLI argument using argparse
- refactored app.py to use fewer global variables
- added "mwmbl-tinysearchengine" binary/entrypoint in pyproject.toml
- updated Dockerfile to work with these changes and added comments to it
2021-12-29 15:26:33 +01:00
Daoud Clarke
da8797f5ef
Merge pull request #18 from nitred/mwmbl-package
renamed package to mwmbl
2021-12-29 09:34:05 +00:00
Daoud Clarke
0b7bc90a05
Merge pull request #21 from ArcoMul/add-dev-instructions-to-readme
Add development instructions to README + fix .gitignore
2021-12-29 09:04:53 +00:00
Arco Mul
b6c1630953
Update .gitignore: fix ignoring data folder in root of repository
2021-12-29 09:21:57 +01:00
Arco Mul
d5a612aa47
Update README: add development instructions
2021-12-29 09:21:26 +01:00
nitred
be40a15b27
Merge branch 'master' into mwmbl-package
2021-12-29 00:25:37 +01:00
Daoud Clarke
03ca368b2a
Merge pull request #17 from nitred/python-gitignore
added standard .gitignore template for python from the github/gitignore repo
2021-12-28 21:29:00 +00:00
Daoud Clarke
0baed3780d
Merge pull request #13 from nitred/indexer-dependencies-as-extra
Fixes #12, Added dependencies for indexer as extra or extra_requires
2021-12-28 21:27:56 +00:00
Daoud Clarke
04d7cbdfe3
Merge pull request #11 from ArcoMul/fix-mobile-layout
Make page responsive for mobile devices
2021-12-28 21:26:18 +00:00
nitred
11eedcde84
renamed package to mwmbl
- renamed package to mwmbl in pyproject.toml
- tinysearchengine and indexer modules have been moved into mwmbl package folder
- analyse module has been left as is in the root of the repo
- import statements in tinysearchengine now use mwmbl.tinysearchengine
- import statements in indexer now use mwmbl.indexer or mwmbl.tinysearchengine or relative imports like .paths
- import statements in analyse now use mwmbl.indexer or mwmbl.tinysearchengine
- final CMD in Dockerfile now uses updated path mwmbl.tinysearchengine.app
- fixed a couple of import statement errors in tinysearchengine/indexer.py
2021-12-28 12:35:46 +01:00
nitred
91b357b6e2
added standard .gitignore template for python from the github/gitignore repo
2021-12-28 11:36:01 +01:00
nitred
c02c052281
Fixes #12, Added dependencies for indexer as extra or extra_requires
- dependencies for indexer can be installed using "pip install .[indexer]" or "poetry install -E indexer"
2021-12-27 15:46:24 +01:00
Arco Mul
e773ff68e5
Decrease font-size of url so that the title stands out more
2021-12-27 12:36:10 +01:00
Arco Mul
4e41f68a46
Make page responsive for mobile devices
2021-12-27 12:29:09 +01:00
Daoud Clarke
acb2d19470
Merge pull request #6 from ndren/master
Do not send Referer
2021-12-27 08:57:07 +00:00
Andrei E
389d0abcc1
Do not send Referer
2021-12-26 22:48:03 +00:00