Daoud Clarke
|
918eaa8709
|
Rename django app to mwmbl
|
2023-10-10 13:51:06 +01:00 |
|
Daoud Clarke
|
a1d6fd8bb1
|
Start background processes
|
2023-10-08 21:20:32 +01:00 |
|
Daoud Clarke
|
b6fd27352b
|
Add crawler router
|
2023-10-08 14:13:38 +01:00 |
|
Daoud Clarke
|
ed64ca6c91
|
Merge branch 'main' into django-rewrite
|
2023-10-07 19:19:34 +01:00 |
|
Daoud Clarke
|
41061a695b
|
Add tests
|
2023-10-04 20:19:42 +01:00 |
|
Daoud Clarke
|
593c71f689
|
Exclude domains by keyword
|
2023-10-04 19:51:33 +01:00 |
|
Daoud Clarke
|
988f3fd2a9
|
Add more stats
|
2023-10-02 22:19:02 +01:00 |
|
Daoud Clarke
|
7c3aea5ca0
|
Temporarily just select some URLs at random for initialization
|
2023-09-29 22:32:31 +01:00 |
|
Daoud Clarke
|
ab527c4b58
|
Use stats manager from redis URL
|
2023-09-29 21:48:36 +01:00 |
|
Daoud Clarke
|
0d795b7c64
|
Fix bugs with date method
|
2023-09-29 21:27:32 +01:00 |
|
Daoud Clarke
|
e1bf423e69
|
Get stats
|
2023-09-29 13:58:26 +01:00 |
|
Daoud Clarke
|
a55a027107
|
Store stats in redis
|
2023-09-29 13:37:54 +01:00 |
|
Daoud Clarke
|
db658daa88
|
Store stats in redis
|
2023-09-28 17:48:29 +01:00 |
|
Daoud Clarke
|
86a6524f0a
|
WIP add search API to Django
|
2023-09-24 08:09:18 +01:00 |
|
Daoud Clarke
|
bec00cdab5
|
Exclude additional domain
|
2023-09-22 23:06:04 +01:00 |
|
Daoud Clarke
|
7e054d0854
|
Better blacklist
|
2023-09-22 23:04:37 +01:00 |
|
Daoud Clarke
|
019095a4c1
|
Exclude blacklisted domains
|
2023-09-22 21:53:53 +01:00 |
|
Daoud Clarke
|
18dc760a34
|
Temp disable CORS
|
2023-05-20 23:23:43 +01:00 |
|
Daoud Clarke
|
01bf4c21df
|
Temporarily disable lemmy as connection is refused
|
2023-05-20 22:26:33 +01:00 |
|
Daoud Clarke
|
b5b37629ce
|
Clean unicode when formatting result
|
2023-05-20 22:11:51 +01:00 |
|
Daoud Clarke
|
dec7c4853d
|
Whitespace fix
|
2023-05-20 21:52:33 +01:00 |
|
Daoud Clarke
|
3e08c6e804
|
Check response status; provide an answer when registering
|
2023-05-20 21:51:57 +01:00 |
|
Daoud Clarke
|
8d64af4f1b
|
Keep track of curated documents
|
2023-04-30 18:25:48 +01:00 |
|
Daoud Clarke
|
f0592f99df
|
Require a curated boolean flag
|
2023-04-13 06:27:51 +01:00 |
|
Daoud Clarke
|
00b5438492
|
Track curated items in the index
|
2023-04-09 06:26:23 +01:00 |
|
Daoud Clarke
|
a87d3d6def
|
Store curated pages in the index
|
2023-04-09 05:31:23 +01:00 |
|
Daoud Clarke
|
61cdd4dd71
|
Merge branch 'main' into user-registration
|
2023-04-01 07:17:29 +01:00 |
|
Daoud Clarke
|
3e1f5da28e
|
Off by one error with page size
|
2023-04-01 06:40:03 +01:00 |
|
Daoud Clarke
|
91269d5100
|
Handle a bad batch
|
2023-04-01 06:35:44 +01:00 |
|
Rishabh Singh Ahluwalia
|
e9dfd40ecb
|
Merge pull request #98 from mwmbl/rishabh-fix-trim-data
Fix trimming page size logic while adding to a page
|
2023-03-28 08:18:53 -07:00 |
|
Rishabh Singh Ahluwalia
|
f232badd67
|
fix comma formatting
|
2023-03-27 22:18:10 -07:00 |
|
Rishabh Singh Ahluwalia
|
8e197a09f9
|
Fix trimming page size logic while adding to a page
|
2023-03-26 10:04:05 -07:00 |
|
Daoud Clarke
|
23688bd3ad
|
Merge branch 'master' into user-registration
|
2023-03-18 22:37:45 +00:00 |
|
Daoud Clarke
|
e5c08e0d24
|
Fix bug with other URLs
|
2023-02-25 16:48:59 +00:00 |
|
Daoud Clarke
|
a24156ce5c
|
Initialize URLs by processing them like all other URLs to avoid bias
|
2023-02-25 13:45:03 +00:00 |
|
Daoud Clarke
|
6bb8bdf0c2
|
Initialize with new URLs
|
2023-02-25 10:48:22 +00:00 |
|
Daoud Clarke
|
5c94dfa669
|
Shuffle URLs before batching
|
2023-02-25 10:35:10 +00:00 |
|
Daoud Clarke
|
6ff62fb119
|
Ensure URLs in queue are unique
|
2023-02-25 10:34:09 +00:00 |
|
Daoud Clarke
|
c36e1dffcb
|
Remove picolisp as a top domain since there are duplicate URLs
|
2023-02-25 09:56:26 +00:00 |
|
Daoud Clarke
|
362f9bfa9e
|
Write page to the correct location (metadata size offset bug fix)
|
2023-02-24 21:46:18 +00:00 |
|
Daoud Clarke
|
bc6be8b6d5
|
Merge branch 'master' into update-urls-queue-quickly
|
2023-02-24 21:37:54 +00:00 |
|
Daoud Clarke
|
a03b76e5cc
|
Fix broken test
|
2023-02-24 21:37:32 +00:00 |
|
Daoud Clarke
|
c97d946fcf
|
Go back to processing 10,000 batches at a time
|
2023-02-24 21:29:42 +00:00 |
|
Rishabh Singh Ahluwalia
|
38a5dbbf3c
|
Merge pull request #94 from mwmbl/rishabh-port-configuration
Allow configuration of port
|
2023-02-23 07:31:07 -08:00 |
|
Rishabh Singh Ahluwalia
|
30aff3b920
|
Add pytest, unit tests for completer, GH Actions CI
|
2023-02-22 21:37:10 -08:00 |
|
Rishabh Singh Ahluwalia
|
842aec19e2
|
Add port to args
|
2023-02-22 19:59:42 -08:00 |
|
Daoud Clarke
|
e890e56661
|
Offset by metadata size manually to increase compatibility
|
2023-02-05 15:49:09 +00:00 |
|
Daoud Clarke
|
5783cee6b7
|
Fix bugs
|
2023-01-24 22:52:58 +00:00 |
|
Daoud Clarke
|
77e39b4a89
|
Optimise URL update
|
2023-01-22 20:28:18 +00:00 |
|
Daoud Clarke
|
66700f8a3e
|
Speed up domain parsing
|
2023-01-20 20:53:50 +00:00 |
|