xfnw | e9b6f113d9 | limit filesize of crawls | 2021-10-22 11:43:33 -04:00
xfnw | f23fdaad28 | use sqlite's FTS5 as the ranking algorithm | 2021-10-20 16:41:36 -04:00
xfnw | cbb1173184 | oops remove debug print | 2021-06-24 14:08:08 -04:00
xfnw | 9dcc667bc1 | ignore script and style tags from content, and make newlines into spaces so words are not combined | 2021-06-24 14:02:37 -04:00
xfnw | 826e3c2b7c | delete page before downloading new one, so dead pages do not sit in the database | 2021-01-27 20:33:25 -05:00
xfnw | 3e051c0feb | better crawl.php logging | 2021-01-08 17:35:14 -05:00
xfnw | 459c295488 | better crawl.php logging | 2021-01-08 16:54:16 -05:00
xfnw | 036e3addb2 | re-crawl sites | 2020-12-22 10:13:45 -05:00
xfnw | 8c4421108b | dont track svgs and drop the / from the end of urls so they wont be duplicated | 2020-12-15 09:59:19 -05:00
xfnw | 69de9f49dc | working | 2020-12-14 16:59:16 -05:00