some instructions
This commit is contained in:
parent
2b46e8899f
commit
5eb94a6834
4 changed files with 58 additions and 1 deletions
20
LICENCE
Normal file
@@ -0,0 +1,20 @@
+Copyright 2020 Owen Bell
+
+Permission is hereby granted, free of charge, to any person obtaining a
+copy of this software and associated documentation files (the "Software"),
+to deal in the Software without restriction, including without limitation
+the rights to use, copy, modify, merge, publish, distribute, sublicense,
+and/or sell copies of the Software, and to permit persons to whom the
+Software is furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included
+in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+OTHER DEALINGS IN THE SOFTWARE.
+
37
README.md
Normal file
@@ -0,0 +1,37 @@
+# searpl
+
+searpl is a small php search engine with the following features:
+
+- [x] robots.txt compliant
+- [x] sqlite, so there's no need to run some fancy database daemon
+- [x] javascript-free
+- [ ] it uses a cloudflare cdn for the search button icon,
+but you can block it without impacting much.
+- [x] read-only database, nothing is written except with the shell
+
+
+
+## licensing
+searpl is licenced under an MIT licence, see [LICENCE](LICENCE)
+for more information
+
+## setup
+this guide assumes you have shell access and are comfortable
+using command line tools like git.
+
+- make sure you have php, php-pdo, wget, sqlite3 and git installed
+- go into your `htdocs`, `public_html` or whatever and git clone
+this repo
+- `touch db.sqlite` to create the database
+- copy the contents of `create.sql` and paste it into the prompt
+on `sqlite3 db.sqlite` to create the table
+
+## crawling
+to crawl a site, do `./urls.sh https://example.com`
+
+to recursively crawl, change the recursion limit with -l
+
+```
+./urls.sh -l5 https://example.com
+```
+
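The setup steps in the README above (`touch db.sqlite`, then pasting `create.sql` into the `sqlite3` prompt) could also be done in one non-interactive command, since the `sqlite3` shell reads SQL from standard input and creates the database file if it is missing. The sketch below is only an illustration: the real `create.sql` is not part of this commit, so a stand-in schema with a hypothetical `pages` table is written to a temporary directory.

```shell
#!/bin/sh
# Sketch only: the real create.sql is not shown in this commit, so a
# stand-in schema (hypothetical table "pages") is used in a temp directory.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

cat > create.sql <<'EOF'
CREATE TABLE pages (url TEXT PRIMARY KEY, title TEXT);
EOF

# sqlite3 creates db.sqlite if it does not exist and runs the SQL read
# from stdin, covering both the `touch` step and the copy-paste step.
sqlite3 db.sqlite < create.sql

# Ask the database which tables exist to confirm the schema was applied.
tables=$(sqlite3 db.sqlite "SELECT name FROM sqlite_master WHERE type='table';")
echo "$tables"
```

In a real checkout the equivalent would be running `sqlite3 db.sqlite < create.sql` from the cloned directory, assuming `create.sql` contains only plain SQL statements.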
2
urls.sh
@@ -6,4 +6,4 @@ grep '^--' wg | awk '{ print $3 }' \

 sleep 10

-php crawl.php $(cat ur)
+php crawl.php $(cat ur | shuf)
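To illustrate what the `urls.sh` change above does: piping the URL list through `shuf` hands `crawl.php` the same URLs in a random order instead of the order in which they were collected. The sketch below assumes GNU coreutils `shuf` is available; the file name `ur` comes from `urls.sh`, but the URLs in it are made up for the example.

```shell
#!/bin/sh
# Sketch: the effect of `shuf` on the URL list passed to crawl.php.
# The file name `ur` is taken from urls.sh; its contents here are made up.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

printf '%s\n' \
  'https://example.com/' \
  'https://example.com/a' \
  'https://example.com/b' > ur

# shuf prints the same lines in random order; passing the file name
# directly is equivalent to `cat ur | shuf` without the extra process.
shuffled=$(shuf ur)

# The set of URLs is unchanged, only their order may differ.
before=$(sort ur)
after=$(printf '%s\n' "$shuffled" | sort)
[ "$before" = "$after" ] && echo "same URLs, shuffled order"
```

Because the command substitution in `urls.sh` is unquoted, each whitespace-separated word becomes a separate argument to `crawl.php`. That works for URLs without spaces, which is the assumption made here.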