Update guide.html
parent b4eb3d8926
commit 61139dd566
1 changed file with 2 additions and 2 deletions
@@ -242,7 +242,7 @@ You may want to run this on startup, easiest way to set that is with a cron job
<br>
<h3>Start the Crawler</h3>
It is best to run the crawler in a screen session so that you can monitor its output. You can have more than one crawler running as long as you keep them in separate directories, include a symlink to the same robots folder, and also set the correct parameters on each.
-To view the parameters, type './cr -h'. Without any parameters set, you can only run one crawler (which is probably all you need anyway). If necessary, you can change the connection from 'localhost' to a different IP from inside cr.c, then rebuild.
+To view the parameters, type './cr -h'. Without any parameters set, you can only run one crawler (which is probably all you need anyway). If necessary, you can change the database connection from 'localhost' to a different IP from inside cr.c, then rebuild.
<br>
<br>
Note that you may need to change the crawler's user-agent if you have issues indexing some websites. Pages that fail to index are noted inside of abandoned.txt.
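A minimal sketch of the multi-crawler setup the hunk above describes; the directory and session names are assumptions, and the parameters each instance actually needs should be taken from './cr -h':

    # Hypothetical second-instance layout; names here are illustrative only.
    mkdir ../crawler2
    cp cr ../crawler2/
    ln -s "$(pwd)/robots" ../crawler2/robots   # both instances share one robots folder
    screen -S crawler1 -dm ./cr                # first crawler, default parameters
    cd ../crawler2
    screen -S crawler2 -dm ./cr                # add the multi-crawler parameters listed by './cr -h'
    screen -r crawler1                         # reattach a session to monitor its output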
@@ -413,7 +413,7 @@ If you need to stop the web crawler in a situation where it was accidently queue
<hr>
<h2><a name="scale">Scaling the Search Engine</a></h2>
<br>
-You can help ensure sub-second search queries as your index grows by building MySQL replica servers on a local netowork close to eachother, run the core application AND replication tracker (rt) on one or more replica servers and point your reverse proxy to use it.
+You can help ensure sub-second search queries as your index grows by building MySQL replica servers on a local network close to eachother, run the core application AND replication tracker (rt) on one or more replica servers and point your reverse proxy to use it.
Edit the servers.csv file for rt to indicate all available replica servers. If you have a machine with a huge amount of resources and cores, entering multiple duplicate entries to the same sever inside servers.csv (e.g. one for each core) works also.
<br>
<br>
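As a rough sketch of the servers.csv bookkeeping described in the hunk above, assuming rt reads one replica address per line (the real format should be confirmed against rt's own documentation), with a many-core replica listed once per core:

    # Hypothetical servers.csv for rt; the addresses and the one-entry-per-line
    # format are assumptions. 10.4.0.102 is repeated, e.g. once per core on a
    # larger replica, so it receives a correspondingly larger share of queries.
    printf '%s\n' 10.4.0.101 10.4.0.102 10.4.0.102 > servers.csv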
|
||||
|
|
Loading…
Add table
Reference in a new issue