@@ -273,8 +273,11 @@ You may want to run this on startup, easiest way to set that is with a cron job
<br>
<br>
<h3>Start the Crawler</h3>
-It is best to run the crawler in a screen session so that you can monitor its output. You can have more than one crawler running as long as you keep them in separate directories, include a symlink to the same robots folder, and also set the correct parameters on each.
-To view the parameters, type './cr -h'. Without any parameters set, you can only run one crawler (which is probably all you need anyway). If necessary, you can change the database connection from 'localhost' to a different IP from inside cr.c, then rebuild.
+It is best to run the crawler in a screen session so that you can monitor its output. You can run more than one crawler as long as you keep them in separate directories, symlink each to the same robots folder, and set the <b>correct parameters</b> on each.
+To view the parameters, type './cr -h'. Without any parameters set, you can only run one crawler (which might be all you need anyway). If necessary, you can change the database connection from 'localhost' to a different IP inside cr.c, then rebuild.
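+<br>
+<br>
+As a rough sketch, a second crawler directory might be set up as follows. The directory and session names here are illustrative, 'robots' is assumed to be the name of the shared robots folder, and each instance still needs whatever parameters './cr -h' indicates:
+<pre>
+# second crawler in its own directory, sharing one robots folder
+mkdir ../crawler2
+cp cr ../crawler2/
+ln -s "$(pwd)/robots" ../crawler2/robots
+
+# run each instance in its own screen session to monitor its output
+screen -S crawler1        # then start ./cr with its parameters
+</pre>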
+<br>
+<br>
+If you are using more than one crawler, update the '$num_crawlers' variable inside review.php and graveyard.php (line 73) to match the number of crawlers you are running.
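+<br>
+<br>
+If the assignment keeps its default form (an assumption; verify line 73 in both files before running this), you could update both files in one step:
+<pre>
+# assumes line 73 reads something like: $num_crawlers = 1;
+sed -i 's/\$num_crawlers = 1;/\$num_crawlers = 2;/' review.php graveyard.php
+</pre>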
<br>
<br>
Note that you may need to change the crawler's user-agent if you have issues indexing some websites. Pages that fail to index are noted in abandoned.txt.
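+<br>
+<br>
+For example (the search pattern is a guess at how cr.c names the string; edit it, then rebuild as described above):
+<pre>
+# find where the user-agent string is set in the crawler source
+grep -n -i "user-agent" cr.c
+
+# review the pages that failed to index
+cat abandoned.txt
+</pre>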