Add files via upload

commit 1daef790c1
parent 5ed3d7df97
1 changed file with 2 additions and 2 deletions
@@ -19,7 +19,7 @@ p { font-size:17px; margin-bottom:0px; margin-top:0px; }
 .url { font-size:15px; color: #3a5a0c; }
 .pin { font-size:14px; COLOR: #2e2e2e;}
 textarea:focus, input:focus{ outline: none;}
-blockquote { width: 700px; }
+blockquote { max-width: 700px; }
 input[type='number'] { width: 80px; }
 pre { width:700px; white-space: pre-wrap; word-wrap: break-word; }
 </style>
@@ -294,7 +294,7 @@ If using more than one crawler, update the variable '$num_crawlers' from inside
 Note that you may need to change the crawler's user-agent (CURLOPT_USERAGENT in cr.c and checkrobots.h) if you have issues indexing some websites. Pages that fail to index are noted inside of abandoned.txt.
 <br>
 <br>
-Make sure the robots folder exists. All robots.txt files are stored in the robots folder. They are downloaded once and then referenced from that folder on future updates. Clear this folder every few weeks to ensure robots.txt files get refreshed from time to time. You can also create custom robots.txt files for specific domains and store them there for the crawler to reference.
+Make sure the robots folder exists, or create one in the same directory as core. All robots.txt files are stored in the robots folder. They are downloaded once and then referenced from that folder on future updates. Clear this folder every few weeks to ensure robots.txt files get refreshed from time to time. You can also create custom robots.txt files for specific domains and store them there for the crawler to reference.
 To disable checking for robots.txt files, comment out the line calling the "checkrobots" function inside of cr.c.
 <br>
 <br>
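
As an aside on the user-agent note in the hunk above: CURLOPT_USERAGENT is a standard libcurl option set through curl_easy_setopt. A minimal sketch of the pattern; the handle name, URL, and user-agent string here are placeholders, not the actual values in cr.c:

    #include <curl/curl.h>

    int main(void) {
        CURL *curl = curl_easy_init();   /* create a libcurl easy handle */
        if (curl) {
            /* Placeholder values; cr.c's real URL and user-agent string differ. */
            curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/");
            curl_easy_setopt(curl, CURLOPT_USERAGENT, "ExampleCrawler/1.0");
            curl_easy_perform(curl);     /* fetch the page with that User-Agent */
            curl_easy_cleanup(curl);
        }
        return 0;
    }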
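
Likewise, the "comment out the checkrobots call" step in the hunk above amounts to disabling one conditional in the crawl loop. A self-contained toy sketch of the idea; the function signature and surrounding logic are assumptions for illustration, not cr.c's actual code:

    #include <stdio.h>

    /* Hypothetical stand-in for the real checkrobots in checkrobots.h:
       returns nonzero when robots.txt permits fetching the page. */
    static int checkrobots(const char *host, const char *path) {
        (void)host; (void)path;
        return 1;
    }

    int main(void) {
        const char *host = "example.com", *path = "/";
        /* To disable robots.txt checking, comment out this if-block. */
        if (!checkrobots(host, path)) {
            printf("skipping http://%s%s per robots.txt\n", host, path);
            return 0;
        }
        printf("crawling http://%s%s\n", host, path);
        return 0;
    }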