Jun 9 2009, 11:03 AM
Post #1
HR 5 | Group: Active Members | Posts: 379 | Joined: 10-August 06 | From: Dubuque, IA | Member No.: 13,155
I have a new client, and their robots.txt file is full of things to disallow, such as INSTALL.txt, node/add, user/password/, etc., most of which I doubt they really need. They also have a crawl delay of 10. What is the benefit or purpose of this? Their original tech person is gone and they aren't sure why it's there.
Jun 9 2009, 11:45 AM
Post #2
High Rankings Advisor | Group: Admin | Posts: 29,199 | Joined: 21-July 03 | From: Ashland, MA | Member No.: 2
According to Wikipedia, the crawl delay directive is "...set to the number of seconds to wait between successive requests to the same server."
For very large sites that might get a lot of bandwidth eaten up, it might make sense to have that within the robots.txt file.
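To make the directive concrete, here is a minimal sketch that reads a crawl delay with Python's standard `urllib.robotparser` module. The robots.txt content is a hypothetical reconstruction based on the value mentioned in post #1, not the client's actual file. Note that `Crawl-delay` is a non-standard extension, so whether a given crawler honors it varies.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; only the delay of 10 comes from the thread.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Number of seconds a compliant crawler should wait between requests,
# or None if the file sets no delay for this user agent.
delay = parser.crawl_delay("*")
print(delay)
```

A crawler that respects the directive would sleep for `delay` seconds between successive requests to the same server.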
Jun 9 2009, 02:45 PM
Post #3
Convert Me! | Group: Admin | Posts: 17,377 | Joined: 17-August 03 | Member No.: 551
Crawl delay is mainly for cases where your server cannot handle the load when it gets visited by multiple spiders from the same search engine.
For example, when Googlebot does a deep crawl of your site, it may have several spiders hitting several pages within a relatively short period. If this spidering slows the server to a crawl, you'd definitely want to either fix the root cause of the problem or institute a crawl delay.
As Jill mentioned from the wiki article, crawl delay can also be used when the spiders are eating up too much bandwidth, though frankly in that case I'd encourage you to upgrade your hosting package if things are that close. Real users are going to use far more bandwidth and server load than the spiders. So if the spiders' extra bit of usage is causing problems, you're always better off fixing the root cause rather than limiting the spiders' ability to crawl your site properly.
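To put a number on how much a delay can limit crawling, here is a rough back-of-the-envelope sketch. It assumes a single crawler that waits the full delay between requests and ignores response time, so it gives an upper bound rather than a measurement. The function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_requests_per_day(crawl_delay_seconds: int) -> int:
    """Upper bound on daily requests from one crawler that waits
    crawl_delay_seconds between successive requests."""
    return SECONDS_PER_DAY // crawl_delay_seconds


# The delay of 10 is the value reported in post #1.
print(max_requests_per_day(10))  # 8640
```

For a small site that ceiling is unlikely to matter, but on a site with far more pages than that, a compliant crawler would need several days for one full pass.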
Jun 10 2009, 01:08 AM
Post #4
HR 5 | Group: Active Members | Posts: 379 | Joined: 10-August 06 | From: Dubuque, IA | Member No.: 13,155
Can there be too much in a robots.txt file? In addition to crawl delay, this one site has nearly 40 entries. To me that seems like too much, and most of the exclusions probably serve no purpose. Granted, my site isn't an e-commerce site and doesn't require a database, but its robots.txt only has two lines.
Jun 10 2009, 08:00 AM
Post #5
High Rankings Advisor | Group: Admin | Posts: 29,199 | Joined: 21-July 03 | From: Ashland, MA | Member No.: 2
QUOTE: "Can there be too much in a robots.txt file? In addition to crawl delay, this one site has nearly 40 entries."
No. You might be surprised at the purpose the entries serve. There's no sense in allowing the robots to crawl areas that really shouldn't be indexed, as you want them to focus on the areas of importance. It sounds like they're using their robots.txt file exactly as they should.
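One way to see what a list of exclusions actually does is to test a few paths against it. The sketch below uses Python's standard `urllib.robotparser`. The rules are a hypothetical reconstruction of the three entries named in post #1 (with the leading slash that robots.txt paths need), and `example.com`, `ExampleBot`, and `/about` are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules based on the entries mentioned in post #1.
ROBOTS_TXT = """\
User-agent: *
Disallow: /INSTALL.txt
Disallow: /node/add
Disallow: /user/password/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the rules let this agent fetch the given path."""
    return parser.can_fetch(agent, "http://example.com" + path)


# Print each path alongside whether a compliant crawler may fetch it.
for path in ("/INSTALL.txt", "/node/add", "/user/password/", "/about"):
    print(path, is_allowed(path))
```

In this reconstruction the install file, the add-content form, and the password reset page are blocked, while an ordinary content page stays crawlable, which is the "focus on the areas of importance" idea described above.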
Jun 10 2009, 02:19 PM
Post #6
HR 5 | Group: Active Members | Posts: 379 | Joined: 10-August 06 | From: Dubuque, IA | Member No.: 13,155
Thanks, Jill. The robots.txt file is something I need to look into more. My knowledge is pretty basic.