I have a new client, and their robots.txt file is full of things to disallow, such as INSTALL.txt, node/add, user/password/, etc., most of which I doubt they really need. They also have a crawl delay of 10. What is the benefit or purpose of this? Their original tech person is gone, and they aren't sure why it's there.
What Is The Point Of Using Crawl-delay In Robots.txt
Started by ScottSalwolke, Jun 09 2009 11:03 AM
5 replies to this topic
#1
Posted 09 June 2009 - 11:03 AM
#2
Posted 09 June 2009 - 11:45 AM
According to Wikipedia, the crawl delay directive is:
QUOTE
...set to the number of seconds to wait between successive requests to the same server.
For very large sites that might get a lot of bandwidth eaten up, it might make sense to have that within the robots.txt file.
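To make the directive concrete, here is a minimal sketch that reads a crawl delay the way a well-behaved crawler might, using Python's standard `urllib.robotparser`. The robots.txt content is hypothetical and only modeled on the file described in this thread (the leading slashes are an assumption), so treat it as an illustration rather than the client's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, modeled on the one described in this thread.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /INSTALL.txt
"""

parser = RobotFileParser()
# parse() accepts the file as a list of lines, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())

# Seconds a compliant crawler should wait between requests to this server.
delay = parser.crawl_delay("*")
print(delay)
```

A crawler that honors the directive would sleep for `delay` seconds between successive requests to the same host.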
#3
Posted 09 June 2009 - 02:45 PM
Crawl delay is mainly meant for cases where your server cannot handle the load when it gets visited by multiple spiders from the same search engine.
For example, when Googlebot does a deep crawl of your site, it may have several spiders hitting several pages in a relatively short time. If this spidering causes the server to slow to a crawl, you'd definitely want to either fix the root cause of the problem or institute a crawl delay.
As Jill mentioned, citing the wiki article, crawl delay can also be used in cases where the spiders are eating up too much bandwidth. Frankly, though, in that case I'd encourage you to upgrade your hosting package if things are that close. Real users are going to use up far more bandwidth and server load than the spiders, so if the spiders' extra bit of usage is causing problems, you're always better off fixing the root cause rather than limiting the spiders' ability to crawl your site properly.
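To put a number on how much a crawl delay limits a spider, here is a rough back-of-the-envelope sketch. It assumes a single crawler that strictly honors the delay and makes one request at a time; the function name is made up for illustration. Note that not every crawler honors `Crawl-delay`; Google's documentation, for one, says Googlebot does not.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_requests_per_day(crawl_delay_seconds: int) -> int:
    """Upper bound on daily requests for one crawler that waits the
    full delay between successive requests (hypothetical helper)."""
    if crawl_delay_seconds <= 0:
        raise ValueError("crawl delay must be a positive number of seconds")
    return SECONDS_PER_DAY // crawl_delay_seconds


# With the delay of 10 mentioned in the first post:
pages_per_day = max_requests_per_day(10)
print(pages_per_day)  # 8640
```

For a small site that ceiling hardly matters, but on a site with hundreds of thousands of pages it could mean a full crawl takes weeks.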
#4
Posted 10 June 2009 - 01:08 AM
Can there be too much in a robots.txt file? In addition to the crawl delay, this one site has nearly 40 entries. To me that seems like too much, and most of the exclusions probably serve no purpose. Granted, my site isn't an e-commerce site and doesn't require a database, but its robots.txt only has two lines.
#5
Posted 10 June 2009 - 08:00 AM
QUOTE
Can there be too much in a robots.txt file? In addition to crawl delay, this one site has nearly 40 entries.
No.
You might be surprised at the purpose the entries serve. There's no sense in allowing the robots to crawl areas that really shouldn't be indexed, as you want them to focus on the areas of importance.
It sounds like they're using their robots.txt file exactly as they should.
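As a rough illustration of what such entries do, the sketch below checks a few paths against `Disallow` rules with Python's standard `urllib.robotparser`. The rules reuse the paths named in the first post; the leading slashes and the `/about-us` page are assumptions added for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules built from the paths mentioned in the first post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /INSTALL.txt
Disallow: /node/add
Disallow: /user/password/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Admin-style URL that has no value in a search index: blocked.
can_crawl_admin = parser.can_fetch("*", "/node/add")

# Ordinary content page (assumed path) matches no rule: allowed.
can_crawl_content = parser.can_fetch("*", "/about-us")

print(can_crawl_admin, can_crawl_content)
```

Each entry simply steers compliant robots away from one path prefix, so a long list is not a problem in itself.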
#6
Posted 10 June 2009 - 02:19 PM
Thanks, Jill. The robots.txt file is something I need to address more. My knowledge is pretty basic.





This topic is locked


