Is Robots.txt Always Obeyed By Spiders?
#1
Posted 17 December 2006 - 11:19 PM
Is it true that spiders do not always obey robots.txt?
If that's true, has anyone done any testing?
Or how can we prove that robots.txt is not always obeyed by spiders / bots?
Thanks
#2
Posted 17 December 2006 - 11:23 PM
#3
Posted 17 December 2006 - 11:32 PM
Would spiders from the above engines ignore the robots.txt file?
Do you know what other engines will ignore robots.txt?
#4
Posted 18 December 2006 - 12:08 AM
Try signing up for Sitemaps. I am pretty sure they have a sitemap checker.
#5
Posted 18 December 2006 - 12:28 AM
--Torka
#6
Posted 18 December 2006 - 04:56 AM
Would spiders from above engines ignore the robots.txt file...
- Google may index URLs that are protected by robots.txt, without reading or indexing the content at those URLs. This leads to the so-called "PIPs" (partially indexed pages).
- Google's AdwordsBot only obeys instructions directed specifically at it. It does not obey "User-Agent: *". This is only an issue if you use AdWords on the domain.
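
To make the second point concrete, here is a minimal sketch of how a well-behaved parser reads a robots.txt file, using Python's standard urllib.robotparser. The file contents, the example.com URLs, the "SomeBot" name and the is_allowed helper are all made up for illustration, and "AdsBot-Google" is assumed to be the user-agent token for the AdWords bot mentioned above. A standard parser applies the "*" group to every bot, so the explicit second group only matters for a bot that skips "*".

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the same rule is repeated in a group that
# names the AdWords bot explicitly (token assumed to be "AdsBot-Google"),
# because a bot that skips the "*" group would otherwise see no rule at all.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: AdsBot-Google
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a standards-following crawler may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("SomeBot", "AdsBot-Google"):
        for url in ("http://www.example.com/private/page.html",
                    "http://www.example.com/public.html"):
            print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

This only shows what the file asks for. Whether a given spider actually honors it is a separate question that the file itself cannot answer.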
#7
Posted 18 December 2006 - 08:33 AM
Perhaps I should have asked: do spiders ignore some of the statements made in robots.txt?
You said Google may index URLs that are protected by robots.txt without reading or indexing the content at those URLs.
How can we show that this happens?
Thanks
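
One rough way to test the crawling side of this is to compare the server's access log against robots.txt: if a bot that identifies itself as Googlebot never requested a disallowed path, any listing for that URL cannot be based on its content. The sketch below assumes the log is in the common Apache "combined" format. The find_violations helper and the sample data in the usage note are hypothetical, and user-agent strings can be spoofed, so a hit is a hint rather than proof.

```python
import re
from urllib.robotparser import RobotFileParser

# Assumed log layout (Apache "combined"):
# ip ident user [time] "METHOD path PROTO" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)


def find_violations(robots_txt, log_lines, bot_token):
    """List requested paths that robots.txt disallows for bot_token.

    Only log lines whose user-agent string contains bot_token
    (case-insensitive) are considered; unparseable lines are skipped.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    violations = []
    for line in log_lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue
        if bot_token.lower() not in match["agent"].lower():
            continue
        if not parser.can_fetch(bot_token, match["path"]):
            violations.append(match["path"])
    return violations
```

For example, with a robots.txt that contains "User-agent: *" and "Disallow: /private/", calling find_violations(robots_txt, open("access.log"), "Googlebot") would return every path under /private/ that a self-declared Googlebot requested. An empty list alongside a URL-only listing in the search results would be consistent with the behavior described above.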
#8
Posted 18 December 2006 - 08:35 AM
#9
Posted 18 December 2006 - 08:37 AM
Thanks for the quick reply Jill.
I will do that.
Thanks for your help.
Cheers