Robots.txt Page
#1
Posted 25 March 2004 - 11:18 PM
#2
Posted 25 March 2004 - 11:28 PM
There was some speculation at one point that you should always have a robots.txt, but I believe that's bad information. If you had to have a robots.txt file before a search engine would index your site, there would be a whole lot fewer sites in the search engine databases.
#3
Posted 26 March 2004 - 08:42 AM
However, from the server side of things, it's something I always recommend you have, even if the file is blank. If for no other reason, having one keeps your server from having to cycle through its 404 Not Found routine. It's not a huge deal by any stretch of the imagination, and it won't affect your ranking one iota, but it's still the proper approach to take in my view.
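To illustrate why a blank file is harmless, here is a minimal Python sketch using the standard library's robots.txt parser. It only shows how one well-behaved parser reads an empty file; "ExampleBot" and the example.com URL are made-up names, and real search engine spiders may of course use their own parsing code.

```python
from urllib.robotparser import RobotFileParser


def allowed_with_blank_file(user_agent: str, url: str) -> bool:
    """Return whether user_agent may fetch url when robots.txt is blank."""
    parser = RobotFileParser()
    # A blank robots.txt has no User-agent or Disallow lines at all.
    parser.parse([])
    return parser.can_fetch(user_agent, url)


# "ExampleBot" is a hypothetical spider name used only for illustration.
print(allowed_with_blank_file("ExampleBot", "http://www.example.com/page.html"))
```

With no rules to apply, the parser lets the request through, so the blank file only spares the server the 404 and changes nothing about what gets crawled.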
#4
Posted 26 March 2004 - 03:19 PM
Let me emphasize that these errors ONLY show up on my stats page, and the lack of a robots.txt or favicon does not affect the user/customer in any way when they browse the site.
b.
#5
Posted 26 March 2004 - 07:13 PM
I recently looked at my logs and found one entry listed as: unknown robot identified as 'crawl'. It takes up a lot of bandwidth each day. Does anyone know if this is just one hungry spider or a conglomerate of unidentified robots?
#6
Posted 26 March 2004 - 10:07 PM
The problem with the bad bots (email harvesters and such) is that they don't normally obey a robots.txt exclusion anyway. So it's kind of a waste to throw them into your robots.txt file; they simply ignore it.
If you're so inclined, however, you can block them from your site entirely via their IP address. But you have to be a bit careful with that, to make sure you're not blocking legitimate users or good bots.
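As a rough sketch of the "be careful" part, you could check an address against your block list before committing to it. This is only an illustration in Python: the networks below are reserved documentation ranges standing in for real bad-bot addresses, and `is_blocked` is a made-up helper, not part of any server software. The actual blocking would still be done in your server or firewall configuration.

```python
import ipaddress

# Hypothetical block list; these are reserved documentation ranges,
# not real bad-bot addresses.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),      # a whole range
    ipaddress.ip_network("198.51.100.7/32"),   # a single address
]


def is_blocked(ip: str, networks=BLOCKED_NETWORKS) -> bool:
    """Return True if ip falls inside any blocked network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


print(is_blocked("192.0.2.55"))    # inside the blocked /24
print(is_blocked("198.51.100.8"))  # next door to the blocked single address
```

Blocking a whole range catches every address in it, which is exactly how legitimate users or good bots can end up locked out, so it's worth testing a few known-good addresses first.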
#7
Posted 27 March 2004 - 06:19 AM
#8
Posted 27 March 2004 - 11:18 AM
It's not an exact measurement of course, but that will get you as close as seeing how many times the file wasn't found.
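If you have raw access logs, counting the times the file wasn't found could look something like the Python sketch below. It assumes the common Apache-style log format, and the sample lines are invented for illustration; `count_missing_robots` is a made-up helper name.

```python
import re

# Matches a GET or HEAD request for /robots.txt that was answered with a 404.
# Assumes the request line is quoted and followed by the status code.
ROBOTS_404 = re.compile(r'"(?:GET|HEAD) /robots\.txt[^"]*" 404\b')


def count_missing_robots(lines) -> int:
    """Count log lines where /robots.txt was requested but not found."""
    return sum(1 for line in lines if ROBOTS_404.search(line))


# Invented sample lines, not taken from any real log.
SAMPLE_LOG = [
    '192.0.2.10 - - [26/Mar/2004:08:42:00 +0000] "GET /robots.txt HTTP/1.0" 404 209',
    '192.0.2.10 - - [26/Mar/2004:08:42:01 +0000] "GET /index.html HTTP/1.0" 200 5120',
    '198.51.100.7 - - [26/Mar/2004:09:10:12 +0000] "GET /robots.txt HTTP/1.1" 404 209',
    '198.51.100.7 - - [26/Mar/2004:09:10:13 +0000] "GET /favicon.ico HTTP/1.1" 404 209',
]

print(count_missing_robots(SAMPLE_LOG))
```

For a real log you would pass an open file instead of the sample list, since the function only needs something it can loop over line by line.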
#9
Posted 31 March 2004 - 04:07 PM
Disallow: "
Is that right? From my understanding, unless you specifically block a spider in your robots.txt, it will be allowed. So why would Google be saying that you specifically have to allow it? Or am I misreading something?
Thanks.
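For what it's worth, the "allowed unless specifically blocked" reading can be checked with Python's standard robots.txt parser. This is only a sketch: both bot names are made up, and it shows how one standards-following parser behaves, not how any particular search engine's spider is written.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that shuts out one (hypothetical) bot and mentions no others.
ROBOTS_TXT = """\
User-agent: BadExampleBot
Disallow: /
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return whether user_agent may fetch url under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "http://www.example.com/page.html"
print(can_crawl("BadExampleBot", url))       # named and disallowed
print(can_crawl("UnlistedExampleBot", url))  # never mentioned in the file
```

The bot that is never mentioned gets through without needing an explicit allow entry of its own.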
#10
Posted 31 March 2004 - 04:24 PM
Either that, or it's just an error on their part. I have an AdSense account on a domain with a fairly long robots.txt, with no mention of that bot at all, and I've had no problems with the spider getting in and serving the ads.
#11
Posted 31 March 2004 - 07:19 PM
Thanks!
Jill