Robot.txt
#16
Posted 22 September 2003 - 07:27 PM
Corey, I think I've welcomed you before, but just in case I haven't-
#17
Posted 23 September 2003 - 01:13 PM
#18
Posted 10 November 2003 - 01:34 PM
In October I had 394 entries via robots.txt, and 212 exited immediately (without looking at or crawling other pages). That means only about 46% went on into the site. I have been having a difficult time getting these buggers to crawl new pages. Googlebot shows up every few days but does not go anywhere. Why? Is it looking for changes in the file before crawling anywhere else? Have I told it to stop? I have tested the site with our search spider and everything seems to work as planned. I'm stumped.
#19
Posted 10 November 2003 - 04:48 PM
#20
Posted 10 November 2003 - 04:54 PM
#21
Posted 10 November 2003 - 04:57 PM
It's also very handy if you want to get further into things-- start watching your logs for bots who never request it, and there you have your rogues that you should think about banning.
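For anyone who wants to try that, here is a rough Python sketch of the log check. It assumes your server writes the common Apache-style "combined" log format, and the function name and sample bot names are made up for illustration. Ordinary browsers never request robots.txt either, so treat the output as a list of candidates to look into, not a ban list.

```python
import re
from collections import defaultdict

# Assumption: Apache-style "combined" log format. Adjust the pattern
# if your server logs in a different layout.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"\S+ (?P<path>\S+)[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def agents_skipping_robots_txt(lines):
    """Return user agents that made requests but never fetched /robots.txt."""
    paths_by_agent = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # lines that do not fit the assumed format are skipped
            paths_by_agent[match.group("agent")].add(match.group("path"))
    return sorted(
        agent for agent, paths in paths_by_agent.items()
        if "/robots.txt" not in paths
    )

# Hypothetical sample entries, not taken from a real log.
sample = [
    '192.0.2.1 - - [10/Nov/2003:13:34:00 +0000] "GET /robots.txt HTTP/1.0" 200 68 "-" "GoodBot/1.0"',
    '192.0.2.1 - - [10/Nov/2003:13:34:02 +0000] "GET /index.html HTTP/1.0" 200 5120 "-" "GoodBot/1.0"',
    '192.0.2.9 - - [10/Nov/2003:13:40:11 +0000] "GET /private/ HTTP/1.0" 200 900 "-" "RogueBot/1.0"',
]
suspects = agents_skipping_robots_txt(sample)
```

With a real log you would pass the open file object instead of the sample list.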
#22
Posted 11 November 2003 - 06:35 AM
There's nothing to worry about here. It seems to be the normal behaviour. I see the opposite behaviour less often, where a robot keeps going after getting the robots.txt file. Usually another robot from the same family will come and look for other files. That's much more typical in my log files.... but these critters are hitting the robots.txt first and turning tail.
#23
Posted 12 November 2003 - 12:07 PM
If the first line is:
User-agent: *
Disallow:
But then goes on to say:
User-agent: SpankBot
Disallow: /
Would SpankBot be banned or would it review the first one?
Thanks!!!
#24
Posted 12 November 2003 - 12:30 PM
Just put in the
"User-agent: SpankBot
Disallow: /"
line, and don't bother with the other.
Other bots will read it, say "I am not SpankBot", and go on their merry way spidering your site.
That's how it's supposed to work, anyway. (Yes, I looked it up!!!)
#25
Posted 12 November 2003 - 12:42 PM
"Would SpankBot be banned or would it review the first one?"
Definitive Answer: It would be banned. "*" means "any robot not mentioned elsewhere".
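If you want to check the SpankBot case yourself, Python's standard-library urllib.robotparser can read the rules and answer the question. This is only a quick sketch: it assumes the two records are separated by a blank line, and the helper name, the example.com URL, and "SomeOtherBot" are made up for illustration. It also tries the shorter file suggested earlier in the thread that has only the SpankBot record.

```python
from urllib.robotparser import RobotFileParser

# The two records from the question, separated by a blank line.
ROBOTS_TXT = """\
User-agent: *
Disallow:

User-agent: SpankBot
Disallow: /
"""

# The shorter variant: only the SpankBot record.
SPANKBOT_ONLY = "User-agent: SpankBot\nDisallow: /\n"

def allowed(robots_txt, agent, url="http://www.example.com/page.html"):
    """Report whether `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

for label, text in (("both records", ROBOTS_TXT), ("SpankBot only", SPANKBOT_ONLY)):
    for agent in ("SpankBot", "SomeOtherBot"):
        print(label, agent, allowed(text, agent))
```

Both files should give the same answer according to this parser: SpankBot is refused and any other robot is let through. Of course this only tells you what a well-behaved robot would do.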
"In October I had 394 entries via the robots.txt and 212 exited immediately (without looking at/crawling other pages)."
Hits on robots.txt are usually from robots. Web stats for "entry", "exit" and "path" don't work too well for robots, since robots can change IP address between requests and they don't accept cookies. Just ignore it.
#26
Posted 12 November 2003 - 04:11 PM
#27
Posted 12 November 2003 - 04:34 PM
So, there are a number of reasons.
Most people need not worry about them, but advanced users have plenty to keep them on their toes!
#28
Posted 13 November 2003 - 11:01 AM
#29
Posted 02 December 2003 - 01:47 PM
I have a question about excluding certain spiders from my site. I stumbled across a site that will generate a robots.txt file to include what it feels are a whole bunch of "nasty" bots.
The URL is here for anyone who wants to take a look at it:
http://www.1-hit.com...t-generator.htm
As I scan the list I recognize many of the names, so I'm pretty confident that they are bots I don't want indexing my site. But in reading this thread, it seems that the spiders may or may not abide by my robots.txt file, so is there any point in excluding these 20 or so spiders?
Thanks
Dzinerbear