
Double Robots.txt?


5 replies to this topic

#1 incrediblehelp

incrediblehelp

    HR 6

  • Active Members
  • 591 posts
  • Location:Kentucky

Posted 31 March 2009 - 04:08 PM

I noticed the other day that a website had a robots.txt in the root of the domain as well as one in the root of the blog directory. How many other people out there do this? Do you find that the bots obey both of them properly?

My feeling is that you only need the one in the website root, and you can direct the bots to do what you want from there.

Why use two of them?

#2 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 31 March 2009 - 04:36 PM

A robots.txt anywhere but the root level will be ignored by the spiders. In fact, it would surprise me if it's ever even requested. robots.txt is not like .htaccess, where you can control things at a per-directory level.

The only way a subdirectory robots.txt might be valid is the rare case where someone has a domain name parked on a subdirectory of another domain. Or possibly if the subdirectory is really a subdomain, though that one too is questionable in my mind and isn't something I've tested to see if spiders look for a robots.txt for each subdomain.

Maybe Alan knows the answer to that one?

#3 incrediblehelp

incrediblehelp

    HR 6

  • Active Members
  • 591 posts
  • Location:Kentucky

Posted 31 March 2009 - 05:07 PM

Actually, that is what I figured, Randy. Thanks for the feedback.

I have heard of sites using different robots.txt files for their https and http versions before.

#4 chovy

chovy

    HR 2

  • Members
  • 20 posts

Posted 31 March 2009 - 06:22 PM

Open ./htdocs/robots.txt:

User-agent: Googlebot
Disallow: /blog/
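
To see what those two lines actually do, here is a minimal sketch using Python's standard `urllib.robotparser`. The `www.example.com` URLs, the `Slurp` user agent, and the helper name `allowed` are assumptions made up for illustration; only the two rules come from the post above.

```python
from urllib.robotparser import RobotFileParser

# The two rules from the post, as they would sit in the root robots.txt.
RULES = """\
User-agent: Googlebot
Disallow: /blog/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if RULES permit user_agent to fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(RULES.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Googlebot is kept out of anything under /blog/ ...
    print(allowed("Googlebot", "http://www.example.com/blog/some-post.html"))
    # ... but not out of the rest of the site.
    print(allowed("Googlebot", "http://www.example.com/about.html"))
    # Agents with no matching record (and no "*" record) are not restricted.
    print(allowed("Slurp", "http://www.example.com/blog/some-post.html"))
```

Note that the rule only names Googlebot. To keep every well-behaved crawler out of the blog directory, the record would need `User-agent: *` instead.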

#5 Ron Carnell

Ron Carnell

    HR 6

  • Moderator
  • 966 posts
  • Location:Michigan USA

Posted 01 April 2009 - 12:07 AM

QUOTE
Or possibly if the subdirectory is really a subdomain, though that one too is questionable in my mind and isn't something I've tested to see if spiders look for a robots.txt for each subdomain.

They do, Randy. They do.
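
Put another way, a crawler looks for robots.txt at the root of whatever scheme and host it is fetching from, and nowhere else. A rough sketch of that lookup, assuming nothing beyond Python's standard `urllib.parse` (the function name `robots_txt_url` and the example.com addresses are made up for illustration):

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the one robots.txt location that governs page_url (hypothetical helper)."""
    parts = urlsplit(page_url)
    # Keep the scheme and host (plus port, if any); replace the path with
    # /robots.txt and drop the query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # A page in a subdirectory is governed by the root file, not /blog/robots.txt.
    print(robots_txt_url("http://www.example.com/blog/post.html"))
    # A subdomain is a different host, so it gets its own file.
    print(robots_txt_url("http://blog.example.com/post.html"))
    # Same host over https is looked up separately from http.
    print(robots_txt_url("https://www.example.com/blog/post.html"))
```

This is why a second robots.txt inside /blog/ does nothing, while blog.example.com/robots.txt does.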

FWIW, I almost always back up a file before modifying it. My ex-wife always said I had trust issues. At any rate, I probably have a few copies of robots.txt lying around on more than a few sites. I don't worry about it because, as you pointed out, the only one that counts is in the root.


#6 icecape67

icecape67

    HR 2

  • Active Members
  • 26 posts

Posted 01 April 2009 - 06:26 AM

QUOTE(Ron Carnell @ Apr 1 2009, 12:07 AM)
They do, Randy. They do.

FWIW, I almost always back up a file before modifying it. My ex-wife always said I had trust issues. At any rate, I probably have a few copies of robots.txt lying around on more than a few sites. I don't worry about it because, as you pointed out, the only one that counts is in the root.


Or even better, put everything under source control, and I mean EVERYTHING (OK, maybe not the wife).



