Robot.txt


28 replies to this topic

#1 alnany

alnany

    HR 2

  • Active Members
  • 14 posts
  • Location:Irvine, CA

Posted 17 September 2003 - 08:09 PM

[Moved to the Technology and Coding forum: http://www.highrankings.com/forum/index.php?showforum=38]


What exactly should the robot.txt contain? Should all sites have one? Thanks.

Edited by Jill, 18 September 2003 - 10:20 AM.


#2 Scottie

Scottie

    Psycho Mom

  • Admin
  • 6,294 posts
  • Location:Columbia, SC

Posted 17 September 2003 - 08:20 PM

Welcome Al Nany! :lol:

Check out http://www.robotstxt.org.

The short answer is yes, you should have one. If you aren't worried about excluding any robots at all, save a blank document in notepad as robots.txt and upload that to the root directory of your site.
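
To illustrate the point about a blank file, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `is_allowed`, the agent name `ExampleBot`, and the `example.com` URL are assumptions made up for this example; they are not from the thread.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; an empty file yields no rules at all.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A blank robots.txt excludes nothing, so any compliant robot may fetch any page.
print(is_allowed("", "ExampleBot", "http://www.example.com/any/page.html"))
```

With no rules to match, the parser falls through to "access granted", which is the behavior a blank robots.txt is meant to signal.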

#3 Mel

Mel

    HR 5

  • Active Members
  • 353 posts

Posted 18 September 2003 - 05:22 AM

Note that the correct file name is robots.txt (plural). If you want an all-inclusive specimen that is set up to exclude most of the badbots out there, Brett Tabke offers his as a free download.

#4 fred

fred

    HR 4

  • Active Members
  • 141 posts
  • Location:Near Montreal , Quebec, Canada

Posted 18 September 2003 - 09:18 AM

Hi,

Do you have a link to that file?

Thanks.

#5 BrianR

BrianR

    Is it just me, or is it getting cooler in the evenings...?

  • Members
  • 1,621 posts
  • Location:Chester, UK

Posted 18 September 2003 - 04:38 PM

Mel

I've just checked Brett Tabke's site (www.searchengineworld.com) and I can't find the template robots.txt file - do you know where it's hidden!?

Thanks.

BrianR

#6 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,325 posts

Posted 18 September 2003 - 05:56 PM

Perhaps it's the regular robots.txt file that they have at WebmasterWorld. Did you look at www.webmasterworld.com/robots.txt?

Jill

#7 air-dog

air-dog

    HR 3

  • Active Members
  • 105 posts
  • Location:Hull, UK

Posted 18 September 2003 - 06:09 PM

Hello All,

Mel mentioned "badbots". Are some of these the spiders that capture email addresses, currently being discussed in Been Offered A Reciprical Link?

If so, does this mean that robots.txt files can be set up to deter these spammers?

#8 qwerty

qwerty

    HR 10

  • Moderator
  • 8,295 posts
  • Location:Somerville, MA

Posted 18 September 2003 - 06:26 PM

Bots don't automatically comply with the robots exclusion protocol. You can name as many spiders in your robots.txt as you want, but a lot of them don't even look for it.

#9 BrianR

BrianR

    Is it just me, or is it getting cooler in the evenings...?

  • Members
  • 1,621 posts
  • Location:Chester, UK

Posted 19 September 2003 - 04:15 PM

Thanks, Jill - that's it exactly.

But Qwerty's comment seems to say that using the robots.txt file is next to useless because the badbots listed just ignore it - so why bother??

BrianR

#10 Scottie

Scottie

    Psycho Mom

  • Admin
  • 6,294 posts
  • Location:Columbia, SC

Posted 19 September 2003 - 04:19 PM

It's kind of like locking the doors when you leave the house: it's not going to stop a determined thief who wants to get in, but at least you tried. :)

#11 qwerty

qwerty

    HR 10

  • Moderator
  • 8,295 posts
  • Location:Somerville, MA

Posted 19 September 2003 - 04:26 PM

I wasn't saying that there's no point in having a robots.txt. If you have files or entire directories you don't want indexed, you can block all compliant spiders. If there are particular compliant spiders you want to block, you can.

It's just that if you happen to have the user-agent name of some email harvester, putting in an entry like

user-agent: evilharvester
disallow: /

probably won't help you.
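
As a rough sketch of why that is: the rules are only consulted by a robot that chooses to check them. The example below uses Python's standard `urllib.robotparser` to show what a compliant robot would conclude. The `/private/` directory, the agent name `ExampleBot`, the `example.com` URLs, and the helper name `polite_fetch_allowed` are all assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ban one named robot entirely and keep
# every other compliant robot out of /private/ (an assumed directory).
RULES = """\
User-agent: evilharvester
Disallow: /

User-agent: *
Disallow: /private/
"""


def polite_fetch_allowed(user_agent, url, rules=RULES):
    """Answer the question a compliant robot asks itself before fetching url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


print(polite_fetch_allowed("evilharvester", "http://www.example.com/index.html"))
print(polite_fetch_allowed("ExampleBot", "http://www.example.com/private/report.html"))
print(polite_fetch_allowed("ExampleBot", "http://www.example.com/index.html"))
```

Nothing here is enforced by the web server. A harvester that never runs a check like this simply requests the page, which is why naming it in robots.txt probably won't stop it.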

#12 Scottie

Scottie

    Psycho Mom

  • Admin
  • 6,294 posts
  • Location:Columbia, SC

Posted 19 September 2003 - 04:29 PM

Another point too:

Sometimes your competitors or other nosy people will look in robots.txt to see what you don't want to have indexed. If it's important, use password protection, not robots.txt.
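
To show how little effort that snooping takes, here is a minimal Python sketch that lists every path a robots.txt asks robots to avoid. The function name `disallowed_paths` and the sample paths are hypothetical, and the parsing is deliberately simplified (it ignores which user-agent each rule belongs to).

```python
def disallowed_paths(robots_txt):
    """Collect the value of every non-empty Disallow line in robots.txt text."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop trailing comments, then split the "field: value" pair.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical file contents; anyone can download the real file from a site's root.
sample = """\
User-agent: *
Disallow: /secret-plans/
Disallow: /admin/  # staff only
"""
print(disallowed_paths(sample))
```

Since the file is publicly readable, every Disallow line doubles as a pointer to the content you would rather keep quiet.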

#13 BrianR

BrianR

    Is it just me, or is it getting cooler in the evenings...?

  • Members
  • 1,621 posts
  • Location:Chester, UK

Posted 19 September 2003 - 05:23 PM

Thanks Qwerty & Scottie - I understand now.

It's been a l-o-n-g week, and it's getting kinda late, and I'm just being thick!

BrianR

#14 guidaro

guidaro

    HR 1

  • Active Members
  • 6 posts
  • Location:Buenos Aires, Argentina

Posted 21 September 2003 - 07:37 PM

Hi, you can find more information about robots.txt at http://www.google.com/webmasters/. Basically, you can use it to allow search engines to index your site or to keep them out.
Guidaro :)

#15 Corey Bryant

Corey Bryant

    HR 2

  • Members
  • 18 posts
  • Location:Castle Pines North, CO

Posted 22 September 2003 - 07:13 PM

You can also check out this one:

http://www.thinkhost.com/robots.txt



