Quick Robots.txt Question


6 replies to this topic

#1 doogie88


Posted 14 November 2008 - 02:09 AM

Does
User-agent: *
Disallow: /

Mean none of my site can be spidered?

#2 Randy


Posted 14 November 2008 - 07:36 AM

Yep.

The / means everything. So everything will be disallowed.

#3 Alan Perkins


Posted 14 November 2008 - 10:21 AM

Randy is correct for all practical purposes. However, understanding a little more about the theory won't hurt, as it will help you to understand why "Disallow: /" covers everything.

QUOTE(Randy)
The / means everything. So everything will be disallowed.


The "/" does not literally mean everything. It means "every URL path beginning with a /", just as "/secret/" means "every URL path beginning with /secret/". Since URL paths are specified relative to the root directory of the site, every path DOES begin with a "/". This is why, for practical purposes, "/ means everything".
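If you want to see the prefix rule in action, here is a minimal sketch using Python's standard-library robots.txt parser. The example.com URLs and the "ExampleBot" agent name are made up for illustration, and other crawlers may differ in the details, so treat it as a demonstration of the idea rather than a statement of how any particular search engine behaves.

```python
from urllib.robotparser import RobotFileParser

# Two hypothetical robots.txt files, for illustration only.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_SECRET = "User-agent: *\nDisallow: /secret/\n"


def allowed(robots_txt, url, agent="ExampleBot"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Every path starts with "/", so "Disallow: /" matches all of them.
print(allowed(BLOCK_ALL, "http://example.com/"))               # False
print(allowed(BLOCK_ALL, "http://example.com/any/page.html"))  # False

# "Disallow: /secret/" only matches paths that begin with "/secret/".
print(allowed(BLOCK_SECRET, "http://example.com/secret/plan.html"))  # False
print(allowed(BLOCK_SECRET, "http://example.com/secretary.html"))    # True
```

Note the last line: "/secretary.html" is still allowed because it does not begin with "/secret/" (the trailing slash is part of the prefix).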

#4 mcanerin


Posted 14 November 2008 - 11:40 AM

... and just to make this particular thread complete, I'd like to point out that it only disallows *compliant* spiders. Spambots, etc. will blissfully ignore it. But it will work with major search engines like Google, Yahoo, etc.

Use robots.txt to disallow directories (including root, which means everything)
Use the robots metatag to disallow specific individual pages
Use the nofollow attribute to disallow links on pages

Use system level permissions for actual security and the ultimate disallow (overkill for preventing spidering of your images, required for preventing access to your customer database).
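To make the page-level and link-level options a bit more concrete, here is a rough sketch that reads the robots meta tag and any rel="nofollow" links out of a page, using only Python's standard library. The sample page, the class name and the helper name are all invented for this example; real crawlers do their own parsing, so this only shows what the markup looks like and how it could be detected.

```python
from html.parser import HTMLParser


class RobotsHintParser(HTMLParser):
    """Collects robots meta directives and rel="nofollow" links (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.directives = set()
        self.nofollow_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            # content is a comma-separated list such as "noindex, follow"
            content = attrs.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )
        elif tag == "a" and "nofollow" in (attrs.get("rel") or "").lower().split():
            self.nofollow_links.append(attrs.get("href"))


# Hypothetical page: excluded from the index, with one nofollow link.
SAMPLE_PAGE = """<html><head>
<meta name="robots" content="noindex, follow">
</head><body>
<a href="/login" rel="nofollow">Log in</a>
<a href="/about">About</a>
</body></html>"""


def page_hints(html):
    """Return (sorted robots directives, list of nofollow hrefs) for an HTML string."""
    parser = RobotsHintParser()
    parser.feed(html)
    return sorted(parser.directives), parser.nofollow_links


print(page_hints(SAMPLE_PAGE))  # (['follow', 'noindex'], ['/login'])
```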

There is currently no way to disallow the indexing of the contents of a particular DIV, though I think there should be. The closest you can get would be to do something funky like creating an iframe, or using JavaScript to load the text in after the page loads.

Ian

#5 doogie88


Posted 14 November 2008 - 12:44 PM

Thank you all.

I just realized that this has been my robots.txt file for the last year and a half!
Though there were a few pages that were also listed under the 'allow' part.

#6 mcanerin


Posted 14 November 2008 - 12:54 PM

Ouch.

Glad we could help - I suspect your SEO efforts for this site will be more fruitful from now on. ;)

If you need a more complicated robots.txt than a simple "allow all", then you can do a search for "robots.txt generator" in your favorite search engine and use a tool I designed for the job. It's usually the first result.
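As a rough illustration of something slightly more complicated than "allow all", here is a sketch of a file that blocks everything except one directory, checked with Python's standard-library parser. The /articles/ path, the example.com host and the "ExampleBot" name are hypothetical, and this is not the output of any particular generator. One assumption worth flagging: Python's parser applies the first rule that matches, while Google documents that it picks the most specific matching rule, so putting the Allow line before "Disallow: /" is the ordering that should behave the same under both approaches.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: open /articles/, block everything else.
# The Allow line comes first because Python's parser uses the first match.
ROBOTS_WITH_EXCEPTION = """\
User-agent: *
Allow: /articles/
Disallow: /
"""


def can_crawl(path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` on the (made-up) example.com site."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_WITH_EXCEPTION.splitlines())
    return parser.can_fetch(agent, "http://example.com" + path)


print(can_crawl("/articles/intro.html"))  # True
print(can_crawl("/contact.html"))         # False
print(can_crawl("/"))                     # False
```

Whatever tool you use to write the file, it is worth running a few real URLs from your site through a check like this (or through the search engine's own tester) before uploading.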

Cheers,

Ian

#7 doogie88


Posted 14 November 2008 - 03:33 PM

Yeah, I pretty much have "allow all" now.
I haven't done much with the site lately, but traffic has been down, and I noticed in Google Webmaster Tools that there was a problem with my sitemap. So I re-uploaded it, and there was still a problem, so I checked my robots file and saw the problem. I was banned from Google for a while and had someone fix the site, so I guess they didn't want the spiders to go to any bad directories and blocked them off. They just forgot to fix it after Google put us back in!



