Only Allowing The Homepage To Be Indexed


4 replies to this topic

#1 JakeG

JakeG

    HR 4

  • Active Members
  • 212 posts
  • Location:Nottingham, UK

Posted 17 September 2007 - 10:22 AM

Is there a neat way to exclude all pages other than the homepage?

#2 Alan Perkins

Alan Perkins

    Token male admin

  • Admin
  • 1,559 posts
  • Location:UK

Posted 17 September 2007 - 10:51 AM

It ain't easy! BUT, it's easier than allowing access to everything except the homepage (which is impossible using robots.txt).

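With Google, you can try its pattern-matching extensions (the Allow directive and the $ end-of-URL anchor). Note that a bare "Disallow: /*" is equivalent to "Disallow: /" and would match the homepage URL "/" as well, so the homepage has to be allowed explicitly:

CODE
User-agent: Googlebot
Allow: /$
Disallow: /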


Otherwise, you need to explicitly deny access to almost everything:

CODE
User-agent: *
Disallow: /a
Disallow: /b
Disallow: /c


That will disallow anything beginning with an a, b or c. You would need to add the rest (d to z, numbers, etc.).
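Typing out every prefix by hand is error-prone, so the list could be generated instead. The following is only a minimal sketch: the function name `homepage_only_rules` and the default character set (lowercase letters and digits) are assumptions made for illustration, not part of any standard. Paths that start with any other character (uppercase letters, underscores, percent-escapes, or a query string on the root such as /?page=2) would not be covered unless you add them.

```python
import string


def homepage_only_rules(prefix_chars=string.ascii_lowercase + string.digits):
    """Build robots.txt lines blocking every path that starts with one of prefix_chars.

    The homepage URL "/" has no character after the slash, so none of the
    generated prefixes match it.
    """
    lines = ["User-agent: *"]
    # One Disallow line per possible first character of the path.
    lines += [f"Disallow: /{ch}" for ch in prefix_chars]
    return lines


# Join into the text you would save as robots.txt.
robots_txt = "\n".join(homepage_only_rules()) + "\n"
```

Pass a longer `prefix_chars` string if your URLs can start with other characters, and test the result before relying on it.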

WARNING: Please treat both the above with extreme care. Though theoretically accurate, you will need to test. The safest approach is to use the meta robots tag.

#3 JakeG


Posted 17 September 2007 - 10:55 AM

Thanks, I'll experiment!

Maybe the back-end is set up in a way that will let me generate one meta tag for the homepage and a different one for all other pages quite easily... will have a look.
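As a rough idea of what that back-end logic might look like, here is a small sketch. The function name `robots_meta_tag`, the `homepage_paths` parameter, and the choice of "noindex, follow" for non-homepage pages are all assumptions for illustration; a real template system would have its own way of knowing which page is being rendered.

```python
def robots_meta_tag(path, homepage_paths=("/",)):
    """Return a meta robots tag: indexable for the homepage, noindex elsewhere.

    homepage_paths is a hypothetical setting listing every path that serves
    the homepage (for example "/" and "/index.html").
    """
    if path in homepage_paths:
        content = "index, follow"
    else:
        # "follow" is an assumption: links are still followed, the page is not indexed.
        content = "noindex, follow"
    return f'<meta name="robots" content="{content}">'


home_tag = robots_meta_tag("/")
other_tag = robots_meta_tag("/products/widget.html")
```

A crawler can only see the meta tag on a page it is allowed to fetch, so pages carrying noindex should not also be blocked in robots.txt.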

Is the syntax of robots.txt case sensitive? Would I have to disallow both /a and /A?

#4 Alan Perkins


Posted 17 September 2007 - 11:17 AM

Ah, you're actually wanting to do this for real. In that case I'd definitely go for the meta tag.

robots.txt is case sensitive if your server is (e.g. Apache) and not if your server isn't (e.g. IIS). That's not part of the standard (paths are case-sensitive there), but it's been my experience with all search engines. Otherwise, it would be unworkable on an IIS server.
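If you want to see the strict, case-sensitive reading in action, Python's standard library ships a robots.txt parser you can experiment with locally. This is only a sketch: it shows how `urllib.robotparser` matches paths, not how any particular search engine behaves, and www.example.com is a placeholder host.

```python
from urllib.robotparser import RobotFileParser

# Parse a two-line robots.txt held in memory (no network access needed).
rules = RobotFileParser()
rules.parse(["User-agent: *", "Disallow: /a"])

# Same path in two different cases, plus the homepage.
lower_blocked = not rules.can_fetch("*", "http://www.example.com/about.html")
upper_blocked = not rules.can_fetch("*", "http://www.example.com/About.html")
home_blocked = not rules.can_fetch("*", "http://www.example.com/")

print(lower_blocked, upper_blocked, home_blocked)
```

In this parser the lowercase path is blocked while the capitalised one and the homepage are not, so under a strict reading you would need both /a and /A.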

#5 JakeG


Posted 30 September 2007 - 08:38 AM

Yep, I'm doing it for real. It's a duplicate of an existing site where only the homepage is different. It has a different brand name and brings in lots of traffic from that name, so I don't want to lose the ranking.

Thanks for the info. I'll go down the META tag route.



