Allow Generic Cgi Script In Robots.txt


3 replies to this topic

#1 1dmf

1dmf

    Keep Asking, Keep Questioning, Keep Learning

  • Active Members
  • 2,154 posts
  • Location:Worthing - England

Posted 26 September 2007 - 10:22 AM

Hi,

I have written a script for my affiliate system. Generally speaking, I want to disallow indexing of the cgi-bin but allow indexing of one particular script, as it does a 301 redirect to the correct page and I want link juice to be passed on.

However,

if I have in my robots.txt file
CODE
allow: /cgi-bin/myscript.pl
disallow: /cgi-bin/


but people's websites will have links that include a query string, does allowing indexing of / access to the script on its own also allow the search engines to follow those links?

Or am I misunderstanding the robots.txt file? Does following links and evaluating link juice have nothing to do with robots.txt, which is purely about indexing?

Do I even need to allow the base Perl script in robots.txt for the link-juice 301 redirect mechanism to work?

Advice on understanding this is appreciated.

#2 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 26 September 2007 - 05:24 PM

Not sure I understand the question.

The engines are going to see the links to your Perl script as going to different pages anyway, since the query string value will make each one a little bit different. Are you 301'ing those to another page after saving the affiliate data?

If so, expressly allowing access to a script in your robots.txt should cover all query strings. robots.txt has an unexpressed wildcard at the end of each line (each rule is matched as a prefix of the URL path), so if the engines are actually following the protocol they should see those affiliate queries, followed by the subsequent 301 redirect.
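To make the "unexpressed wildcard" idea concrete, here is a minimal Python sketch of prefix matching against the two rules from the first post. It is an illustration only, not how any particular engine is implemented: the function name `is_allowed`, the `RULES` list, and the tie-breaking choice (longest matching prefix wins, and an allow beats a disallow of equal length) are assumptions made for this example. A real robots.txt would also need a `User-agent:` line above the rules.

```python
def is_allowed(rules, url_path):
    """Return True if url_path may be crawled under the given rules.

    rules is a list of (directive, path_prefix) tuples, where directive is
    "allow" or "disallow". Each prefix is treated as if it ended with a
    wildcard, i.e. it matches any path that starts with it.
    Assumption: the longest matching prefix decides the outcome.
    """
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for directive, prefix in rules:
        if url_path.startswith(prefix):
            is_allow = directive.lower() == "allow"
            # Prefer the longer prefix; on a tie, prefer "allow".
            if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
                best_len = len(prefix)
                allowed = is_allow
    return allowed


# The two rules from the first post (hypothetical script name from the thread).
RULES = [
    ("allow", "/cgi-bin/myscript.pl"),
    ("disallow", "/cgi-bin/"),
]

for path in (
    "/cgi-bin/myscript.pl",
    "/cgi-bin/myscript.pl?Affiliate=10",
    "/cgi-bin/other.pl",
    "/index.html",
):
    print(path, "->", "allowed" if is_allowed(RULES, path) else "blocked")
```

Under these assumptions the script URL with any query string is allowed because it still starts with `/cgi-bin/myscript.pl`, while everything else under `/cgi-bin/` stays blocked.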

#3 1dmf


Posted 27 September 2007 - 04:59 AM

QUOTE
unexpressed wildcard at the end of each line
You understood perfectly, Randy :)

That's just what I wanted to hear: they apply the wildcard to myscript.pl, so myscript.pl?Affiliate=10 and myscript.pl?Affiliate=20 are both allowed by the allow line for the basic myscript.pl in the robots file.

Many thanks Randy.



#4 Randy


Posted 27 September 2007 - 05:10 AM

That's my understanding of the robots.txt protocol anyway, 1dmf. I believe the engines follow this basic protocol, but I have never tested it personally.

I'm sure Alan or someone who has tested it will be along to tell us if it's wrong and the wildcard element is not there in real-world practice, but I think you're safe.



