
Non Public Pages And Disallows


3 replies to this topic

#1 melcat

melcat

    HR 2

  • Active Members
  • 25 posts

Posted 22 February 2007 - 10:14 AM

Hello Everyone,

I researched this and could not find any posts that answer my question. On our company website we have a link to the salesmen page. If you click this link, you must enter a user name and password to access the pages. The content on these pages is for our salesmen only, and we do not want the general public or our competition getting a look at it. My question: when a search engine spiders my site, will the robots be able to access the password-protected files? If so, what do you suggest to prevent this from happening? I am familiar with robots.txt files because I have another site that we do not allow the search engines to spider at all, but I am unfamiliar with how to stop them from accessing only certain pages.

P.S. There are a ton of files in this directory, so if I have to use a robots.txt file, can I just put it on the index file of the directory, or do I have to put it on each individual page?

Sincerely and with Great Respect for your expertise,

Melinda

#2 qwerty

qwerty

    HR 10

  • Moderator
  • 8,287 posts
  • Location:Somerville, MA

Posted 22 February 2007 - 11:29 AM

If the only way to get to those pages is by correctly filling out and submitting a form, then search engines won't go to the pages. Even if the password were filled in for them, spiders don't click "submit" buttons. They find links and follow them, and it doesn't sound like you're providing any links.

Of course, if someone else were to put up a link directly to the protected pages, then a login form alone won't keep spiders out, because they can follow that link without ever seeing the form. So you should specifically tell the search engines via your robots.txt file that you don't want any of those pages indexed. You don't have to list each document in robots.txt, either. If they're all in the same directory, just disallow the directory itself.
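
To illustrate disallowing a whole directory, here is a minimal sketch in Python. It assumes the salesmen pages live under a directory called /salesmen/ and uses www.example.com and the crawler name "ExampleBot" as placeholders; none of those names come from the original question. It feeds a two-line robots.txt to the standard library's robots.txt parser and asks whether a well-behaved crawler would be allowed to fetch a given URL.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. "/salesmen/" is an assumed directory
# name; substitute the real path of the protected directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /salesmen/
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if a crawler that obeys robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Any page under the disallowed directory is off limits to compliant crawlers.
    print(is_allowed(ROBOTS_TXT, "http://www.example.com/salesmen/prices.html"))
    # Pages outside that directory are unaffected.
    print(is_allowed(ROBOTS_TXT, "http://www.example.com/index.html"))
```

The single Disallow line covers every file under that directory, so there is no need to list pages one by one. The file itself goes in the root of the site (for example http://www.example.com/robots.txt), not inside the protected directory or on individual pages.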

#3 MaKa

MaKa

    HR 6

  • Active Members
  • 856 posts
  • Location:Llantwit Major, Wales, UK

Posted 23 February 2007 - 09:41 AM

I've been running a test since December 06 on one of my sites. I created a password-protected folder and linked to the page using username:password@www.mysite.com, which basically tells the whole world how to access the password-protected folder. No robots or spiders have successfully accessed the page so far. I assume that if robots and spiders don't access a protected folder when they have the username/password combination, they are certainly* not going to access a page where they would have to guess it. :)

Edit: Just thought of checking the error log. There has been one robot that unsuccessfully tried to access the password-protected page and got an Unauthorized message.

Please note that the password protection was done at the OS level and not, as Randy mentioned, with a form querying for a username/password that redirects to a file.

*Excluding bad bots used by crackers
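
The Unauthorized response in that error log can be sketched roughly as follows. This is only an illustration of how HTTP Basic authentication decides between "OK" and "Unauthorized", not the actual server configuration used in the test; the function names and the sample credentials are made up for the example.

```python
import base64
import hmac
from typing import Optional


def basic_auth_header(username: str, password: str) -> str:
    """Build the Authorization header value a client sends for Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def status_for_request(auth_header: Optional[str], username: str, password: str) -> int:
    """Return 200 if the header carries the expected credentials, else 401.

    Hypothetical helper: a real web server performs this check itself
    before it serves anything from the protected directory.
    """
    if not auth_header:
        # A crawler that simply follows a link sends no credentials at all.
        return 401
    expected = basic_auth_header(username, password)
    # Constant-time comparison, to avoid leaking how much of the value matched.
    if hmac.compare_digest(auth_header.encode("utf-8"), expected.encode("utf-8")):
        return 200
    return 401


if __name__ == "__main__":
    print(status_for_request(None, "sales", "s3cret"))
    print(status_for_request(basic_auth_header("sales", "s3cret"), "sales", "s3cret"))
```

The point of the sketch is that the check happens on the server for every request, so a link from another site does not bypass it the way it can bypass a login form that merely redirects to an unprotected file.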

#4 Alex Choo

Alex Choo

    HR 2

  • Active Members
  • 35 posts

Posted 11 May 2007 - 02:43 AM

Hello,

I think we need to be clear on the difference between a robot-excluded page and a password-protected page. They are not the same.

If a page or directory is disallowed in robots.txt, it means that a well-behaved crawler will avoid it. But it does not block access: anyone can read the robots.txt file, and I can still visit that directory through my browser. In short, robots.txt instructs crawlers only, not people.

A password-protected page uses a different mechanism, usually via the .htaccess file. That stops both crawlers and people from accessing it without a valid password.

Alex



