
Robots.txt May Result In Site Security Breach?


7 replies to this topic

#1 CaliforniaGirl

CaliforniaGirl

    HR 3

  • Active Members
  • 63 posts
  • Location:Sydney, Australia

Posted 18 September 2007 - 06:39 PM

Hi all,

I have a client who received the following message during a routine security scan of his site:

QUOTE
1) The robots.txt file provides directory names where information is stored that should not be spidered by search engines. Any files stored in those directories may have useful information to help further compromise the server or perform other illegal activities. If I was a malicious individual the files in those directories would be high on my checklist to retrieve, and during an assessment the directories are put in the manually verify list.

If the robots.txt file is required for business or technical reasons and no options exist to lessen or mitigate the risk, then it becomes a risk acceptance exercise for the business.


Anyone heard of this before? Sounds a bit like propaganda....

CaliforniaGirl

#2 mcanerin

mcanerin

    HR 7

  • Active Members
  • 2,242 posts
  • Location:Calgary, Alberta, Canada

Posted 18 September 2007 - 07:42 PM

It is a known issue that if you have a robots.txt file that lists directories that you don't want HUMANS to visit, then it can be a security risk, simply because it's a list of all the directories that you are trying to keep people away from.
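To make the "list of directories" point concrete, here is a minimal Python sketch of what anyone (including a security scanner) can do with a public robots.txt file. The sample file contents and the `disallowed_paths` helper are hypothetical, made up for illustration only; they are not taken from the client's site or from any particular scanning tool.

```python
def disallowed_paths(robots_txt):
    """Return every non-empty Disallow value found in robots.txt text."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical robots.txt, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /admin-backup/   # the kind of entry a scanner would flag
"""

if __name__ == "__main__":
    for path in disallowed_paths(SAMPLE_ROBOTS_TXT):
        print(path)
```

The file is plain text served to anyone who asks for it, so whatever is listed there is only as safe as the access controls on the directories themselves.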

robots.txt is not security - it simply keeps robots away from certain areas of your site (at least, those robots that comply with the directive). To keep humans away, you need to implement actual file and directory level permissions.

Having said that, implying that the mere existence of a robots.txt file is a security risk is going too far. It's only a security risk if you list secret folders in it. If you have a robots.txt file that just says "allow robots" then there is no security risk at all. Additionally, if your robots.txt disallows areas like the cgi-bin and images folder, there are unlikely to be any issues, since you'd have to store your secret stuff in those folders for it to matter.

In short, the security assessment is correct as far as it goes, but IMO unless they have actually looked at the robots.txt file and found directories that should not be in there, then the warning seems to me to be more of a marketing ploy than an honest assessment (kind of like a sleazy SEO warning you that your website can't be found in some directories, omitting the fact that the directories are full of spam or owned by the SEO themselves).

A good security check does check the robots.txt file. But the existence of a robots.txt file by itself is NOT a security threat. It's the actual content that might be an issue.

Finally, I could list every secret directory in my file right up front without any problems, as long as I had file level security on them. The warning is only valid for unprotected directories, at which point I'd say that you have more serious problems than a robots.txt file.

A classic case is the robots.txt file for the US White House: http://www.whitehouse.gov/robots.txt

You'll see that it's huge and lists all sorts of juicy, apparently hacker-happy information about the site structure. But if you look closer, it basically only disallows indexing of the text-only version of webpages (preventing duplicate content) and a few other minor directories. Those same directories are open to the public and are not a security risk, because they are not dumb enough to put top-secret documents on the public website. I imagine that this robots.txt file would be flagged with code-red urgent paranoid warnings by this same security company, but no doubt the warnings would get nothing but a laugh by Whitehouse security.

Yes, it should be checked during a security audit. No, it's not usually a problem. Certainly not if it's a "standard" robots.txt file. It may only be an issue if someone got confused and tried to implement security with a robots.txt file, which is very rare. Like I said, they are not wrong for checking, but they should check further before sounding out a "sky is falling"-type warning.

I hope that helps.

Ian






#3 nethy

nethy

    HR 6

  • Active Members
  • 974 posts

Posted 18 September 2007 - 08:08 PM

Am I misunderstanding the issue?

My understanding is:

robots.txt may be used to prevent access to certain files.

This is not a good way of preventing access to these files. The contrary is true: listing the files here may tip off the hackers.

Therefore, do not use it as a security measure.

Is there more to it?

#4 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 18 September 2007 - 08:27 PM

I'm not sure if you were summarizing the points that were trying to be made in the first post, or if that's your understanding, nethy.

As Ian said, robots.txt is not designed to be a security measure. Period, full stop.

What it is designed to do is tell compliant spiders which files or folders to ignore. Nothing more, nothing less.

If you have something that needs to be secure you need to password protect it. Once you do that it doesn't matter if you list it in your robots.txt to be disallowed or not. They won't be able to get to those files/folders regardless.
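As a rough illustration of what "password protect it" means on the server side, here is a minimal Python sketch of an HTTP Basic authentication check. It is only an assumption about one way to do it: the `is_authorized` function name and the example credentials are hypothetical, and on a real site this job is normally handled by the web server or framework configuration rather than hand-written code.

```python
import base64
import hmac


def is_authorized(auth_header, expected_user, expected_password):
    """Return True only if the header carries the expected Basic credentials."""
    if not auth_header or not auth_header.startswith("Basic "):
        return False
    try:
        raw = base64.b64decode(auth_header[len("Basic "):], validate=True)
        decoded = raw.decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # compare_digest avoids leaking information through comparison timing.
    user_ok = hmac.compare_digest(user.encode(), expected_user.encode())
    pass_ok = hmac.compare_digest(password.encode(), expected_password.encode())
    return user_ok and pass_ok


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    header = "Basic " + base64.b64encode(b"alice:s3cret").decode("ascii")
    print(is_authorized(header, "alice", "s3cret"))
    print(is_authorized(None, "alice", "s3cret"))
```

With a check like this in front of a folder, a request without valid credentials is refused whether or not the folder appears in robots.txt.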

I agree with Ian: unless there's something major missing from the original statement, it's not a valid response. Smacks of a scare tactic to me, when there are lots of other things one should be much more scared about.

#5 maleman

maleman

    HR 6

  • Active Members
  • 677 posts

Posted 18 September 2007 - 08:33 PM

Great explanation Ian!

QUOTE
robots.txt may be used to prevent access to certain files


robots.txt is for robots, which include search engines and other programs that follow the rules. It's there to tell legitimate crawlers not to index specific areas of the site.

It doesn't "block" access to folders or files. It merely says stay out and don't index.
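Here is a small Python sketch of that "stay out" behaviour, using the standard library's `urllib.robotparser`. The robots.txt contents, the `ExampleBot` user agent, and the `is_allowed` helper are all hypothetical and only for illustration. The point is that the check runs on the crawler's side: a polite robot asks the question and honours the answer, while nothing on the server stops a client that never asks.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, path):
    """Ask, as a compliant crawler would, whether this path may be fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/index.html", "/private/report.html"):
        print(path, is_allowed(ROBOTS_TXT, "ExampleBot", path))
```

A browser or a badly behaved bot simply skips this step and requests the URL directly, which is why the folder itself still needs real access control.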

Like Ian explained, if you want to keep certain folders or files from being accessed by unauthorized users, some means of security must be in place to grant access permission to authorized users and keep everybody else out.

If you have the security properly set on restricted folders/files, robots and other unauthorized users will be denied access to those folders/files. Then you don't have to list those sensitive areas in robots.txt to keep search engines out and therefore hackers or whomever won't see those sensitive areas listed in robots.txt. And remember to put an index page on the root of secure directories.

Whew! I'm glad I typed that instead of having to do it orally. That would certainly be a mouthful!

Hey Randy. Looks like you posted while I was typing. Sorry if I repeated what you already said.

Edited by maleman, 18 September 2007 - 08:40 PM.


#6 nethy

nethy

    HR 6

  • Active Members
  • 974 posts

Posted 18 September 2007 - 10:02 PM

QUOTE
I'm not sure if you were summarizing the points that were trying to be made in the first post, or if that's your understanding, nethy.


Not summarising really, just making sure I'm not missing a point.

The only security risk associated with robots.txt is using it for security. Is this the conclusion?

#7 mcanerin

mcanerin

    HR 7

  • Active Members
  • 2,242 posts
  • Location:Calgary, Alberta, Canada

Posted 19 September 2007 - 02:34 AM

Yes :)

Ian

#8 MaKa

MaKa

    HR 6

  • Active Members
  • 856 posts
  • Location:Llantwit Major, Wales, UK

Posted 19 September 2007 - 06:15 AM

QUOTE(nethy @ Sep 19 2007, 04:02 AM)
The only security risk associated with robots.txt is using it for security.


Nicely summarised :)




