Will Not Having A Robots.txt Affect My Site?
Posted 27 July 2009 - 04:58 PM
Google's Webmaster Guidelines recommend having a robots.txt file. Everything I have been taught over the last 10 years says that a robots.txt benefits your site: it helps Google crawl your site, tells crawlers which pages not to crawl, etc.
My CEO/CTO came to me today to ask about it. Apparently there is a credit card security standard called PCI. We had a scan of our technology, and our use of a robots.txt file was deemed a low-priority threat, because to the scanners it's like a site map that someone could follow, combined with a packet sniffer or other techniques, to exploit our security.
He asked me about the possibility of using in-page meta tags instead, but I know that bots can get around those, and there can be issues if they are not implemented properly. (For that matter, you can have problems if your robots.txt is not implemented properly either.)
I am concerned that removing it will impact the ranking of our site. It's pretty well optimized and it's our major lead generation engine for the company. I don't have a great feeling about removing the robots.txt file. He told me it's not really a requirement, but he wanted me to get more information.
Any help or info appreciated. DJKay
Posted 27 July 2009 - 05:08 PM
If you disallow certain directories in the robots file, then yes, anyone can view that file, and if they so choose they can attempt to view those directories/files (or hack at them if there's some type of security). For PCI compliance I never keep sensitive data on the web server or accessible from the web server. It's bad practice if you do.
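To see why a Disallow line is only a polite request, here is a small Python sketch using the standard library's robots.txt parser (the paths are hypothetical). The same rules that keep a well-behaved crawler out of a directory also advertise that directory to anyone who fetches /robots.txt:

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt of the kind a PCI scan flags: the file is
# world-readable, so every Disallow line doubles as a signpost to the
# very directories you wanted kept quiet.
rules = """\
User-agent: *
Disallow: /admin/
Disallow: /checkout/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)
rp.modified()  # mark the rules as loaded so can_fetch() evaluates them

# Polite crawlers honor the rules...
print(rp.can_fetch("Googlebot", "/admin/"))     # False
# ...but nothing technically stops a person (or an impolite bot)
# from simply requesting /admin/ directly.
print(rp.can_fetch("Googlebot", "/products/"))  # True
```

The only thing enforcing a Disallow is the crawler's good manners, which is why sensitive areas need real access control, not a robots.txt entry.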
Posted 27 July 2009 - 07:00 PM
And I'll add that you really shouldn't be placing anything in your robots.txt that you actually want to keep bots and people from seeing. Those areas need to be password protected. And no sensitive customer information should be housed anywhere that is potentially web accessible. That's best practice. If you're required to store things like payment info on your server, make sure it's encrypted with a secure key.
Now, all of that said, as an early adopter of PCI standards I have to say it's nice to see someone else getting on the PCI bandwagon. Too bad the credit card companies haven't pushed it harder by closing, or threatening to close, merchant accounts for those who have completely ignored PCI. There are still loads of very questionable practices out there in the wild, most of which could be cured by simply forcing people to adhere to PCI.
Posted 28 July 2009 - 12:01 AM
Posted 28 July 2009 - 11:03 AM
So, from reading the PCI report, it seems the problem is the folders holding the SaaS installations of our product, because those are the ones that could potentially be hacked as a route to the e-commerce/credit card information. But I don't even understand how they could get to that, because all of the credit card info is under a secure setup. But those hackers are nasty folks, so hey, what do I know. It must just be that the robots.txt is like a site map they can follow.
Anyway folks, what about guarding against duplicate content? Am I going to get into trouble there? DJKay
Posted 28 July 2009 - 11:05 AM
So, if you are saying I should have something, it's 2 against, 1 for. DJKay
Posted 28 July 2009 - 11:32 AM
Having a robots.txt is just fine. Using one to avoid duplicates, or to keep spiders away from certain areas you do not want indexed, is fine. The caveat is that you don't want to use robots.txt to try to hide anything that might be sensitive customer data from anything or anybody, hackers included. robots.txt is the first place hackers look for possible exploits.
That's the reason having a robots.txt throws up the PCI warning: too many people do in fact use robots.txt in an attempt to hide sensitive data, instead of making sure the areas that contain sensitive data require a secure login.
The way you've described using robots.txt, you're completely okay. That's 100% legitimate usage, and what robots.txt was designed for. That you're using it correctly won't suppress the PCI warning, however, because too many people use it incorrectly and haven't a clue that they're exposing their customer data to anybody and everybody who bothers to look.
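To make that legitimate usage concrete, a duplicate-content-only robots.txt might look something like this (all directory names here are hypothetical examples, not your actual paths):

```
User-agent: *
# Printer-friendly copies that duplicate the normal pages
Disallow: /print/
# Internal search results, which duplicate category pages
Disallow: /search/
```

Nothing in a file like this points at customer data; it only tells spiders which redundant copies of public pages to skip.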
Posted 28 July 2009 - 12:30 PM
We could use in-page meta tags, but from what I understand those are not as effective against spiders.
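For reference, the in-page alternative is a robots meta tag in each page's head section. Like robots.txt, it is advisory: well-behaved crawlers honor it, but it enforces nothing.

```html
<!-- Ask crawlers not to index this page or follow its links -->
<meta name="robots" content="noindex, nofollow">
```

The upside over robots.txt is that it lists nothing in a central, world-readable file; the downside is that it only works per HTML page and the page must actually be fetched before the tag is seen.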
Any thoughts, suggests, please weigh in at any time. DJKay