I am not a web designer or SEO. When looking at my web stats for the first time, I noticed that the number of 404 errors was three times the number of visitors to my site, which, as I understand from reading this forum, is not important. This is the explanation from the host:
"All of the 404 errors are for the robots.txt file. This is a special file that you can create for your account to give instructions to the crawling search engine bots. When a bot crawls your page it first checks if you have a robots.txt file. Since you do not have such a file it generates a 404 error message and this is completely normal. Please note that this is not a problem at all. The web site is indexed correctly by the search engine bots. They do not need it to be present."
Could this mean the bots are only allowed access to my home page? Will they automatically crawl all pages? Is there a way to confirm that bots aren't being denied access to subpages? Am I being paranoid?
Thanks in advance. Your forum is wonderful!
Web Stats Show Lots Of 404 Errors
Started by chili, Jan 30 2007 02:42 PM
2 replies to this topic
#1
Posted 30 January 2007 - 02:42 PM
#2
Posted 30 January 2007 - 02:55 PM
Welcome to HR, chili.
As the robots.txt is an exclusion protocol, the lack of a robots.txt file is taken as implied permission to crawl & index everything that they can find links to.
If you want to get rid of the errors, just upload a blank file named robots.txt to the root of your site. Compliant bots treat an empty file the same way as a missing one.
#3
Posted 31 January 2007 - 12:27 AM
I am not a web designer or SEO. When looking at my web stats for the first time, I noticed that the number of 404 errors was three times the number of visitors to my site, which, as I understand from reading this forum, is not important.
If there is no robots.txt file, the spiders will consider any page on the site to be OK to fetch.
I would recommend creating a robots.txt file, even if you don't mind where the spiders travel within your site. The reason is that the 404 errors generated by requests for this file can bury a more important missing file in your stats.
Errors on the server _should_ be monitored and corrected. That way, when a new error crops up (for example, if an important page on the site is deleted or renamed by mistake, or if a link on a newly minted page contains a typo), you will notice and correct it more quickly.
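To illustrate the point about noise, here is a rough Python sketch that tallies 404s per path from an access log and then sets the robots.txt hits aside. It assumes the host writes logs in the common Apache-style format; the sample lines, IP addresses, and the misspelled /prodcuts.html path are all made up for the example, and your host's stats package may present this differently.

```python
import re
from collections import Counter

# Pulls the requested path and the status code out of a Common Log Format line.
# Assumption: the log uses the usual '"METHOD /path PROTOCOL" STATUS' layout.
LOG_LINE = re.compile(r'"(?:GET|HEAD|POST) (?P<path>\S+)[^"]*" (?P<status>\d{3})')

def count_404s(lines):
    """Return a Counter mapping each requested path to its number of 404 responses."""
    missing = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            missing[match.group("path")] += 1
    return missing

# Hypothetical log lines, invented for illustration only.
SAMPLE_LOG = [
    '192.0.2.1 - - [30/Jan/2007:14:42:00 +0000] "GET /robots.txt HTTP/1.1" 404 209',
    '192.0.2.1 - - [30/Jan/2007:14:42:01 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '192.0.2.2 - - [30/Jan/2007:14:50:10 +0000] "GET /robots.txt HTTP/1.1" 404 209',
    '192.0.2.3 - - [30/Jan/2007:15:01:33 +0000] "GET /prodcuts.html HTTP/1.1" 404 209',
]

missing = count_404s(SAMPLE_LOG)
# The 404s that remain once the robots.txt requests are set aside.
other = {path: hits for path, hits in missing.items() if path != "/robots.txt"}

print(missing.most_common())
print(other)
```

In this made-up sample the robots.txt requests outnumber the one 404 that actually points at a broken link, which is the situation where a real problem is easy to overlook.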
You can learn more about the standard by reading the Standard for Robot Exclusion.
The short answer is to make explicit what is currently implied: namely that all robots are welcome anywhere on the site. You could accomplish that by adding the following to the robots.txt file in the domain root:
# all robots welcome
User-agent: *
Disallow:
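If you want to confirm that rules like these do not block subpages, one option is to run them through a parser. The sketch below uses Python's standard urllib.robotparser module as a stand-in for a compliant bot; real crawlers may differ in the details. The example.com URL and the allows helper are placeholders made up for the example.

```python
from urllib.robotparser import RobotFileParser

def allows(robots_txt, url, agent="*"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# The rules suggested above: an empty Disallow value excludes nothing.
WELCOME_ALL = """\
# all robots welcome
User-agent: *
Disallow:
"""

# A blank file, as suggested earlier in the thread.
BLANK = ""

# For contrast: this variant shuts compliant robots out of the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical subpage URL, used only for illustration.
subpage = "http://www.example.com/products/page.html"

print(allows(WELCOME_ALL, subpage))
print(allows(BLANK, subpage))
print(allows(BLOCK_ALL, subpage))
```

The first two checks should come back allowed and the third blocked. The only difference between the welcoming rules and the blocking ones is the slash after Disallow:, so it is worth double-checking that line before uploading.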