High Rankings Search Engine Optimization Forum | High Rankings Advisor Search Marketing Newsletter

Too Many Hits For My Custom 404 Page, should I be concerned?
johking
post Apr 24 2008, 10:15 AM
Post #1




I notice that my custom 404 page seems to be getting a load of hits. I have checked all the links and found no problems.

I tried downloading the logs and it appears that it is bots that are visiting but I can't work out why they are getting to that page.

Have I set it up wrong?

In the .htaccess:
CODE
ErrorDocument 404 http://www.mysite.com/custom404.htm


Thanks

Jo

This post has been edited by Randy: Apr 24 2008, 10:53 AM
Reason for edit: Added code tags.
Randy
post Apr 24 2008, 10:54 AM
Post #2





The ErrorDocument instruction looks fine.

You'd need to look deeper. Do your log files show any referring URL for the 404 errors? And are these the major bots hitting a bad spot, or simply one of the many, many spam bots out there that are up to no good?
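
One way to answer those questions from a downloaded access log is to tally the referrers and user agents for every request that lands on the error page. This is only a rough sketch: it assumes Apache's combined log format and the /custom404.htm path from the ErrorDocument line, and both the sample log lines and the `tally_error_page_hits` helper are made up for illustration.

```python
import re
from collections import Counter

# Apache "combined" log format is assumed here:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def tally_error_page_hits(lines, error_page="/custom404.htm"):
    """Count referrers and user agents for requests to the error page."""
    referers, agents = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None or match.group("path") != error_page:
            continue  # skip unparseable lines and requests for other URLs
        referers[match.group("referer")] += 1
        agents[match.group("agent")] += 1
    return referers, agents


# Hypothetical sample lines, not taken from the real logs.
SAMPLE = [
    '192.0.2.1 - - [24/Apr/2008:10:00:00 +0100] "GET /custom404.htm HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '192.0.2.1 - - [24/Apr/2008:10:00:05 +0100] "GET /custom404.htm HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '192.0.2.7 - - [24/Apr/2008:10:01:00 +0100] "GET /index.htm HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

referers, agents = tally_error_page_hits(SAMPLE)
print("Referrers:", referers.most_common(5))
print("User agents:", agents.most_common(5))
```

A referrer of "-" means the client sent no Referer header, which is typical of bots that request a URL directly rather than following a link.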
johking
post Apr 24 2008, 12:16 PM
Post #3







Hi Randy!

mlbot - mean anything?

Jo
Randy
post Apr 24 2008, 06:49 PM
Post #4





It's the one getting the 404?

MLbot is a relatively new one that is supposed to be robots.txt friendly, though I've honestly not seen it in my logs much. I did see it once or twice and read the info page it lists in the server logs. If memory serves, that page said it is a spider that's trying to index media, not web pages. I haven't tried blocking it via robots.txt, since I don't have any media files, so there wasn't really anything for it to index in the first place.

I'd try excluding that one right from robots.txt, which should get rid of the 404 hits too.
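
For reference, a robots.txt exclusion for a single bot is a User-agent line followed by "Disallow: /". The sketch below checks such a rule with Python's standard `urllib.robotparser` before it goes live. The "MLBot" token is an assumption, so check the exact name on the bot's info page, and the `is_allowed` helper is hypothetical. The rule only helps if the bot really does honor robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; "MLBot" is a guess at the bot's user-agent token.
ROBOTS_TXT = """\
User-agent: MLBot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if user_agent may fetch path under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# The excluded bot should be blocked; bots with no matching rule stay allowed.
for agent in ("MLBot", "Googlebot"):
    print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, "/custom404.htm"))
```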
MaKa
post Apr 25 2008, 11:03 AM
Post #5





It's probably not the problem, but have you verified that your 404 page actually returns a 404 code?
projectphp
post Apr 25 2008, 10:22 PM
Post #6





AFAIK, starting the ErrorDocument target with http:// makes Apache redirect the visitor to that URL (with a 302, AFAIK), so the original 404 status never reaches the client. I THINK, but you'll need to test, that
CODE
ErrorDocument 404 /custom404.htm

works better. See http://httpd.apache.org/docs/1.3/mod/core.html#errordocument
johking
post Apr 27 2008, 05:46 AM
Post #7





Aha, that is very interesting. I will definitely take out the absolute URL then, but before I do...

I have had another look at the logs and am very puzzled by something.

I found a dodgy pdf that was no longer there which was triggering a few hits to www.mysite.com/custom404.htm

BUT also there are a load of these (which must be bumping up the stats):
GET /custom-404.htm
which is returning a 200 code

Does that mean that somewhere I have a link to that file that the bots are finding? I just have the custom-404 file in the root, but as far as I can see there are no links to it other than the absolute URL within the .htaccess.

Thanks for all your help so far

Jo
Jill
post Apr 27 2008, 09:37 AM
Post #8





Your 404 page may not be returning a 404 status in the response header, but a 200 OK one instead. You'll definitely want to make sure it's returning an actual 404 so that it doesn't get indexed (under multiple URLs, no less!).
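
If you'd rather check the header yourself than install a tool, something like the following could do it. It is only a sketch: it uses Python's standard `http.client`, which does not follow redirects, so a 302 shows up as a 302. The `fetch_status` and `interpret` helpers and the made-up page name in the usage comment are hypothetical, and some servers answer HEAD requests differently from GET.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Request url once, without following redirects.

    Returns (status_code, location_header_or_None).
    """
    parts = urlsplit(url)
    conn_class = (http.client.HTTPSConnection if parts.scheme == "https"
                  else http.client.HTTPConnection)
    conn = conn_class(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def interpret(status, location=None):
    """Describe what a status code means for a URL that should NOT exist."""
    if status == 404:
        return "OK: the server reports 404 Not Found"
    if status in (301, 302, 303, 307, 308):
        return f"Problem: redirect ({status}) to {location}"
    if status == 200:
        return "Problem: 200 OK (the missing page looks like a real one)"
    return f"Unexpected status {status}"


# Example usage (needs network access; the page name is deliberately made up):
#   status, location = fetch_status("http://www.mysite.com/no-such-page-12345.htm")
#   print(interpret(status, location))
```

Note that requesting the error page by its own address returns 200 on any server, because that file really exists. The test that matters is a made-up URL, which should come back as a 404 with no redirect.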
projectphp
post Apr 27 2008, 08:44 PM
Post #9





Get Webbug for that.
johking
post Apr 28 2008, 03:57 PM
Post #10





You are dead right - it is returning a 200. Not only that, a made-up URL (i.e. a page that does not exist) is returning a 302!

Time to get on to the hosting company?

Jo

Jill
post Apr 28 2008, 03:59 PM
Post #11





QUOTE
Time to get on to the hosting company?


Yes, if they have control of your 404 error page, and .htaccess file, etc.
Randy
post Apr 28 2008, 04:30 PM
Post #12





Did you change the ErrorDocument instruction in your .htaccess, in case yours is one of those servers that automatically delivers a 302 if the full URL is given?

If not, I'd try that first, following the example given by projectphp above. Then re-test a non-existent URL.

If you still get a 302, it'll be time to get on to the host. They may have something in the virtual host configs that we can neither see nor change.

  