A Google Mistake? Can't Be
Posted 16 April 2009 - 12:17 PM
I've recently had Google Webmaster Tools report a bunch of 404 errors for pages that do not exist on my site.
I'm wondering if anyone has ever seen this before.
They are database-generated URLs, and Google seems to have the filenames wrong.
They're telling me that
does not exist.
And I say... well of course, Mr. Don't-be-evil, why would it, when the file is called:
The file has always been file_name.php.
If I take any one of the error URLs and put the underscore back in, the page works fine.
I know you can remove URLs, but I have 250+ of these things, and Webmaster Tools
doesn't seem to have an easy way to input them individually, only directories or entire sites, etc.
I've rooted through as many nooks and crannies of the site as I can, and I can't find
this file they are referring to. My sitemap.xml doesn't have it either.
Any ideas? Will these eventually just go away in the 404 not found list?
Posted 16 April 2009 - 12:42 PM
Google got those URLs somewhere. If you can resolve that issue, the 404 problem will go away.
If there are inbound links from other sites causing the problem, you should set up 301 redirects to send search engines and visitors to the appropriate real pages.
Many 404 errors occur because someone embedded a broken link on their site. Sometimes you can (and should, if it's not too much trouble) get the other website to fix the link. The rest of the time, you just have to live with 301 redirects.
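Since hand-writing 250+ redirects would be tedious, the mapping could be generated mechanically. A minimal Python sketch, assuming each bad URL is simply the real filename with its underscores stripped (the filenames here are made up for illustration):

```python
# Hypothetical sketch: generate one Apache-style 301 redirect per phantom
# URL, assuming the bad URL is the real filename with underscores removed.

real_files = ["file_name.php", "other_page_name.php"]  # hypothetical list

def redirect_lines(files):
    """Return 'Redirect 301 /bad /real' lines for filenames containing underscores."""
    lines = []
    for real in files:
        bad = real.replace("_", "")
        if bad != real:  # skip files without underscores; nothing to redirect
            lines.append(f"Redirect 301 /{bad} /{real}")
    return lines

for line in redirect_lines(real_files):
    print(line)
```

The output lines could then be pasted into an .htaccess file (on Apache with mod_alias enabled).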
Posted 18 April 2009 - 08:01 AM
Actually, they sometimes do. I've seen them put stupid search words into search boxes (especially on WordPress sites) and create all kinds of pages on websites.
Perhaps it's something like that?
In which case, you'll want to exclude via robots.txt the types of URLs that would be created via your search box.
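As an illustration, if the site's search box produces WordPress-style `?s=` query URLs (an assumption; the real pattern depends on the site), the robots.txt exclusion might look like:

```
User-agent: *
Disallow: /*?s=
```

Googlebot treats `*` as a wildcard here, though wildcards are a non-standard extension to robots.txt and not every crawler honors them.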
Posted 24 April 2009 - 10:19 PM
Now a techie question to add to it.
I'm up to 1,000+ pages it can't find because of the missing underscores.
So is there a way to exclude these in robots.txt with a one-liner?
If it can't find:
can I put in robots.txt
or some kind of syntax like that?
Then again, Google says they're not indexed anyway, because it couldn't find them...
Fine, but they're showing up in the Web Crawl not-found errors, and I'd rather not
have them in there and need to flip through a bunch of pages to find the real issues.
Will a robots.txt command to exclude them get them off of Google's books, so to speak,
so they won't show up in the not-found error list?
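For what it's worth, Googlebot does support `*` and `$` wildcards in robots.txt as a non-standard extension, so a one-liner is only possible if the phantom URLs share some distinguishing prefix or pattern; a rule cannot match on a *missing* underscore. A hypothetical sketch, assuming the bad URLs all fall under one directory that holds nothing worth crawling (the path is an assumption):

```
User-agent: Googlebot
Disallow: /some-generated-dir/
```

Blocked URLs should eventually stop appearing in the not-found list, though Webmaster Tools may then report them as restricted by robots.txt instead.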