Sep 17 2009, 10:30 AM
Post #1
HR 2 | Group: Active Members | Posts: 33 | Joined: 30-April 08 | Member No.: 20,756
In Google Webmaster Tools I have noticed a lot of 404s found by Google. They are all pages with a URL like this:

www.thisisourdomain.com/hiptop/parse.php?url=http://www.thisisourdomain.com/our-subpage/

hiptop/parse.php is definitely not a file that exists on our website or has existed before, at least not that I know of. I was wondering whether this could be some kind of blackhat tactic used by a competitor to create pages on our website with duplicate content or something like that? I think the pages were indexed by Google at some point, because now they result in a 404. I can't see where the pages were (once, I guess) linked from, because for "linked from" Google Webmaster Tools indicates "not available".

This is not the only weird thing I have noticed in Google Webmaster Tools lately; a lot of similar weird 404 URLs have been popping up. All of these were once visitable pages because our website was not properly structured. Our .htaccess file allowed made-up URLs within some subsections of our website, which would be processed as news articles. We have now installed Joomla and migrated all our content to the new CMS, which means that the weird URLs that were once accessible now result in a 404. I think this explains why Google now comes up with the 404s: Google once visited the pages (wherever they were linked from) and now can't find them anymore. However, I don't know who linked to these pages. For some 404s, Google Webmaster Tools indicated that they were linked from other weird internal pages which now also result in 404s; for others it indicates "not available", just like for the "hiptop" pages.

Do you guys also think this is some kind of blackhat tactic used by someone trying to ruin my site? And what is the best way to deal with this? Currently, we implement 301 redirects for all 404s Google Webmaster Tools comes up with; most of the time we redirect to the relevant subsection of our website (if applicable), or else to the homepage.

I look forward to hearing your thoughts!
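
The redirect policy described in this post (nearest relevant subsection if there is one, otherwise the homepage) could be sketched in Python roughly as follows. This is only an illustration of the decision rule: the function name and the section paths are hypothetical and not taken from the actual site, and the real redirects would still be issued by the web server or CMS.

```python
def redirect_target(dead_path, sections, home="/"):
    """Pick a 301 target for a dead path.

    `sections` is a hypothetical list of live subsection paths.
    The longest matching prefix wins; with no match, fall back to `home`.
    """
    # Check longer prefixes first so "/news/archive/" beats "/news/".
    for prefix in sorted(sections, key=len, reverse=True):
        if dead_path.startswith(prefix):
            return prefix
    return home


if __name__ == "__main__":
    live_sections = ["/news/", "/news/archive/", "/products/"]  # invented examples
    for path in ["/news/archive/made-up-article", "/hiptop/parse.php"]:
        print(path, "->", redirect_target(path, live_sections))
```

A path such as /hiptop/parse.php matches no subsection under these assumptions, so it falls through to the homepage.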

Sep 17 2009, 01:00 PM
Post #2
HR 4 | Group: Active Members | Posts: 238 | Joined: 6-April 07 | Member No.: 16,798
This reminds me of something that happened to me.

Sep 17 2009, 04:21 PM
Post #3
Convert Me! | Group: Admin | Posts: 17,540 | Joined: 17-August 03 | Member No.: 551
Most likely it's nothing but referrer spam, where they're hoping the referral shows up in your stats and you're crazy enough to click through to their often malware-infected site.

Worst case scenario, they get a free hit by sending out their little bot to spam your stats. Best case scenario, they get to infect your PC with their malware, probably with a keystroke logger built into it, and gain yet another zombie PC for them to use. Their only problem is that you're seeing it in Google WMT rather than in your stats, so you don't see any referrer information.

Sep 17 2009, 04:42 PM
Post #4
HR 9 | Group: Active Members | Posts: 3,942 | Joined: 5-April 05 | From: Seattle, WA | Member No.: 7,091
If the URL is structured YOURDOMAIN + something + YOURDOMAIN (which in your example it seems to be), that is usually an indication of a broken internal URL reference (such as the problem SERPico had a couple of years ago with a URL rewrite).

The fact that the pages don't exist and return a 404 code rules out the possibility of a black hat spammer trying to use your site for links (unless he is really incompetent). Randy's suggestion that they might be trying to entice you into looking at a malware site with referrer spam seems unlikely to me. You wrote:

QUOTE
This is not the only weird thing I have noticed in Google Webmaster Tools lately; a lot of similar weird 404 URLs have been popping up. All of these were once visitable pages because our website was not properly structured. Our .htaccess file allowed made-up URLs within some subsections of our website, which would be processed as news articles.

You're rewriting things through .htaccess (or were). That is very probably where the issue arose. You could try redirecting the 404 URLs now and see what happens.

I think it's interesting that Google cannot report where it found the URLs. That suggests to me that your change may have resolved the problem, but they still have these broken URLs in their database.
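
The YOURDOMAIN + something + YOURDOMAIN pattern can be checked mechanically across a list of reported 404 URLs. Below is a minimal Python sketch, assuming the URLs are available as plain strings; the function names are hypothetical, and it only looks for the site's own host inside query parameters, which is the shape shown in the original post.

```python
from urllib.parse import parse_qsl, urlsplit


def _with_scheme(url):
    """Prepend http:// when the URL was reported without a scheme."""
    head = url.split("?", 1)[0]
    return url if "://" in head else "http://" + url


def embedded_self_references(url):
    """Return (parameter, value) pairs whose value is a URL on the same host."""
    outer = urlsplit(_with_scheme(url))
    hits = []
    for name, value in parse_qsl(outer.query):
        inner = urlsplit(value)
        # hostname is None when the value is not an absolute URL.
        if inner.hostname and inner.hostname == outer.hostname:
            hits.append((name, value))
    return hits


if __name__ == "__main__":
    reported = (
        "www.thisisourdomain.com/hiptop/parse.php"
        "?url=http://www.thisisourdomain.com/our-subpage/"
    )
    print(embedded_self_references(reported))
```

URLs that come back with a non-empty list are candidates for a broken internal reference or rewrite rule rather than an external link.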

Sep 23 2009, 02:52 PM
Post #5
HR 3 | Group: Active Members | Posts: 94 | Joined: 28-July 04 | From: Dallas, Texas | Member No.: 4,484
QUOTE
Most likely it's nothing but referrer spam, where they're hoping the referral shows up in your stats and you're crazy enough to click through to their often malware-infected site.

Randy, how do they do that? I have one site that has a bunch of records in the referrer reports (in the server's own reporting metrics; they don't use Google Analytics) that are from a bunch of porn sites. Obviously we don't have any links on any porn sites, so I've never understood how these get into my server logs. It sounds like a clear case of "referrer spam" to me, although this is the first I've ever heard of it (I've been out of things for a while).

How does a bot write something to my log with a referral from a site that nobody actually came from, and more importantly, do you know of a way for me to prevent that?

Thanks,
Tom

Sep 23 2009, 03:34 PM
Post #6
Convert Me! | Group: Admin | Posts: 17,540 | Joined: 17-August 03 | Member No.: 551
It's usually pretty simple, Tom.

They simply set up a little PHP (or whatever) page that accepts a string that ends up being a URL. Then they use a database of known domains and send each one as the string. The PHP (or whatever) page then directs the browser to the page in question, showing the PHP page or another page entirely as the referrer.

There are lots of other similar things out there. Some of them are not even referrer spam, but many are. For instance, I have a few sites that continually try to load images that don't exist and never have existed on a couple of my sites. Those don't even show up in my regular logs or stats, but in my error logs. After a while these idiots get on my last nerve enough that I set up a special .htaccess rule that tells them to f' off.
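
Because the Referer header is supplied by whoever makes the request, a bot can put any URL in it, which is how such entries end up in the logs. A rough Python sketch for spotting them afterwards follows. It assumes the server writes Apache's combined log format and that you maintain your own list of unwanted referrer hosts; the function names, hosts, and sample line are invented for illustration. It only flags entries and does not block anything.

```python
import re
from urllib.parse import urlsplit

# Apache "combined" log format: ip, identity, user, [time], "request",
# status, size, "referer", "user-agent".
COMBINED = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def referer_host(line):
    """Return the referrer's host name, or None if absent or unparsable."""
    match = COMBINED.match(line)
    if not match:
        return None
    referer = match.group("referer")
    if referer in ("", "-"):
        return None
    return urlsplit(referer).hostname


def is_spam_referral(line, blocked_hosts):
    """True when the referrer host is a blocked host or a subdomain of one."""
    host = referer_host(line)
    if host is None:
        return False
    return any(host == b or host.endswith("." + b) for b in blocked_hosts)


if __name__ == "__main__":
    sample = (
        '203.0.113.9 - - [23/Sep/2009:14:52:00 -0500] "GET / HTTP/1.1" '
        '200 5120 "http://spam.example/page" "Mozilla/5.0"'
    )
    print(is_spam_referral(sample, {"spam.example"}))
```

Matching on the host rather than the full URL keeps the list short, since spam referrers tend to vary the path while reusing the domain.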