Google Rank Extractor
Posted 26 August 2009 - 09:06 AM
Why it's happening has to do with how the code breaks down the parts and pieces of the referral string (it explodes on the ampersand character) and the order in which things take place. I'm 90% sure Google encodes &'s as "%26" but didn't have a chance in the early going to make sure they used that every time.
Assuming they do it the same way every time, on every version of Google, I've got a couple of ideas to overcome the issue via some simple character replacement. I'll need to find some time to do a bit of testing to sort it out, but it shouldn't be too difficult or cause any code bloat.
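To illustrate the ordering problem, here is a minimal sketch in JavaScript (not the actual GRE code, which runs server side). It assumes Google sends a literal ampersand in the search phrase as "%26". The helper name parseReferrerQuery and the sample referrer are made up for the example. The point is to split on "&" first and decode each piece afterwards, so an encoded ampersand never gets mistaken for a separator.

```javascript
// Hypothetical helper, for illustration only.
// Split the query string on "&" BEFORE decoding, so an encoded
// ampersand (%26) inside the search phrase survives the split.
function parseReferrerQuery(referrer) {
  var params = {};
  var qIndex = referrer.indexOf("?");
  if (qIndex === -1) {
    return params; // no query string at all
  }
  // Drop any "#fragment" that follows the query string.
  var query = referrer.slice(qIndex + 1).split("#")[0];
  var pairs = query.split("&");
  for (var i = 0; i < pairs.length; i++) {
    if (pairs[i] === "") {
      continue; // skip empty pieces such as "a=1&&b=2"
    }
    var eq = pairs[i].indexOf("=");
    var rawKey = eq === -1 ? pairs[i] : pairs[i].slice(0, eq);
    var rawValue = eq === -1 ? "" : pairs[i].slice(eq + 1);
    // "+" stands for a space in a form-encoded query string.
    // Note: decodeURIComponent throws URIError on malformed input.
    params[decodeURIComponent(rawKey)] =
      decodeURIComponent(rawValue.replace(/\+/g, " "));
  }
  return params;
}

// Made-up sample referrer with an encoded ampersand in the phrase.
var sampleReferrer =
  "http://www.google.com/search?hl=en&q=fish+%26+chips&start=10";
var parsed = parseReferrerQuery(sampleReferrer);
console.log(parsed.q); // fish & chips
```

If the decode happened before the split, the same phrase would break into "fish " and " chips", which is the kind of failure described above.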
Posted 26 August 2009 - 10:07 AM
I'll have a play with this at some point but it's a bit hectic for me this weekend.
Wedding anniversary tomorrow, and I travel up north to my cousin's surprise BBQ party for her 30th this weekend, so that's out of the question.
Hmm, I'm assuming G! URL encodes the search term, otherwise their own query string will break.
If so, the PERL version might be OK. Let me know what you think Randy: will the ampersand always be URL encoded by Google if it's in the user's search phrase?
It would be a poor show of G! if they don't. That's just standard URL encoding; you can't not encode ampersands and have a query string work!
Edit-> Just did a quick test. The ampersand in my query was correctly stored in the db, and when used in the admin phrase search it worked fine and returned the expected record.
As long as G! does URL encode ampersands in search phrases, the PERL version shouldn't break.
I'll do more vigorous testing when I can and liaise with Randy as to his findings.
Edited by 1dmf, 26 August 2009 - 10:45 AM.
Posted 17 September 2009 - 01:58 AM
Thanks to Randy for building a nice utility. We are using it on our 2 ecommerce sites and getting good results with Firefox users. The referrer URL is giving problems with other browsers, however. I am in contact with Randy and our team is working to improve it for our needs, but still no luck at the moment with browsers other than Firefox... and IE still dominates the market, as far as I have seen in my analytics account.
Also, we have developed a .NET version of Randy's code and are testing it with our sites. Overall it's a great utility. Thanks again Randy.
Posted 17 September 2009 - 05:09 AM
I am running it on two of my sites and haven't come across any referrer issues. What exactly is the problem you are having?
Posted 17 September 2009 - 07:13 AM
IE's failure to handle encodeURIComponent() has been a known issue for years. The document.referrer absolutely needs to be encoded before it gets sent through the ajax call. So the choices are either to create a special, lengthy workaround for older versions of IE, or simply let them fail and check twice for the document.referrer data. I chose the latter option. Especially since IE7 and IE8 fixed their little bug.
Long story short, I don't code for Mesozoic era browsers. Especially not when it's not possible to get all information on every hit in the first place. If I was mean, before I coded a workaround for the GRE tool I'd be far more inclined to include a bit of JS in there to pop up an alert box to tell those IE 5 and 6 users that they really need to upgrade their severely outdated, incredibly buggy and most definitely insecure browser.
Posted 17 September 2009 - 08:21 AM
Doing a bit of research I found this link: http://xkr.us/articl...encode-compare/
It implies that encodeURIComponent() is the preferred method of URI encoding, but if it is limited in which browsers it's compatible with, perhaps it isn't the best to use for cross browser compatibility.
This would then raise the issue over what detrimental effect using the lesser escape() might have on the extractor.
I use escape for most of my AJAX and Extranet coding and so far it has served me well and not caused me any problems, but that doesn't mean it won't if used on the Extractor tool.
I'll alter the code on my two production sites which are running it and see what happens.
Of course if you know of specific issues this may cause Randy, your expert input would be very much appreciated.
Posted 17 September 2009 - 04:02 PM
Let's begin with the fact that the three possible choices --escape(), encodeURI(), and encodeURIComponent()-- all encode different characters and do so at different levels.
I won't even mention that escape() has been deprecated since ECMAScript v3. Let's just leave it at escape() won't work for the type of user generated input the tool is dealing with.
encodeURI() kind of goes to the other extreme. It assumes the URI is a complete URI, thus does not encode reserved characters that have a special meaning in a URI. Sounds like a perfect choice then, doesn't it? Well, it's not. encodeURI() by itself isn't something you can use for proper http GET or POST requests, like those typically used with ajax for xmlhttprequests. Why? Because it leaves reserved characters like "?", "+" and "=" unencoded, all of which are going to appear in pretty much every single call sent to the GRE ajax routine. If you want to see truly wacky results try this one sometime, because you'll end up with two query strings and the high possibility that variables may end up getting split.
When you're dealing with user defined input it can get very, very nasty very, very quickly. So encodeURI() isn't the right tool for the job either.
So following this logic encodeURIComponent() is really the only choice, not just the best. It encodes exactly the stuff we need to be encoded and doesn't encode the stuff that we don't need or want to be encoded. The only downside is that older versions of IE had a bug --sort of like they're trying to use the escape() method instead of a real encodeURIComponent() method.
Bottom line, escape() should not be used to encode http URI's. encodeURI() shouldn't be used if you're going to send the resulting string through an http GET or POST request, including ajax's xmlhttprequest. encodeURIComponent() is the only tool for the job. It's just too bad IE didn't get their implementation of it right for several years when they were the only game in town. Especially since they said they supported it beginning with IE 5.5.
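A quick side-by-side makes the difference easier to see. This is only a sketch: the sample phrase, the sample referrer, and the "/gre-log.php" endpoint with its "ref" parameter are invented for illustration and are not taken from the GRE code.

```javascript
// Made-up user input containing reserved characters.
var phrase = "fish & chips?size=large+salt";

var viaEscape = escape(phrase);
var viaEncodeURI = encodeURI(phrase);
var viaComponent = encodeURIComponent(phrase);

console.log(viaEscape);    // fish%20%26%20chips%3Fsize%3Dlarge+salt
console.log(viaEncodeURI); // fish%20&%20chips?size=large+salt
console.log(viaComponent); // fish%20%26%20chips%3Fsize%3Dlarge%2Bsalt

// Non-ASCII input ("\u00e9" is an e with an acute accent):
// escape() emits a single Latin-1 style code, while
// encodeURIComponent() emits the UTF-8 bytes.
console.log(escape("\u00e9"));             // %E9
console.log(encodeURIComponent("\u00e9")); // %C3%A9

// Building the ajax URL. The endpoint and parameter name are
// hypothetical; in a browser the value would be document.referrer.
var googleReferrer = "http://www.google.com/search?q=fish+%26+chips&hl=en";
var ajaxUrl = "/gre-log.php?ref=" + encodeURIComponent(googleReferrer);
console.log(ajaxUrl);
```

With encodeURIComponent() the whole referrer travels as one value. With encodeURI() the "?" and "&" inside the referrer would pass through untouched, which is where the second query string and the split variables come from. escape() leaves "+" alone and handles non-ASCII characters differently.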
Posted 17 September 2009 - 05:58 PM
I know my use of escape() is 'invalid' (who, me, never!), but in the places where I use it, it's perfect for the job and was the only one we had originally to use. Old habits die hard I guess.
And I know encodeURIComponent() is the best choice, if supported. I just hoped an alternative encoding method could have been the lesser of two evils.
All my experience has been limited to English and standard ASCII character sets, so 'funny' characters have never been an issue before, I'm glad you are here and have the experience to draw upon.
I'll still keep mine running with the change. It's in test, so it will be interesting to see if I notice any weirdness going on.
Posted 17 September 2009 - 08:40 PM
I only wish that I knew half as much about JS as I did at one time. Back before server side scripting I was a pretty good JS code hack. I'm too out of practice with it these days, which is too bad since Ajax stuff would be a lot easier for me if I'd kept it up all those years ago. Took me days to track this one down because I happened to have the good fortune of having a couple of alpha testers getting some really strange results because of the character sets being used.
Those who code in JS all the time knew both about the escape() issue (that's what was in the first alpha version) and also the older IE issues with encodeURIComponent(). They got to ...errmm... had to explain it to me.
Posted 18 September 2009 - 06:36 AM
There is just no other way round the problem, well unless of course we roll some crazy piece of code trying to handle every possibility under the sun, which really isn't practical and could cause more problems than it may solve. Hence you not bothering, eh?
Why is everything such a good idea until you try to put it all together? Man, I'm fed up of Microsoft's square pegs to everyone else's round holes!
I guess the old saying is still true....
You can please all the people some of the time, and some of the people all the time.... etc..
It fits with browsers too
Posted 18 September 2009 - 08:45 AM
Nail -> Head. That's exactly how I ended up saying to heck with older versions of IE.
From the practical coding perspective, what you'd have to do with IE5.5 through IE6 is set up browser sniffing to find those, then run encodeURI(), then do some additional hand rolled string replacement for the characters you know need to be encoded. So the same thing encodeURIComponent() does, without actually calling this built in function. And then pray you didn't miss any characters that need to get encoded before being sent to the ajax routine! Long story short, it's more hassle than it's worth. Especially as those older versions of IE fall farther into the sunset.
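For anyone curious what that workaround would roughly look like, here is a sketch. It is not code from the GRE tool. The function names and the useFallback flag are invented, and the actual browser sniffing for IE 5.5/6 is left out. It assumes encodeURI() itself behaves correctly in the target browser.

```javascript
// The reserved characters that encodeURI() leaves alone but
// encodeURIComponent() percent-encodes.
var RESERVED_ESCAPES = {
  ";": "%3B", "/": "%2F", "?": "%3F", ":": "%3A", "@": "%40", "&": "%26",
  "=": "%3D", "+": "%2B", "$": "%24", ",": "%2C", "#": "%23"
};

// Hypothetical fallback: encodeURI() plus hand rolled replacement.
function encodeComponentFallback(value) {
  return encodeURI(value).replace(/[;\/?:@&=+$,#]/g, function (ch) {
    return RESERVED_ESCAPES[ch];
  });
}

// Hypothetical wrapper: the caller decides (for example via browser
// sniffing, not shown here) whether the fallback is needed.
function encodeForAjax(value, useFallback) {
  return useFallback
    ? encodeComponentFallback(value)
    : encodeURIComponent(value);
}

var fallbackSample = "fish & chips?size=large+salt";
console.log(encodeForAjax(fallbackSample, true));
// fish%20%26%20chips%3Fsize%3Dlarge%2Bsalt
console.log(
  encodeForAjax(fallbackSample, true) === encodeForAjax(fallbackSample, false)
); // true
```

Miss one entry in that replacement table and the ajax call quietly breaks for that character, which is the "pray you didn't miss any" part.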
As an aside, the next major upgrade I have on my radar screen is to add some IP tracking and a cookie setting routine into the mix, then set up a small post-sale routine to tie conversions/sales back to the original hit and the keyword phrase used in the original hit as recorded by GRE.
I'm personally doing this sort of look back manually because the conversion testing/tracking solution I use records all the way back to the original hit. And I'm seeing some interesting phrases that absolutely kill on the conversion side of things, but which I don't always rank #1 for, even though they're 2nd or 3rd level phrases. I can typically spend 15 minutes or less tweaking a page so that it ranks #1 or #2 and use the GRE data to drive higher conversions and more sales.
Posted 18 September 2009 - 09:48 AM
Sounds like a plan to me!
It will certainly give some valuable info on conversions, and drop off.
One thing I noticed I have done vs your version is the additional stuff on the 'version' data.
I've chopped anything off past the part of the Google Version and then added (local) or (global) to the string.
What is that extra stuff? is it relating to adwords/adsense clicks?
I saw it as kinda making the 'Google Version' data a bit messy, but wondered if this is really useful info. Perhaps we need an additional column for it?
Your help understanding the data and where it comes from would be real handy.
A couple of examples I have in my data are..
Posted 19 September 2009 - 10:23 AM
As to what some of the codes mean, I believe "cse" stands for Custom Search Engine in Google lingo. I've not tested it specifically, but that's what comes to mind. Of course CSE's can be used on any site. So it could be a custom search engine on the target site itself, or it could also be one on an Adsense site too I suppose.
The variable to look for to see if it's a regional or global search for non-google.com searches is the variable named "cr".
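A rough sketch of that check follows. It assumes a non-empty "cr" value in the referral string means a regional (local) search, which is my reading and not something Google documents here. The function name and the sample referrers are made up, and it relies on the URL class found in modern runtimes.

```javascript
// Hypothetical helper: label the Google version as (local) or (global)
// based on whether the referral string carries a "cr" variable.
// Assumption: a non-empty "cr" indicates a regional search.
function labelGoogleVersion(referrer) {
  var url = new URL(referrer);
  var cr = url.searchParams.get("cr"); // null when the variable is absent
  var scope = cr ? "local" : "global";
  return url.hostname + " (" + scope + ")";
}

// Made-up sample referrers for illustration.
console.log(
  labelGoogleVersion("http://www.google.co.uk/search?q=widgets&cr=countryUK")
); // www.google.co.uk (local)
console.log(
  labelGoogleVersion("http://www.google.co.uk/search?q=widgets")
); // www.google.co.uk (global)
```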
Hopefully Google will do us all a favor and offer a roadmap to what the various variables mean once they've settled on what will and will not appear in the referral string. Hey, we can hope anyway!
Posted 20 September 2009 - 07:52 AM
Well, like you say, there is always hope. As long as it isn't foolhardiness.
I got the 'cr' bit for working out 'global' vs 'local'. I just couldn't work out the whole bunch of other stuff, as it varies so much. It makes the version 'selector' look ugly and creates way too many 'versions'. If it is Adsense etc., you would still want to know which G! search was used and whether it was set to local search, but with a separate section for the G! HTTP_REFERER as it were, so we could give info such as Adsense search, or perhaps even the website hosting the Adsense. But if we get too complicated, we'll end up writing our own Analytics program. Is that your ultimate goal?
Posted 20 September 2009 - 09:29 AM
I don't have a goal with it. Just noticed it so decided to see what could be extracted. But if it gets too deep it destroys the thought of keeping it open source. A person would have to start charging just to justify development and support costs. Plus as a general rule I don't feel it's right to charge for something that depends so heavily on what someone else does or how they do it. That's a recipe for disaster IMHO.