Google Rank Extractor


98 replies to this topic

#16 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 26 August 2009 - 09:06 AM

Without even looking at the code I know why it's doing that, piskie, and in fact I already knew in the back of my head it would be an issue.

Why it's happening has to do with how the code breaks down the parts and pieces of the referral string (it explodes on the ampersand character) and the order in which things take place. I'm 90% sure Google encodes &'s as "%26" but didn't have a chance in the early going to make sure they used that every time.

Assuming they do it the same way every time, on every version of Google, I've got a couple of ideas to overcome the issue via some simple character replacement. I'll need to find some time to do a bit of testing to sort it out, but it shouldn't be too difficult or cause any code bloat.
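
A minimal sketch of the ordering issue described above, assuming Google does send a literal ampersand in the search phrase as "%26". This is not the GRE tool's actual code: the function names and the example referrer (including its "cd" parameter) are made up for illustration. The point is simply to split on "&" first and decode each piece afterwards, so an encoded ampersand never gets mistaken for a separator.

```javascript
// Hypothetical helpers, not taken from the GRE source.

// Decode one name or value. "+" means a space in a query string.
function decodePart(text) {
  try {
    return decodeURIComponent(text.replace(/\+/g, " "));
  } catch (err) {
    // Malformed escape sequence: hand back the raw text rather than throwing.
    return text;
  }
}

// Split the query string on "&" FIRST, then decode each name and value.
function parseReferrerQuery(referrer) {
  const params = {};
  const queryStart = referrer.indexOf("?");
  if (queryStart === -1) {
    return params;
  }
  // Drop any "#fragment" before splitting.
  const query = referrer.slice(queryStart + 1).split("#")[0];
  for (const pair of query.split("&")) {
    if (pair === "") {
      continue;
    }
    const eq = pair.indexOf("=");
    const rawName = eq === -1 ? pair : pair.slice(0, eq);
    const rawValue = eq === -1 ? "" : pair.slice(eq + 1);
    params[decodePart(rawName)] = decodePart(rawValue);
  }
  return params;
}

// Illustrative referrer only; the parameters are assumptions.
const exampleReferrer = "http://www.google.com/search?q=fish+%26+chips&cd=3";
const exampleParams = parseReferrerQuery(exampleReferrer);
console.log(exampleParams.q);  // the phrase with its ampersand intact
console.log(exampleParams.cd);
```

Doing it the other way round (decoding the whole string and then splitting) turns the "%26" back into "&" before the split, which is exactly how a phrase gets chopped in two.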

#17 1dmf

1dmf

    Keep Asking, Keep Questioning, Keep Learning

  • Active Members
  • 2,160 posts
  • Location:Worthing - England

Posted 26 August 2009 - 10:07 AM

You will notice that as I have 'escaped' the admin form vars, it won't break when doing the DB query.

I'll have a play with this at some point but it's a bit hectic for me this weekend.

Wedding anniversary tomorrow, and I travel up north to my cousin's surprise BBQ party for her 30th this weekend, so that's out of the question.

Hmm, I'm assuming G! URL encodes the search term, otherwise their own query string would break.

If so, the PERL version might be OK. Let me know what you think Randy: will the ampersand always be URL encoded by Google if it's in the user's search phrase?

It would be a poor show of G! if they don't, that's just standard URL encoding; you can't not encode ampersands and have a query string work!

Edit-> Just did a quick test: the ampersand in my query was correctly stored in the db, and when used in the admin phrase search it worked fine and returned the expected record.

As long as G! does URL encode ampersands in search phrases, the PERL version shouldn't break.

I'll do more vigorous testing when I can and liaise with Randy as to his findings.


Edited by 1dmf, 26 August 2009 - 10:45 AM.


#18 kashyap_rajput

kashyap_rajput

    HR 4

  • Active Members
  • 127 posts
  • Location:India

Posted 17 September 2009 - 01:58 AM

Hello

Thanks to Randy for building a nice utility. We are using it on our 2 ecommerce sites and getting good results with Firefox users, but the referer URL is giving problems with other browsers. I am in contact with Randy and our team is working to improve it for our needs, but still no luck at the moment with browsers other than Firefox, and IE still dominates the market as far as I have seen in my analytics account.

Also, we have developed a .NET version of Randy's code and are testing it with our sites. Overall it's a great utility. Thanks again Randy.

regards
Kashyap

#19 1dmf

1dmf

    Keep Asking, Keep Questioning, Keep Learning

  • Active Members
  • 2,160 posts
  • Location:Worthing - England

Posted 17 September 2009 - 05:09 AM

Don't forget to let me know what URL referrer problems you are having so I can ensure the PERL version is kept up-to-date and error free.

I am running it on two of my sites and haven't come across any referrer issues, what exactly is the problem you are having?

#20 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 17 September 2009 - 07:13 AM

Kashyap hit a known issue actually, 1dmf. It's one I decided not to code a workaround for.

The issue revolves around the fact that older versions of IE (IE6 and earlier) have never handled Javascript's encodeURIComponent function correctly. We're talking about browsers that are at least 2 years out of date at this point, browsers that really need to be upgraded. And coding around them for encodeURIComponent is a difficult and lengthy process, so I just let those db additions fail silently, since I knew that nowhere near every Google referred hit was carrying the new ranking position information, so plenty weren't being recorded anyway.

IE's failure to handle encodeURIComponent has been a known issue for years. The document.referrer absolutely needs to be encoded before it gets sent through the ajax call. So the choices are either to create a special, lengthy workaround for older versions of IE, or simply let them fail and check twice for the document.referrer data. I chose the latter option. Especially since IE7 and IE8 fixed their little bug.
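
A rough sketch of the "encode it, and let it fail silently" idea, not the GRE tool's actual code. The endpoint path, the "ref" parameter name and the function name are all assumptions made up for this example. In a browser you would pass in document.referrer and hand the result to the ajax call; if nothing usable comes back, the hit just doesn't get recorded.

```javascript
// Hypothetical helper: build the URL for the ajax call, or return null
// so the caller can skip the request without raising an error.
function buildRecordUrl(endpoint, referrer) {
  // No referrer at all: nothing to record.
  if (!referrer) {
    return null;
  }
  try {
    // The whole referrer travels as ONE query string value,
    // so every reserved character in it has to be encoded.
    return endpoint + "?ref=" + encodeURIComponent(referrer);
  } catch (err) {
    // encodeURIComponent throws URIError on malformed input
    // (e.g. a lone surrogate). Fail silently.
    return null;
  }
}

// Usage with an illustrative referrer; "/gre/record.php" is a made-up path.
const recordUrl = buildRecordUrl(
  "/gre/record.php",
  "http://www.google.com/search?q=a&cd=1"
);
console.log(recordUrl);
```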

Long story short, I don't code for Mesozoic era browsers. Especially not when it's not possible to capture all the information on every hit in the first place. If I was mean, before I coded a workaround for the GRE tool I'd be far more inclined to include a bit of JS in there to pop up an alert box telling those IE 5 and 6 users that they really need to upgrade their severely outdated, incredibly buggy and most definitely insecure browser.

#21 1dmf

1dmf

    Keep Asking, Keep Questioning, Keep Learning

  • Active Members
  • 2,160 posts
  • Location:Worthing - England

Posted 17 September 2009 - 08:21 AM

Would using escape() not resolve the issue, as escape is URI encoding?

Doing a bit of research I found this link http://xkr.us/articl...encode-compare/

It implies that encodeURIComponent() is the preferred method of URI encoding, but if it is limited in which browsers it's compatible with, perhaps it isn't the best to use for cross browser compatibility.

This would then raise the issue of what detrimental effect using the lesser escape() might have on the extractor.

I use escape for most of my AJAX and Extranet coding and so far it has served me well and not caused me any problems, but that doesn't mean it won't if used on the Extractor tool.

I'll alter the code on my two production sites which are running it and see what happens.

Of course if you know of specific issues this may cause Randy, your expert input would be very much appreciated.

#22 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 17 September 2009 - 04:02 PM

Yes, there is a method to the madness in the choice I made to use encodeURIComponent, 1dmf. It gets a bit complicated though.

Let's begin with the fact that the three possible choices --escape(), encodeURI(), and encodeURIComponent()-- all encode different characters and do so at different levels.

escape() was developed to encode Javascript itself, not URIs, so one could say that it's not the best function for the job. It will encode characters that should not be encoded, so it is not only overkill but will also give you extremely questionable results for foreign letters or any other non-ASCII characters. As a general rule escape() doesn't turn non-ASCII characters into the UTF-8 percent-encoding that URIs use: it emits a single Latin-1 byte such as "%D6", or the non-standard "%uXXXX" form for anything above U+00FF, and whatever decodes the string on the other end will typically drop or mangle those. In other words, you lose data if you try to send non-ASCII characters through escape(). Try escape encoding something like Ö, for example. (That maps to the &Ouml; character entity, the "&#214;" numeric HTML entity, or U+00D6 in Unicode.) It can simply disappear when it's decoded on the other end, even though it's a completely valid letter that very well may show up in a search on some versions of Google.

I won't even mention that escape() has been deprecated since ECMAScript v3. Let's just leave it at escape() won't work for the type of user generated input the tool is dealing with.
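
A quick way to see the escape() problem for yourself. This is only a sketch: tryUtf8Decode is a made-up helper that stands in for a receiving end which expects UTF-8 percent-encoding, using decodeURIComponent as the decoder. A real server-side decoder may drop or mangle the character rather than raise an error.

```javascript
// Hypothetical stand-in for a receiver that expects UTF-8 percent-encoding.
// Returns the decoded text, or null if the input is not valid.
function tryUtf8Decode(encoded) {
  try {
    return decodeURIComponent(encoded);
  } catch (err) {
    return null; // URIError: malformed sequence
  }
}

const letter = "\u00D6"; // Ö

const viaEscape = escape(letter);                // "%D6"    (one Latin-1 byte)
const viaComponent = encodeURIComponent(letter); // "%C3%96" (two UTF-8 bytes)

console.log(viaEscape, "->", tryUtf8Decode(viaEscape));       // decode fails: null
console.log(viaComponent, "->", tryUtf8Decode(viaComponent)); // round-trips to Ö

// Anything above U+00FF comes out of escape() in the non-standard %uXXXX form.
console.log(escape("\u20AC")); // "%u20AC"
```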

encodeURI() kind of goes to the other extreme. It assumes the string is a complete URI, and thus does not encode reserved characters that have a special meaning in a URI. Sounds like a perfect choice then, doesn't it? Well, it's not. encodeURI() by itself isn't something you can use for proper http GET or POST requests, like the ones typically made with ajax's XMLHttpRequest. Why? Because encodeURI() doesn't encode the characters that are reserved in URIs, it leaves characters like "?", "&", "+" and "=" alone. All of which are going to appear in pretty much every single call sent to the GRE ajax routine. If you want to see truly wacky results try this one sometime, because you'll end up with two query strings and the high possibility that variables may end up getting split.

When you're dealing with user defined input it can get very, very nasty very, very quickly. So encodeURI() isn't the right tool for the job either.
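
Here is a small sketch of that splitting, under a few assumptions: the "ref" parameter name and the example referrer are invented, and URLSearchParams stands in for whatever parses the query string on the server. It shows what the receiving end would see when a whole referrer is passed as a single value.

```javascript
// Illustrative referrer only; not real GRE data.
const googleReferrer = "http://www.google.com/search?q=fish+%26+chips&cd=3";

// Pretend to be the receiving script: parse "ref=<encoded value>"
// and list every name/value pair that arrives.
function receivedParams(encodedValue) {
  const query = "ref=" + encodedValue;
  return Array.from(new URLSearchParams(query).entries());
}

// encodeURI() leaves "?", "&" and "=" alone, so the referrer's own
// parameters leak out and arrive as separate variables.
const viaEncodeURI = receivedParams(encodeURI(googleReferrer));
console.log(viaEncodeURI.length); // 2: "ref" plus a stray "cd"

// encodeURIComponent() encodes them, so the referrer arrives whole.
const viaComponent = receivedParams(encodeURIComponent(googleReferrer));
console.log(viaComponent.length);                   // 1
console.log(viaComponent[0][1] === googleReferrer); // true
```

Note that in the encodeURI() case the value of "ref" is also truncated at the first "&", so the ranking data after it never reaches the right variable.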

So following this logic encodeURIComponent() is really the only choice, not just the best. It encodes exactly the stuff we need to be encoded and doesn't encode the stuff that we don't need or want to be encoded. The only downside is that older versions of IE had a bug --sort of like they're trying to use the escape() method instead of a real encodeURIComponent() method.

Bottom line, escape() should not be used to encode http URIs. encodeURI() shouldn't be used if you're going to send the resulting string through an http GET or POST request, including ajax's XMLHttpRequest. encodeURIComponent() is the only tool for the job. It's just too bad IE didn't get their implementation of it right for several years when they were the only game in town. Especially since they said they supported it beginning with IE 5.5.


#23 1dmf

1dmf

    Keep Asking, Keep Questioning, Keep Learning

  • Active Members
  • 2,160 posts
  • Location:Worthing - England

Posted 17 September 2009 - 05:58 PM

I knew there was a simple explanation.

I know my use of escape() is 'invalid' (who, me, never!), but in the places where I use it, it's perfect for the job and was the only one we originally had to use. Old habits die hard I guess.

And I know encodeURIComponent() is the best choice, if supported. I just hoped an alternative encoding method could have been the lesser of two evils.

All my experience has been limited to English and standard ASCII character sets, so 'funny' characters have never been an issue before. I'm glad you are here and have the experience to draw upon.

I'll still keep mine running with the change, it's in test, so it will be interesting to see if I notice any weirdness going on.

#24 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 17 September 2009 - 08:40 PM

As you said, old habits die hard. I have way too many of them myself.

I only wish that I knew half as much about JS as I did at one time. Back before server side scripting I was a pretty good JS code hack. I'm too out of practice with it these days, which is too bad since Ajax stuff would be a lot easier for me if I'd kept it up all those years ago. Took me days to track this one down, and only because I happened to have the good fortune of having a couple of alpha testers getting some really strange results because of the character sets being used.

Those who code in JS all the time knew both about the escape() issue (that's what was in the first alpha version) and also the older IE issues with encodeURIComponent(). They got to ...errmm... had to explain it to me.

#25 1dmf

1dmf

    Keep Asking, Keep Questioning, Keep Learning

  • Active Members
  • 2,160 posts
  • Location:Worthing - England

Posted 18 September 2009 - 06:36 AM

Dang, if you already ran it with escape() in alpha, my test is pointless. Hey ho, looks like older browsers get missed.

There is just no other way round the problem, well, unless of course we roll some crazy piece of code trying to handle every possibility under the sun, which really isn't practical and could cause more problems than it may solve. Hence you not bothering, eh?

Why is everything such a good idea until you try to put it all together? Man, I'm fed up of Microsoft's square pegs to everyone else's round holes!

I guess the old saying is still true....

You can please all the people some of the time, and some of the people all the time.... etc..

It fits with browsers too.

#26 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 18 September 2009 - 08:45 AM

Nail -> Head. That's exactly how I ended up saying to heck with older versions of IE.

From the practical coding perspective, what you'd have to do with IE5.5 through IE6 is set up browser sniffing to find those, then run encodeURI(), then do some additional hand rolled string replacement for the characters you know need to be encoded. So the same thing encodeURIComponent() does, without actually calling this built in function. And then pray you didn't miss any characters that need to get encoded before being sent to the ajax routine! Long story short, it's more hassle than it's worth. Especially as those older versions of IE fall farther into the sunset.
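
For the curious, a sketch of what that hand rolled replacement might look like. It is not code from the GRE tool, the names are invented, and it has not been tried on the old IE versions in question, so treat it as an illustration of the idea only. The character list is the set that encodeURI() leaves alone but encodeURIComponent() encodes in a modern engine. The browser sniffing part is left out.

```javascript
// Hypothetical fallback: encodeURI() plus manual replacement of the
// reserved characters that encodeURI() deliberately leaves untouched.
const LEFT_ALONE_BY_ENCODE_URI = /[;\/?:@&=+$,#]/g;

function encodeComponentFallback(text) {
  return encodeURI(text).replace(LEFT_ALONE_BY_ENCODE_URI, function (ch) {
    // Every character in the list is ASCII, so one two-digit hex escape each.
    return "%" + ch.charCodeAt(0).toString(16).toUpperCase();
  });
}

// Compare against the built in function on an illustrative referrer.
const sampleReferrer = "http://www.google.co.uk/search?q=fish+%26+chips&cd=3";
console.log(encodeComponentFallback(sampleReferrer) === encodeURIComponent(sampleReferrer));
```

Miss a single character from that list and the two outputs stop matching, which is the "pray you didn't miss any" part.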

As an aside, the next major upgrade I have on my radar screen is to add some IP tracking and a cookie setting routine into the mix, then set up a small post-sale routine to tie conversions/sales back to the original hit and the keyword phrase used in the original hit as recorded by GRE.

I'm personally doing this sort of look back manually because the conversion testing/tracking solution I use records all the way back to the original hit. And I'm seeing some interesting phrases that absolutely kill on the conversion side of things, but which I don't always rank #1 for, even though they're 2nd or 3rd level phrases. I can typically spend 15 minutes or less tweaking a page so that it ranks #1 or #2 and use the GRE data to drive higher conversions and more sales.

#27 1dmf

1dmf

    Keep Asking, Keep Questioning, Keep Learning

  • Active Members
  • 2,160 posts
  • Location:Worthing - England

Posted 18 September 2009 - 09:48 AM

-> IE6.0 not liking it is a real pain though. That's not technically obsolete. IE <6, OK, but 6 being affected is a real bummer!


QUOTE
As an aside, the next major upgrade I have on my radar screen is to add in some IP tracking and a cookie setting routine into the mix, then setting up a small post sale-routine to tie conversions/sales back to the original hit and the keyword phrase used in the original hit as recorded by GRE.

Sounds like a plan to me!

It will certainly give some valuable info on conversions and drop off.

One thing I noticed I have done differently from your version is the handling of the additional stuff in the 'version' data.

I've chopped off anything past the Google Version part and then added (local) or (global) to the string.

What is that extra stuff? Is it relating to adwords/adsense clicks?

I saw it as kinda making the 'Google Version' data a bit messy, but wondered if this is really useful info. Perhaps we need an additional column for it?

Your help understanding the data and where it comes from would be real handy.

A couple of examples I have in my data are..

QUOTE
co.uk/url (Global)
co.uk/cse (Global)


#28 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 19 September 2009 - 10:23 AM

They've been moving stuff around a bit so I've been a bit afraid to chop off any data until they settle on a specific string order. Though that does seem to be getting more consistent lately.

As to what some of the codes mean, I believe "cse" stands for Custom Search Engine in Google lingo. I've not tested it specifically, but that's what comes to mind. Of course CSE's can be used on any site. So it could be a custom search engine on the target site itself, or it could also be one on an Adsense site too I suppose.

The variable to look for to see if it's a regional or global search for non-google.com searches is the variable named "cr".
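
A loose sketch of how a version label like the ones quoted above could be built. It is not the PERL or PHP code from the tool. It assumes a non-empty "cr" variable in the referrer means a regional (local) search and anything else is global, and the function name and example referrers are invented for illustration.

```javascript
// Hypothetical helper: turn a Google referrer into a label such as
// "co.uk/url (Global)". Returns null for anything that is not a Google URL.
function describeGoogleVersion(referrer) {
  let url;
  try {
    url = new URL(referrer);
  } catch (err) {
    return null; // not a parseable absolute URL
  }
  // Keep whatever follows "google." in the host name, e.g. "co.uk" or "com".
  const match = url.hostname.match(/(?:^|\.)google\.(.+)$/);
  if (!match) {
    return null;
  }
  // Assumption: a non-empty "cr" variable marks a regional search.
  const scope = url.searchParams.get("cr") ? "Local" : "Global";
  return match[1] + url.pathname + " (" + scope + ")";
}

console.log(describeGoogleVersion("http://www.google.co.uk/url?q=widgets&cd=2"));
console.log(describeGoogleVersion("http://www.google.co.uk/search?q=widgets&cr=countryUK&cd=2"));
```

Everything after the path is ignored here, which is the "chop it off" approach; keeping the extra variables in a separate column would be the alternative.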

Hopefully Google will do us all a favor and offer a roadmap to what the various variables mean once they've settled on what will and will not appear in the referral string. Hey, we can hope anyway!



#29 1dmf

1dmf

    Keep Asking, Keep Questioning, Keep Learning

  • Active Members
  • 2,160 posts
  • Location:Worthing - England

Posted 20 September 2009 - 07:52 AM

Wow, now there's an optimist if ever I saw one.

Well like you say, there is always hope. As long as it isn't foolhardiness.

I got the 'cr' bit for working out 'global' vs 'local', I just couldn't work out the whole bunch of other stuff as it varies so much. It makes the version 'selector' look ugly, as well as giving way too many 'versions'. If it is adsense etc., then you would still want to know which relative G! search was applied and whether it was set to local search, but with a separate section for the G! HTTP_REFERER as it were, so we could give info such as Adsense search, or perhaps even the website hosting the adsense. But if we get too complicated, we'll end up writing our own Analytics program. Is that your ultimate goal?

#30 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 20 September 2009 - 09:29 AM

QUOTE
But if we get too complicated, we'll end up writing our own Analytics program. Is that your ultimate Goal?


Definitely No!

I don't have a goal with it. Just noticed it so decided to see what could be extracted. But if it gets too deep it destroys the thought of keeping it open source. A person would have to start charging just to justify development and support costs. Plus as a general rule I don't feel it's right to charge for something that depends so heavily on what someone else does or how they do it. That's a recipe for disaster IMHO.



