Google Rank Extractor


98 replies to this topic

#76 Jill

Jill

    Recovering SEO

  • Admin
  • 32,983 posts

Posted 22 November 2009 - 09:45 AM

And did you understand how to set up the database tables? 1dmf posted something to me a few pages back here which worked for me. You need to have phpMyAdmin on your server (well, at least that's how I did it).

Where are you stuck?

#77 piskie

piskie

    HR 7

  • Active Members
  • 1,098 posts
  • Location:Cornwall

Posted 22 November 2009 - 11:40 AM

Yes Jill, I get many searches listed as just "B" which are mainly "B & B somewhere or other"

#78 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 22 November 2009 - 12:31 PM

Piskie: Yes, it's something I plan on finding a fix for. It would be an easy one if the format for the referral url stayed the same so one could perform a look ahead, but it doesn't. The string is slightly different in each browser/OS combo and sometimes different from one datacenter to the next. So a solid solution will take a bit of time, mostly in testing different approaches. Long story short, it's more difficult than it would at first appear. So the attempt is going to have to wait until after things calm down a bit for me. The projects that pay the bills get priority yanno.

If you're familiar with php and want to try something yourself to deal with that & character, one way to do that is to perform a bit of string replacement. You'd want to do that in the rank_extractor_ajax.php file, and do it prior to the line (around line 58) that says $ref = stripslashes(nl2br($_GET['ref'])); The nl2br bit is what is causing the problem, since it turns the url encoded ampersand character (it's %26 for reference) into an html ampersand. It's there and needed for other reasons. However, what it does later triggers the search string chunking in a bad place.
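To illustrate why a decoded ampersand makes the search string get chunked in a bad place, here is a minimal Python sketch. It is not the GRE PHP code: the referrer URL and both function names are made up for illustration, and it only assumes that the query arrives as q=... in the referrer. Splitting on & before decoding keeps "B & B" intact, while decoding first cuts the phrase at the ampersand.

```python
from typing import Optional
from urllib.parse import parse_qs, unquote_plus, urlsplit


def search_phrase(referrer: str) -> Optional[str]:
    """Split the still-encoded query on '&' first, then decode each value."""
    query = urlsplit(referrer).query  # percent-encoding is still in place here
    values = parse_qs(query, keep_blank_values=True).get("q")
    return values[0] if values else None


def naive_search_phrase(referrer: str) -> Optional[str]:
    """Decode first, then split on '&': a %26 inside the phrase becomes a separator."""
    decoded = unquote_plus(urlsplit(referrer).query)
    pairs = dict(part.split("=", 1) for part in decoded.split("&") if "=" in part)
    return pairs.get("q")


# Hypothetical referrer for a search on "B & B cornwall".
ref = "http://www.google.com/search?hl=en&q=B+%26+B+cornwall&aq=f"
print(repr(search_phrase(ref)))        # 'B & B cornwall'
print(repr(naive_search_phrase(ref)))  # 'B '
```

The second result is the kind of bare "B" entry described above. Whatever form the fix takes in PHP, the idea would be the same: keep the ampersand encoded until after the referrer has been broken into its parameters.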

To provide a fix for all possibilities will require more than a bit of time. However I'll try to find some time over the next few days to see if I can come up with a workable hack specifically for the ampersand character for you. It's the one that's going to cause the most heartache without a doubt. In fact, I'll try a little something, something right now since I have an hour or so this morning to play in the code sandbox.

Tomsk: The basic steps for a default install are...

1. Create a MySQL database as normal.

2. Open the gredb.php file. Put connection information for the database you created in there. While you're there also change the $req_pos_data to "false" instead of "true".

3. Upload all of the files in the GRE zip file to your server, keeping the same structure.

4. Add the following line to every page you want GRE to be activated on:
CODE
<script language="javascript" src="/gre_ajax.js"></script>


5. Sit back and let it start collecting data. You can check what it's doing by browsing to the address of www.yourdomain.com/gre/


An extra step that would be wise to do is to password protect the new /gre/ subdirectory that was created when you uploaded the files. Not strictly necessary, but you probably won't want your competitors to have access to the data. Most hosting control panels these days provide a way to password protect directories.


#79 Jill

Jill

    Recovering SEO

  • Admin
  • 32,983 posts

Posted 22 November 2009 - 12:38 PM

I'm attempting to rewrite the readme directions since Randy knows what he means by steps 1-5 above, but they're not as obvious to those who don't install things on their server very often (if at all).

#80 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 22 November 2009 - 12:43 PM

Another quick update. Working with the same two files as before.

This one incorporates the nrp toggle update from yesterday, and also will print to the screen the total number of referrals, number of nrp (No Ranking Position) and rp (Ranking Position) hits, as well as putting it into a percentage for easier consumption.

Three things to note:

First, the NRP data only gets collected if you change the default setting in the gredb.php file of $req_pos_data = "true" to "false". This is a hard and fast requirement.

Second, the NRP stuff and stats will only show up in the results if you choose No from the Exclude NRP Hits drop down. Default setting is Yes, so to see the percentage of referrals with the ranking position data you'll want to toggle it to No.

Third, the RP/NRP stats analyze the dataset and date range you've chosen for each run of the tool. Thus if you were not collecting NRP hits before and include these in your date range it's going to skew the number. To get an accurate percentage make sure you're using only a date range after the $req_pos_data was set to false.

In checking on my test sites for this update they're showing roughly 16-20% of all Google referrals now contain the GRE enabled data.
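The arithmetic behind those RP/NRP percentages, including the date range caveat from the third note, can be sketched in a few lines of Python. This is only an illustration of the calculation, not the tool's code: the record layout (a date plus a ranking position that is None for NRP hits) and the function name are assumptions.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

# (day of the referral, ranking position or None for an NRP hit) -- assumed layout
Hit = Tuple[date, Optional[int]]


def rp_nrp_stats(hits: Iterable[Hit], since: date) -> dict:
    """Count RP and NRP hits on or after `since` and express RP as a percentage."""
    counted = [pos for day, pos in hits if day >= since]
    total = len(counted)
    rp = sum(1 for pos in counted if pos is not None)
    rp_pct = round(100.0 * rp / total, 1) if total else 0.0
    return {"total": total, "rp": rp, "nrp": total - rp, "rp_pct": rp_pct}


sample = [
    (date(2009, 11, 20), 3),     # collected before NRP hits were being stored
    (date(2009, 11, 22), 5),
    (date(2009, 11, 22), None),
    (date(2009, 11, 23), None),
    (date(2009, 11, 23), None),
    (date(2009, 11, 23), None),
]
print(rp_nrp_stats(sample, since=date(2009, 11, 22)))
```

Passing a `since` date earlier than the day $req_pos_data was set to false would pull in days where only RP hits were stored, which is exactly what inflates the percentage.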

#81 Jill

Jill

    Recovering SEO

  • Admin
  • 32,983 posts

Posted 22 November 2009 - 01:12 PM

Cool thanks! I'm showing 16% for today.

Ok, so now what we need is a way to export the data, and a way to perhaps delete older data from the database. Should I be concerned that this database is just going to keep growing? This site does see a lot of Google traffic.

#82 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 22 November 2009 - 02:34 PM

An export feature is already in the thought process. Just need time to integrate and test it, and you know I never have any free time during the holiday season. I've not used the script I normally utilize to create csv data dumps with ajax content before, so I have no idea if it'll work or not. Murphy's law says it probably won't without some additional code tweaking. I figure I'll probably need to end up creating an AJAX version of the data dumper, or just create a link that does the same database search in straight PHP and save it that way.

On the db size question, yes it could become a concern with really busy sites. I hesitate to integrate a db cleaning function into the script itself. Too easy for someone to click on the wrong thing and wipe out all of their data, plus there would be some issues with how the Sort By Ascending -> Sort by Descending is currently coded to work. It actually looks at the ID number of the db entry rather than doing a much more complicated sort with the date format, so removing old records would essentially break the sort function.

My feeling is that you're simply better off to do a db dump to save the old data via phpMyAdmin, then empty all of the tables to let GRE start building it from scratch again. Best of both worlds that way.

Going back a bit to your other question about those strange looking single letter or single number entries in the search phrase area. Those aren't really related to Piskie's ampersand issue, but something I have on my radar too. It has to do with how some browsers report themselves and how GRE breaks up the referrer string.

In essence GRE simply looks for a string that is exactly equal to the string q= since that's what Google uses to delineate the query part of the referrer string. That's all well and good and makes sense.

The problem is that some browsers --not many thankfully-- also include this q= string as part of their browser identification string. Technically it usually comes through as q%3D where %3D is the hex code equivalent to an equal sign, but they both work out to be the same thing after normalization. And usually it's not just q=, but something like aq=

When you see those it means another q= has shown up in the referrer string before the query one does.

Bottom line, it's going to take a code rewrite to account for these thankfully rare inaccurate routines. Basically I'm going to have to revamp how the full referrer string is chunked up and add a bit of REGEX into the mix so I can use a Begins With type of test. That's the only way I can think of to dump the false positives and let the script zero in on the correct section of the referrer string every time.

The logic works, in my head anyway. Now I just need to find the time to code it and test it. Well, check that. Technically I just need time to test it. I already have a possible solution to deal with these strange entries coded and in live testing on one of my sites. If anyone else is feeling especially frisky today and wants to test it I'm sure the additional exposure and feedback will help to bring the revamp from alpha to released stage exponentially faster. So anybody who wants to play guinea pig please let me know. I'll send the code changes over to see if it sorts the strange entry issue.

#83 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 22 November 2009 - 02:41 PM

Piskie:

If you feel like testing something grab the files below. I just tweaked 'em a bit to deal with your ampersand issue, but have only tested them on one site and one server. It's very, very alpha! So make sure to make a backup of your current files before replacing them on your server!

FTR, since this tweaks the same two files as in the NRP update we've been discussing the past two days, plus one more file, this also includes that update. rank_extractor_ajax.php goes at your root level. The other two go in your gre admin area.

#84 Jill

Jill

    Recovering SEO

  • Admin
  • 32,983 posts

Posted 22 November 2009 - 09:22 PM

Randy, don't go to any trouble on the other issue, it's no big deal. Very few queries. It's actually something I deal with on our CMS as well which has a tracking mechanism on a page-by-page basis. It must gather the query strings in a way similar to yours.

#85 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 23 November 2009 - 05:15 AM

Like you said, it's rare. But it is there and being a perfectionist...

The first stab at new code is already done and being tested. As usual I'm seeing a little something-something that needs to be worked out yet. It's better than before, but not perfect yet. I'll have to think on that one for a little bit to see if there's an easy way to account for it.

#86 1dmf

1dmf

    Keep Asking, Keep Questioning, Keep Learning

  • Active Members
  • 2,167 posts
  • Location:Worthing - England

Posted 23 November 2009 - 06:00 AM

QUOTE
It's pretty simple. A new field has been added to the admin form. It's a simple Yes or No choice and only appears in the admin form when $req_pos_data is set to false.
You know, when I added this to the Perl version, I forgot to only show this on the Admin form if the config setting was collecting all hits.

I'm not sure what the ampersand issue is, as the perl version doesn't seem to suffer from it. I get all searches with ampersand correctly recorded and you can also search using the ampersand and it will retrieve data accordingly. Is this specific to the browser being used, Randy?

The 'funny' characters are an issue, as is the referrer string problem, so I look forward to seeing your code, Randy.

The DB size is certainly an issue and I'm not sure what can be done, especially for sites which have tens of thousands of hits a day. That soon adds up to millions of records, so I think perhaps for some sites the tool is not usable. How do sites with millions of hits and pages manage their analytics?

Randy -> Let me know when you have all the updated code and new features, so I can incorporate them into the perl version.

Jill -> Let me know when the muppets guide is ready, as I will need to amend relative to the perl version.










#87 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 23 November 2009 - 06:36 AM

The ampersand in the search query is basically an issue of 1. how it gets encoded by Google in the url string and various browsers, and 2. to some degree how the combo of php and mysql handles ampersand characters. Throw into the mix that JavaScript deals with some of these special characters differently than either php or mysql and you have a headache. So it may or may not be an issue with the perl version. The easiest way I've found to test for it is to use Firefox with the refspoof add-on. That lets you load in basically any referral string you want to test.

Regarding the code tweaks, they're all trivial in terms of writing line after line of code. Each is going to deal with a specific issue. On the strange searches getting recorded the best solution I see is to force the code to look for the very specific string of &q= to delineate the start of the search phrase portion of the url. I've seen lots that are close, but none are exactly &q= or the url encoded version of the same. The original code was looking for q=, so it was matching on stuff like &aq= and &oq= and so on with an extra character or two in there. Those aren't needed for anything we're trying to do.
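The difference between matching any q= and matching only a q= that begins a parameter can be sketched like this in Python. It is an illustration only, not the code being tested for GRE: the regex, function names and referrer URL are assumptions, and the sketch handles only the plain &q= / ?q= form, not the url encoded variant mentioned above.

```python
import re
from typing import Optional
from urllib.parse import unquote_plus

# "q=" only counts when it starts the string or directly follows "?" or "&",
# so "&aq=" and "&oq=" can no longer match.
Q_PARAM = re.compile(r"(?:^|[?&])q=([^&#]*)")


def extract_query(referrer: str) -> Optional[str]:
    """Return the decoded search phrase, or None when there is no q parameter."""
    match = Q_PARAM.search(referrer)
    return unquote_plus(match.group(1)) if match else None


def extract_query_loose(referrer: str) -> Optional[str]:
    """The looser behaviour: grab whatever follows the first 'q=' anywhere."""
    if "q=" not in referrer:
        return None
    return unquote_plus(referrer.split("q=", 1)[1].split("&", 1)[0])


# Hypothetical referrer where aq= and oq= appear before the real query.
ref = "http://www.google.com/search?hl=en&aq=f&oq=drum&q=drum+%26+bass"
print(extract_query(ref))        # drum & bass
print(extract_query_loose(ref))  # f
```

The loose version returns the single letter that follows aq=, which is the sort of one-character search phrase that has been showing up in the reports.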

QUOTE
how do sites with millions of hits and pages manage their analytics?


I can only speak for myself. The site is on several servers and the analytics are on a completely different server. Sites with that many hits can afford multiple servers.



#88 1dmf

1dmf

    Keep Asking, Keep Questioning, Keep Learning

  • Active Members
  • 2,167 posts
  • Location:Worthing - England

Posted 23 November 2009 - 07:03 AM

I've done some basic testing, like going to Google, searching and clicking my SERPs, then checking the tool. Ampersand is important to me, as it's used in one of my genres (Drum & Bass). So far my testing seems OK, but I have limited it to using IE, so further testing is required.

The query var is an issue in my code, but so far i've only had one search string not be supplied

hmmm, my code is looking specifically for q=
CODE
        # Set to N/A if missing
        if(!$vars{'q'} || $vars{'q'} eq "" || $vars{'q'} eq " "){
            $vars{'q'} = "N/A";
        }
I do set it to N/A if it's missing, but that's a fudge rather than a fix, so the fusion charts didn't balk on the search phrase reports.

As I look specifically for q=, do you know why the search phrase may have been missing? Is it possible it wasn't supplied by G!, or more likely something else throwing the capture screwy?

As for the DB thing, I was considering having a config setting where you could set how many months of data you want the DB to keep, and do a removal, or even an auto export to CSV, so the data is never lost, just archived somehow. In perl it's easy enough to play with the epoch time stamp, so the math is pretty easy for record selection.
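A rough Python sketch of that retention idea, using an epoch cutoff to select old records, write them to CSV, then delete them. Everything named here is hypothetical: the gre_hits table, its columns, the keep_months setting and the 30-day month are assumptions for illustration (the real GRE schema is not shown in this thread), and SQLite stands in for MySQL so the example runs on its own.

```python
import csv
import io
import sqlite3
import time

SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # rough month; close enough for a retention window


def archive_and_prune(conn, keep_months, out, now=None):
    """Write rows older than the cutoff to `out` as CSV, then delete them."""
    now = int(time.time()) if now is None else now
    cutoff = now - keep_months * SECONDS_PER_MONTH
    rows = conn.execute(
        "SELECT id, hit_time, phrase FROM gre_hits WHERE hit_time < ? ORDER BY id",
        (cutoff,),
    ).fetchall()
    writer = csv.writer(out)
    writer.writerow(["id", "hit_time", "phrase"])
    writer.writerows(rows)
    with conn:  # commit the delete only after the export has been written
        conn.execute("DELETE FROM gre_hits WHERE hit_time < ?", (cutoff,))
    return len(rows)


# Small in-memory demo with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gre_hits (id INTEGER PRIMARY KEY, hit_time INTEGER, phrase TEXT)")
conn.executemany(
    "INSERT INTO gre_hits (hit_time, phrase) VALUES (?, ?)",
    [(900_000_000, "drum & bass"), (990_000_000, "seo forum")],
)
archive = io.StringIO()
removed = archive_and_prune(conn, keep_months=12, out=archive, now=1_000_000_000)
print(removed, "row(s) archived")
print(archive.getvalue())
```

Exporting before deleting means nothing is lost if the delete step fails. When the routine should be triggered (on each admin visit, on a schedule, or manually) is a separate decision that this sketch leaves open.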

Any thoughts on this? It would be good if we could ensure conformity of solution across versions, so your input is appreciated. Remember not all of us have multiple servers, some of us have little ol shared hosting and limited bandwidth / storage.

Edited by 1dmf, 23 November 2009 - 09:11 AM.


#89 madams

madams

    HR 5

  • Active Members
  • 504 posts
  • Location:Costa Blanca, Spain

Posted 23 November 2009 - 08:27 AM

Hi

With the issue of the database bloat, is it not possible to delete all entries in a DB table using the date?

i.e. delete all entries in [Table Name] before 2009/10/1

Seems possible, no?

#90 1dmf

1dmf

    Keep Asking, Keep Questioning, Keep Learning

  • Active Members
  • 2,167 posts
  • Location:Worthing - England

Posted 23 November 2009 - 09:14 AM

madams, the removal of the records is the easy bit. It's how we want to enable user configuration via the config file, plus whether archiving is to be implemented.

I think 12 months' worth of data is probably fine for many sites, with an option to set this from 1 - 12 (or higher, 24, 36 etc.). Not sure how useful an exported CSV for archiving would be to people, so any input on this feature would help.

We also need to consider when the archiving / removal is triggered.



