Urchin Reporting: Downloads Vs Requested Files
Posted 01 February 2007 - 12:00 PM
I'm trying to understand the difference between two numbers Urchin is providing for PDF files to most accurately report the number of downloads/requests/accesses/views (whatever you want to call it) of the PDFs.
I usually view this data by accessing the Downloads Report described as: "This report ranks the popularity of all Downloads on your site by number of Hits (requests) and relative percentage. A Download is determined by the file extension and Urchin's configuration settings."
I then viewed the "Requested Pages" Report described as: "This report ranks the popularity of the Pages (HTML files, generally) visited on your site by number of Pageviews and relative percentage."
To me, those two descriptions mean the same thing. However, the data is wildly different. The Downloads number is 1/3 of the Requested Pages number for the exact same file name during the exact same time period.
In the past I've been reporting to the client the numbers in the Downloads report but if they're only reporting 1/3 of the PDF views, I know which one I'd rather present.
Anybody have any intelligence on the difference between these two reports?
Posted 01 February 2007 - 08:54 PM
The short version is that the Downloads number is going to be the most accurate for you to use with your PDFs. So you've been using the right stat.
Okay, here's the difference. And this is for the Installed version of Urchin 5, though I think Urchin 4 was pretty much the same. I have no idea if GoAn is the same or not, but it could well be.
With Urchin it comes down to the Response Code the server is sending, in conjunction with how the Acrobat browser plugin works in many, many browsers. Here's the lowdown for what you're seeing as best as I can explain it.
When the Acrobat Browser Plugin accesses a multi-page document it will often initially grab only the first page or the first X number of pages of the document. It's nothing the server is doing, because the server is delivering the entire PDF document and also delivering a 200 OK response as it should. The Plugin however will only really grab the first page or first few pages so that people don't end up waiting forever for what could be a very large download.
Then as the user scrolls down the page the Plugin will go back to fetch another page or X number of additional pages. This causes most servers to reply with a 206 Partial Content response, because only a portion of the file is being asked for: at least the beginning has already been downloaded and often the end is not yet being downloaded.
Got that so far? The first retrieval of a multi-page PDF document is 200 OK. Everything after that is 206 Partial Content as far as the server is concerned.
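If it helps to see that as code, here's a rough Python sketch of how a range-aware server picks between 200 and 206. This is a simplified illustration only, not how any particular web server is actually implemented. The `respond` function and the sample byte ranges are made up for the example, and it only handles the simple `bytes=start-end` form of the Range header.

```python
import re

def respond(pdf_bytes, range_header=None):
    """Hypothetical, simplified server logic: return (status, body)."""
    if range_header is None:
        # No Range header: the whole file goes out with 200 OK.
        return 200, pdf_bytes
    m = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header)
    if not m:
        # Unparseable range: this sketch just ignores it and sends everything.
        return 200, pdf_bytes
    start = int(m.group(1))
    last = len(pdf_bytes) - 1
    end = min(int(m.group(2)), last) if m.group(2) else last
    if start > end:
        # Requested range starts past the end of the file.
        return 416, b""
    # A slice of the file goes out with 206.
    return 206, pdf_bytes[start:end + 1]

# Stand-in for a PDF: 100 bytes of dummy data.
doc = bytes(range(100))

# First fetch has no Range header, later fetches ask for slices.
statuses = [respond(doc)[0]]
statuses += [respond(doc, r)[0] for r in ("bytes=10-19", "bytes=20-59")]
print(statuses)  # [200, 206, 206]
```

So one person reading one PDF can easily leave one 200 line and a whole string of 206 lines in the log.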
Now let's get into how Urchin differentiates their Download stats as compared to their Requested Pages stats.
For something to show up in Urchin as a Download the hit requires a server status code of 200 (OK), 302 (Found) or 304 (Not Modified). Any other response code for any hit will not show up as a Download, since all other status codes at least hint that the request was not successfully completed.
Requested Pages in Urchin on the other hand will be reported where the status code is any of the 2xx series status codes, plus the 302 and 304 status codes. The issue here is that it doesn't limit itself to just 200 OK, which it really shouldn't, since most 2xx status codes are really silent responses confirming that the request has been completed as far as the server is concerned. So for Requested Pages Urchin will also record a hit for 201, 202, 203, 204, 205 and 206. And of course the Acrobat Plugin will generate 206 responses for pages in a multi-page PDF document.
This is the reason the Requested Pages number will normally be larger than the Downloads number, especially where multi-page PDF documents are concerned. Each Download of these multi-page PDF documents may end up producing 1, 2, or 110 hits with a 206 status code, which of course inflates this number as compared to the Downloads number.
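To make the counting difference concrete, here's a small Python sketch that tallies the same set of hits both ways. The two status code sets are taken straight from the description in this thread, not from Urchin's code or documentation, so treat them as an assumption. The `tally` function, the file name and the sample hits are all made up for illustration.

```python
from collections import Counter

# Status codes as described in this thread (an assumption, not verified
# against Urchin itself).
DOWNLOAD_STATUSES = {200, 302, 304}
REQUESTED_PAGE_STATUSES = set(range(200, 207)) | {302, 304}  # 200-206 plus 302, 304

def tally(hits):
    """Count (path, status) hits under both sets of rules."""
    downloads = Counter()
    requested = Counter()
    for path, status in hits:
        if status in DOWNLOAD_STATUSES:
            downloads[path] += 1
        if status in REQUESTED_PAGE_STATUSES:
            requested[path] += 1
    return downloads, requested

# Hypothetical log entries for one PDF.
hits = [
    ("/guide.pdf", 200),  # first fetch of the document
    ("/guide.pdf", 206),  # plugin fetching more pages
    ("/guide.pdf", 206),
    ("/guide.pdf", 404),  # counted by neither report
    ("/guide.pdf", 304),  # repeat visitor, cached copy
]

downloads, requested = tally(hits)
print(downloads["/guide.pdf"], requested["/guide.pdf"])  # 2 4
```

Same file, same hits, but the 206 lines only count toward the second number. That's the gap you're seeing between the two reports.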
It's one of those things where there's nothing wrong with how Urchin is doing things or how Acrobat is doing things. Technically speaking both are doing the right thing with the various server status codes. But when you combine the two it can sure as heck get confusing!
Does that help? Or just confuse the situation more?
Edited by Randy, 01 February 2007 - 09:02 PM.
Posted 02 February 2007 - 11:34 AM
Posted 02 February 2007 - 11:59 AM
Especially since it means Jill can now put the down.
Posted 06 September 2007 - 08:34 PM
Do you know if a "download" using the filtering method will give me an accurate count? GA calls this a "pageview". If not, what's the best way to get this data from the new GA?
Posted 07 September 2007 - 05:38 AM
Anybody else know how GoAn handles these PDF docs? I could upload a PDF to that personal site and see if I can discover anything by comparing its stats to the raw logs, or by enabling my installed Urchin on the domain too. But if someone else already knows or has gotten an answer from GoAn support it'll be a lot easier and faster.
<edit to add>
I just found a document in GoAn's Help section that indicates you may need to set up special tracking code on links to the PDF file, Rosemary. It's here and gives a pretty strong indication that the GoAn version doesn't work anything like the installed Urchin version where PDF and similar files are concerned.
Posted 07 September 2007 - 10:53 AM
That's right, GoAn is script-based, so for anything that isn't an HTML file you will need to add tracking code to the links. This will only give an indication of the number of downloads, especially if there are external websites linking to the file without using your tracking code.
Posted 07 September 2007 - 02:08 PM
We have indeed included the appropriate text but my real question is: Given there is no report to measure downloads (like in Urchin and other programs), I'm doing a manual filter of content for " .pdf " files on the Top Content pages. Is this going to give me accurate data?
Posted 08 September 2007 - 04:45 AM
What it won't show you (unlike the Installed version of Urchin, which works off of the actual log files of the site) is when someone goes directly to the PDF file or clicks on a link from someone else's site that does not include the tracking code, as MaKa pointed out.
Posted 08 September 2007 - 07:52 AM
So, what is your favorite web stats program?
Posted 09 September 2007 - 07:54 AM
The installed version of Urchin, with their UTM and E-commerce modules.
It's the only general web stats package I've used on my own sites for several years now. But that's probably simply because I'm really, really comfortable with using it and because I get a heck of a deal (costs me basically nothing) through my server guys.
I'll freely admit there are other analytics software products out there that are just as good with the data they provide. It's one of those Old Dogs/New Tricks things for me & Urchin.