
Is Google Site: Command Still Best Way To Check Indexed Page Count?


13 replies to this topic

#1 ttwblb (HR 4, Active Members, 132 posts)

Posted 19 March 2010 - 06:26 PM

Is the Google site: command still the best way to check how many pages Google has indexed for a particular domain? Or are there other more accurate ways to do it?

Thanks.


#2 Jill (Recovering SEO, Admin, 32,913 posts)

Posted 19 March 2010 - 08:23 PM

It's my preferred way.

But you can also submit XML sitemaps to Google Webmaster Tools. In fact, Vanessa Fox recommends a separate sitemap for each section of your site. That way you can see what's being crawled (or not) for each section individually.

I haven't tried this yet, but it sounds like a great idea for a very large, dynamic site.
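The per-section approach described above amounts to a sitemap index file pointing at one child sitemap per section. Here's a minimal sketch of generating such an index with Python's standard library; the domain and section names are made up for illustration:

```python
# Sketch: build a <sitemapindex> listing one child sitemap per site
# section, so each section's indexation can be tracked separately.
# Domain and section names here are hypothetical examples.
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap_index(sitemap_urls):
    """Return a sitemapindex XML string listing each child sitemap URL."""
    root = ET.Element("sitemapindex", xmlns=NS)
    for url in sitemap_urls:
        sm = ET.SubElement(root, "sitemap")
        ET.SubElement(sm, "loc").text = url
    return ET.tostring(root, encoding="unicode")

sections = ["products", "articles", "forum"]
index_xml = build_sitemap_index(
    f"http://www.example.com/sitemap-{s}.xml" for s in sections
)
print(index_xml)
```

Each child sitemap then holds only that section's URLs, and Webmaster Tools reports indexed counts per file.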

#3 ttw (HR 5, Active Members, 371 posts, San Mateo, California)

Posted 20 March 2010 - 09:01 AM

QUOTE(Jill @ Mar 19 2010, 06:23 PM)
But you can also submit XML sitemaps to Google Webmaster Tools. In fact, Vanessa Fox recommends multiple sitemaps for each section of your site. That way you can see what's being crawled (or not) for each section individually.


Jill: I've run a side-by-side comparison over a six-month period of the number of pages indexed from the XML sitemap, as reported in Google Webmaster Tools, versus the count from a "site:" search. The "site:" figure is ALWAYS higher.

I'm assuming this is because the XML file doesn't include every single file that Googlebot finds, such as PDFs. The site I ran this experiment on is large, so it isn't feasible to work out exactly what's missing.

Has anyone else tried this?
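One way to chase down part of that gap is to diff the sitemap's URL set against a sample of URLs observed as indexed (e.g. collected from site: results). A rough sketch, with a toy sitemap and a hypothetical indexed-URL sample:

```python
# Sketch: compare the URL set in an XML sitemap against URLs observed
# as indexed, to surface indexed pages the sitemap omits (PDFs etc.).
# The sitemap and URL sample below are invented for illustration.
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(xml_text):
    """Return the set of <loc> URLs found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.iter(NS + "loc")}

sitemap_xml = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about.html</loc></url>
</urlset>"""

indexed_sample = {
    "http://www.example.com/",
    "http://www.example.com/about.html",
    "http://www.example.com/whitepaper.pdf",  # indexed but not in sitemap
}

missing_from_sitemap = indexed_sample - sitemap_urls(sitemap_xml)
print(sorted(missing_from_sitemap))
```

On a large site you'd only ever get a sample of the indexed URLs, so this narrows the discrepancy rather than explains it completely.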

#4 Schevikhoven (HR 2, Members, 17 posts, Finland)

Posted 20 March 2010 - 01:22 PM

I just took a peek at Webmaster Tools, and it tells me there are a total of 1,200 URLs indexed from the sitemap. When I use the site: command on Google I get 42,500 pages, of which about 35,765 are in the sitemap. Around 10,000 URLs are forbidden by robots.txt, but Google doesn't seem to remove them from its index.

The Google site: command seems to give a realistic result, but it makes me wonder what the Webmaster Tools numbers are about, then.

When I ran a few queries on Google, "site:domain.fi/fi/product" and "site:domain.fi/en/product", I noticed there are only 8,000 items in the en index and 15,000 items in the fi index. We link heavily to the Finnish side, but there are only a handful of links to the English side. Maybe that has an effect too, even though all the pages are in the sitemap.

Edited by qwerty, 20 March 2010 - 02:33 PM.
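The robots.txt discrepancy above is worth measuring directly: Google can keep a URL in its index (URL-only) even when robots.txt stops it being crawled. A sketch using the standard-library robots.txt parser, with an invented robots.txt and URL list:

```python
# Sketch: count which sitemap/site URLs are disallowed by robots.txt.
# Google may keep such URLs indexed (URL-only) even though it won't
# crawl them, which can explain a site:-vs-sitemap count gap.
# The robots.txt body and URLs are made up for illustration.
from urllib.robotparser import RobotFileParser

robots_txt = """User-agent: *
Disallow: /en/private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

urls = [
    "http://domain.fi/fi/product/1",
    "http://domain.fi/en/product/2",
    "http://domain.fi/en/private/3",
]

blocked = [u for u in urls if not rp.can_fetch("Googlebot", u)]
print(len(blocked), blocked)
```

Running your full URL list through this tells you how much of the site: count could be robots-blocked-but-still-indexed URLs.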


#5 Mooro (HR 4, Active Members, 157 posts, Loughborough, Leicestershire)

Posted 21 March 2010 - 02:45 AM

I look at the site: operator every so often, but it bounces and dances to a tune I can't work out.

WMT gives more stable data.

@ttw - Yes, I've gone down the multi map route, my site has over five million pages. I've broken the maps down to the sections of the site so I can see which is getting indexed the best and which section is struggling.

It makes for interesting reading.
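Breaking a huge URL list into per-section sitemaps, as described above, can be sketched as a simple bucketing step. The path-prefix rule and URLs here are naive placeholders; a real five-million-page site would need its own sectioning rules:

```python
# Sketch of the multi-map approach: bucket URLs by top-level path
# segment and report one sitemap per bucket, so indexation can be
# tracked section by section. URLs are hypothetical examples.
from collections import defaultdict
from urllib.parse import urlparse

urls = [
    "http://www.example.com/products/widget-1",
    "http://www.example.com/products/widget-2",
    "http://www.example.com/articles/seo-tips",
]

by_section = defaultdict(list)
for url in urls:
    path = urlparse(url).path.strip("/")
    section = path.split("/", 1)[0] or "root"
    by_section[section].append(url)

for section, members in sorted(by_section.items()):
    print(f"sitemap-{section}.xml: {len(members)} URLs")
```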

#6 EddyGonzalez (HR 3, Active Members, 79 posts)

Posted 23 March 2010 - 05:39 PM

I agree with Mooro; I can't get stable data from the site: command either. Usually if I run a site: command and then click through to the next page of results a few times, the number of results changes quite a bit.


#7 Jill (Recovering SEO, Admin, 32,913 posts)

Posted 24 March 2010 - 07:45 AM

The site: command's usefulness isn't really in finding the exact number of pages indexed. I don't really see why that's important.

The idea is to do spot checks on what IS indexed and see if it makes sense. If the number of pages indexed is way more than you know you even should have on your site, you're likely getting duplicate content indexed. If the number is way too small, you likely have crawling and/or PageRank issues.

Exact numbers aren't necessary, nor all that useful, imo.
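The spot-check logic above can be sketched as a rough heuristic: compare the reported indexed count against the number of pages you know the site has, and flag the likely problem. The thresholds are arbitrary illustrations, not anything Google publishes:

```python
# Rough sketch of the spot-check heuristic described above.
# Thresholds (1.5x and 0.5x) are invented for illustration only.
def spot_check(indexed_count, known_page_count):
    """Flag a likely problem from the indexed-vs-known page ratio."""
    ratio = indexed_count / known_page_count
    if ratio > 1.5:
        return "likely duplicate content getting indexed"
    if ratio < 0.5:
        return "likely crawling and/or PageRank issues"
    return "in the expected range"

print(spot_check(42500, 20000))  # far more indexed than pages exist
print(spot_check(1200, 20000))   # far fewer indexed than pages exist
```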

#8 EddyGonzalez (HR 3, Active Members, 79 posts)

Posted 24 March 2010 - 04:29 PM

In case anyone is interested, Ann Smarty wrote an interesting article on the site: operator a few months ago: http://www.searcheng...accurate/14519/

#9 Michael Martinez (HR 10, Active Members, 5,063 posts, Georgia)

Posted 24 March 2010 - 04:37 PM

The site operator is still useful. Too many people in the SEO community fail to look beyond the first page of results and they get upset when their expectations don't match reality.

There is no 100% effective way of determining how many pages of a site are indexed by Google, but drilling down as far as you can go -- and using the site operator on sub-sections of your site -- is more reliable than the data you'll get from Google's Webmaster Tools and other reports.

Take any report with a grain of salt, and don't try to compare them to each other. You need to establish a baseline with a tool you feel comfortable with and just follow that. You may not be using the best tool at any time, but if you're using a tool you like you can still make adjustments to your optimization if you're not happy with what you see.
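The "drilling down" idea above is just running site: against each sub-section instead of trusting one site-wide number. A trivial sketch of generating those queries (domain and sections hypothetical):

```python
# Sketch: build per-section site: queries so counts can be sampled
# section by section. Domain and section paths are invented examples.
sections = ["", "products/", "articles/", "forum/"]
queries = [f"site:www.example.com/{s}" for s in sections]
for q in queries:
    print(q)
```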

#10 Jill (Recovering SEO, Admin, 32,913 posts)

Posted 24 March 2010 - 08:55 PM

EddyGonzalez, I'm not really sure why that article is useful, as all it does is say that people are confused by the site: operator.

Michael, in his post after yours, has it exactly right. It's a tool, like any other tool. And it provides valuable information if you know how to use it.

#11 Mhoram (HR 4, Active Members, 114 posts, Quincy, Illinois, USA)

Posted 25 March 2010 - 10:32 AM

Am I the only one who finds the data in Webmaster Tools to be almost entirely inaccurate and useless? It's fine for some obvious things, like telling you your sitemap is unreachable or that the most significant keyword on your site is suddenly 'viagra.' But for details, it stinks. When it shows under 'top search queries' that my site ranks #4 for a particular keyword, it may actually rank anywhere between #1 and #50. It says a particular page has 45 incoming links, but when I click on the 45, it shows 3. In the diagnostics it lists bad links that were fixed months or sometimes years ago.

It's free, so I can't really complain of course, but I'm surprised it's not up to the level of quality of most of Google's offerings.

#12 Scottie (Psycho Mom, Admin, 6,294 posts, Columbia, SC)

Posted 25 March 2010 - 11:07 AM

I suspect that is intentional. If they wanted to show you details at a minute level, they certainly could.

#13 Jill (Recovering SEO, Admin, 32,913 posts)

Posted 25 March 2010 - 11:29 AM

QUOTE
Am I the only one who finds the data in Webmaster Tools to be almost entirely inaccurate and useless?


Nope. I feel exactly the same way.

#14 Randy (Convert Me!, Moderator, 17,540 posts)

Posted 25 March 2010 - 01:37 PM

QUOTE
Am I the only one who finds the data in Webmaster Tools to be almost entirely inaccurate and useless?


Nope, not at all.

I review WMT data maybe once or twice per year for each of my sites that has an account, but no more. I rarely bother to look more often than that unless I've already seen something in my web stats suggesting there may be a problem.

Stats I glance at weekly and do a full workup on monthly, for every site that's important to me. Webmaster Tools... bleh, not so much, and only when I'm trying to track down something that's not quite as it should be in a perfect world.



