Large Site Only 5% Cached

3 replies to this topic

#1 seyoda


    HR 1

  • Members
  • Pip
  • 2 posts

Posted 21 February 2009 - 09:04 AM

A site I am working on is having a problem getting fully indexed in Google.

Here are the basics:
  • Site registered in Jan 2007
  • Updated regularly via a blog linked to from home page
  • Moderate inbound links according to Y!, although G shows none (high-quality links from Forbes.com, Harvard Business School, etc.)
  • The Sitemaps.xml file submitted in Google Webmaster Tools shows 700k URLs, but only about 600 are listed as indexed
  • Non-WWW version of site is served up via a 301 and designation as preferred URL in G Webmaster Tools
  • No meta/robots tags that would be blocking things
  • We made some back-end changes around Feb 8th and some URL errors showed up in G around then, but they have been fixed and no errors have been reported since.
So all that being said, there is also an extensive HTML directory on the site that lists all 700k vendor pages here:

[Link removed per [url=http://www.highrankings.com/forum/index.php?act=boardrules]Forum Rules[/url]]

From it you can view vendors by country or by company name. The first page in each of the two categories is picked up by G, and the cache shows it is indexing the full page, links and all. However, if you are on a country-specific page (e.g. France) and you try looking at the 2nd page of companies in France, that page is not being cached. On the alphabetical side of things you run into the same issue on every page beyond the initial directory of alphabetical listings.

As our site is dependent upon long-tail searches for individual companies, it is essential that all 700k vendor pages show.

Any help the community can provide would be greatly appreciated!
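
One item in the list of basics above, the absence of blocking meta/robots tags, is easy to verify mechanically. The following is a minimal sketch, not the poster's actual process: it assumes you already have a page's HTML as a string, and the names `RobotsMetaParser` and `blocking_directives` are hypothetical helpers invented for this illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def blocking_directives(html):
    """Return the sorted directives in `html` that can keep a page out of the index."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return sorted(parser.directives & {"noindex", "nofollow", "none"})


# Hypothetical sample page, for illustration only.
sample = '<html><head><meta name="robots" content="NOINDEX, follow"></head></html>'
print(blocking_directives(sample))
```

An empty result for a paginated directory page would suggest the tags themselves are not the cause, although it says nothing about robots.txt rules or HTTP headers, which this sketch does not inspect.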

#2 Randy


    Convert Me!

  • Moderator
  • 17,540 posts

Posted 21 February 2009 - 10:25 AM

Assuming there are no technical issues, at first glance it would appear that Google has decided in their infinite wisdom that those pages have little or no value. Thus they have chosen not to index them.

Frankly, I can see why they'd decide this, considering there is basically no content on the pages that are not being cached. A telephone number, fax number and address, with no additional content, does not a quality page make.

#3 seyoda


    HR 1

  • Members
  • Pip
  • 2 posts

Posted 22 February 2009 - 05:08 PM

Hey Randy,

Thanks for the feedback. I really appreciate you taking a look at this. If you look at the cached version of some of the pages, you will see that each tab contains a variety of different information:

[Please stop linking and read the [url=http://www.highrankings.com/forum/index.php?act=boardrules]Forum Rules[/url].]

In addition to company details, there are shipping stats and a range of other unique content. I find this to be no different from retailers or sites like classmates.com or pipl.com that have some barebones profiles.

If we want to see that these pages start to get indexed, what would you suggest we do? Any insights you can provide that will encourage deeper spidering would be most appreciated.

#4 Randy


    Convert Me!

  • Moderator
  • 17,540 posts

Posted 22 February 2009 - 05:26 PM

Well, again assuming there are no technical issues, your only option is to build up a lot more authority for your site so that it has more to pass along to its internal pages.

But you'll want to look at those interim pages too. You've got at least several hundred, if not several thousand, that at first glance seem to sit between your main pages and the pages with actual content.
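
To get a feel for how many interim pages sit between the home page and a vendor page, one rough approach is to measure click depth over the site's internal link graph. The sketch below is only an illustration: the `links` dictionary is an invented miniature site, not the real directory, and `click_depths` is a hypothetical helper that runs a plain breadth-first search.

```python
from collections import deque


def click_depths(links, start):
    """Breadth-first search: minimum number of clicks from `start` to each reachable page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical link graph: each key links to the pages in its list.
links = {
    "/": ["/directory"],
    "/directory": ["/country/france"],
    "/country/france": ["/country/france?page=2", "/vendor/a"],
    "/country/france?page=2": ["/country/france?page=3", "/vendor/b"],
    "/country/france?page=3": ["/vendor/c"],
}

depths = click_depths(links, "/")
print(depths["/vendor/a"], depths["/vendor/c"])  # prints "3 5"
```

In this toy graph, every extra pagination step pushes the vendors on that page one click further from the home page, which is the pattern described above for the country and alphabetical listings.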
