
Database Size Vs. Bleeding Page Rank


14 replies to this topic

#1 Robmuller

Robmuller

    HR 3

  • Active Members
  • 50 posts

Posted 08 December 2007 - 08:03 AM

Hi

I run a global directory that offers information to business travellers on a city level (e.g. restaurants in Barcelona).
- My site is in 8 languages and offers 4 categories per city (e.g. hotels, restaurants, car rental). About 100 cities are represented.
- At the moment about 4,000 suppliers (e.g. restaurants) are represented in the directory. Each has its own page within the directory, and this page is available in 8 languages.
- Every city has an "overall" page (with all 4 categories) and a page for each specific category (e.g. restaurants in Barcelona). This means that there are 5 category pages per city.

When you multiply this out, the database consists of roughly 40,000 unique pages: (100 cities x 8 languages x 5 category pages = 4,000) + (8 languages x 4,000 suppliers = 32,000) = 36,000.
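
As a quick check of the arithmetic above, the page count can be worked out in a few lines of Python. The variable names are illustrative, and the figures are the approximate ones given in this post. The total comes to 36,000, which is in the same ballpark as the 40,000 figure used in this thread.

```python
# Figures taken from the post above; variable names are illustrative only.
cities = 100
languages = 8
category_pages_per_city = 5  # 1 "overall" page + 4 single-category pages
suppliers = 4_000

# Every city/category page and every supplier page exists once per language.
city_pages = cities * languages * category_pages_per_city
supplier_pages = suppliers * languages
total_pages = city_pages + supplier_pages

print(city_pages, supplier_pages, total_pages)  # 4000 32000 36000
```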

To the best of my knowledge, the number of pages on a domain is part of the Google algorithm. This would be an argument for offering all pages to Google for indexing. The downside is that the page rank of the (home)page(s) bleeds over the 40,000 pages. From a page rank point of view I think it is smart to link only to the 100 "overall" city pages in English, since people search in English and at city level (e.g. "Barcelona restaurants"). In that case I would hide the rest of the site from Google (e.g. via JavaScript links). Google would then index only the 100 most relevant pages and divide page rank amongst them.

My question: what is wise? Let Google index all pages, or only offer the pages I'd really like Google to index and rank high?

#2 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 08 December 2007 - 08:40 AM

It depends upon a lot of factors Robmuller, including which pages you want to have a chance of showing up in a search.

If you need those lower level supplier pages to appear in a SERP when someone searches for them, you'll definitely want them indexed. Personally I wouldn't wall those off because they're an excellent opportunity to target very specific, long tail search phrases.

At the end of the day you're really asking about how to sculpt your link popularity or PageRank. Google nowadays tells you to use nofollow to do this. We've discussed this a few times 'round the forums and there are views on both sides. Personally, I'm not yet convinced it's the wisest thing to do, especially considering that the search engines all seem to treat nofollow a little bit differently. So doing something to suit the way one engine treats it in order to gain an advantage could have unintended negative effects on another engine. It's just risky when there's no standard method of handling it on the search engines' side of things.

Plus what happens if people start linking to your interior pages --which is a goal you want to shoot for-- but you've told the search engines that you don't consider those to be important pages of your site via nofollow? Nobody knows.

At the end of the day you're not going to be linking to all 40,000 interior pages from your home page anyway. So if you construct your internal navigation so that it's sensible for real users, you're also sculpting your internal link popularity/PR. Call me old fashioned, but I like to stick to the tried and tested methods rather than jumping on each new theory or bandwagon that comes along. If for no other reason than that brand new things typically change several times before they become tried and tested.

#3 Robmuller

Robmuller

    HR 3

  • Active Members
  • 50 posts

Posted 08 December 2007 - 03:13 PM

Hi Randy

Thank you for your reply.

QUOTE(Randy @ Dec 8 2007, 08:40 AM)
It depends upon a lot of factors Robmuller, including which pages you want to have a chance of showing up in a search.

If you need those lower level supplier pages to appear in a SERP when someone searches for them, you'll definitely want them indexed. Personally I wouldn't wall those off because they're an excellent opportunity to target very specific, long tail search phrases.


To be honest I'm mainly interested in a high rank for the 'city' pages. At the moment most city pages rank within the top 20 when you search Google with the relevant keywords. I'd rather get a higher ranking for them than rank for all possible 'long tail' searches. My question is really about quality vs quantity: will my 100 'city pages' rank higher when all 40,000 pages are included, or will they rank higher when I restrict indexing of the site, link only to those 100 pages from the homepage and 'hide' the rest of the site from the search engines (for example by using JavaScript links), so that my page rank is distributed only over relevant pages?

By the way: my biggest competitor, who has excellent (top 3) rankings for most of these searches (searches with more than 200,000 results), has only 712 pages indexed!
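
One way to get a feel for this quality-vs-quantity question is a toy model. The sketch below iterates the formula from the original PageRank paper, PR(A) = (1 - d) + d * sum(PR(T) / C(T)), over two made-up internal link structures: one where city pages link to detail pages that link back, and one where the detail pages are left out. The function names `pagerank` and `build_site`, the tiny site sizes and the link layout are all assumptions for illustration. It ignores external links and everything else a real search engine weighs, so it says nothing definite about actual rankings.

```python
def pagerank(links, damping=0.85, iterations=200):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over a link graph.

    `links` maps each page to the list of pages it links to. In this
    simplified version every page must have at least one outgoing link.
    """
    rank = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_rank = {page: 1.0 - damping for page in links}
        for page, outlinks in links.items():
            # Each page splits its damped score evenly over its outlinks.
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank
    return rank


def build_site(cities, details_per_city, link_details):
    """Build a hypothetical home -> city -> detail link structure."""
    links = {"home": [f"city{c}" for c in range(cities)]}
    for c in range(cities):
        city = f"city{c}"
        links[city] = ["home"]
        if link_details:
            for d in range(details_per_city):
                detail = f"city{c}-detail{d}"
                links[city].append(detail)
                # Assumed layout: breadcrumb links back to home and city.
                links[detail] = ["home", city]
    return links


if __name__ == "__main__":
    for label, link_details in (("detail pages linked", True),
                                ("detail pages left out", False)):
        site = build_site(cities=3, details_per_city=2, link_details=link_details)
        ranks = pagerank(site)
        print(f"{label}: city0 = {ranks['city0']:.3f}, home = {ranks['home']:.3f}")
```

Running it prints the score of one city page under each structure, which makes it easy to try other layouts (for example, detail pages that do not link back to the city page) before deciding anything.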

#4 nethy

nethy

    HR 6

  • Active Members
  • 974 posts

Posted 09 December 2007 - 07:36 PM

About targeting the long tail.

Actively 'optimising' for long tail keywords on 20 or so pages is good, but it's not really the same as what Randy is talking about (I presume). Those keywords would have to be short-tailed enough (statistically) for you to have a clue that they are even being searched for (i.e. be picked up by Wordtracker or Keyword Discovery, or get enough AdWords searches for you to notice). The really long tail is terms that may only be searched once, ever, or only bring you a handful of searches per month. Several sites I am involved with end up picking up almost half of their organic traffic from phrases that each bring in under 5 visits per month, and maybe 20% from terms that only bring 1 visitor. That's the really long tail. And it's not possible/practical to 'optimise' for these individually.

If you have 40,000 pages with unique content you have a chance at picking up a lot of these one off search terms. Particularly if the content is contributed by individual vendors.

In your shoes, I'd probably consider going down the conventional route of trying to get as many pages indexed as possible. Then I would try to determine how many searchers land on internal pages (meaning they are getting SE traffic). Then you can make a more informed decision about the potential costs/benefits of 'shaping' your PR by excluding internal pages.
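
The measuring step could look something like the sketch below. It assumes you can export visits as (landing page, referrer) pairs from your logs or analytics package. The `search_landing_counts` name, the sample visits and the short list of search engine host fragments are all hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit

# Assumption: crude host fragments that identify search engine referrers.
SEARCH_ENGINE_HOSTS = ("google.", "yahoo.", "live.")


def search_landing_counts(visits):
    """Count search-engine visits per landing page.

    `visits` is an iterable of (landing_path, referrer_url) pairs.
    """
    counts = Counter()
    for landing_path, referrer in visits:
        host = urlsplit(referrer).netloc.lower()
        if any(fragment in host for fragment in SEARCH_ENGINE_HOSTS):
            counts[landing_path] += 1
    return counts


# Made-up sample data, purely for illustration.
sample_visits = [
    ("/barcelona/restaurants/", "http://www.google.com/search?q=barcelona+restaurants"),
    ("/barcelona/restaurants/king-juan-carlos", "http://www.google.es/search?q=king+juan+carlos"),
    ("/", ""),
    ("/barcelona/restaurants/", "http://search.yahoo.com/search?p=barcelona+restaurants"),
    ("/faq", "http://www.example.org/links.htm"),
]

for path, hits in search_landing_counts(sample_visits).most_common():
    print(path, hits)
```

Pages that show up here with only one or two visits each are the long tail described above; their combined share of traffic is the number to weigh against any PR 'shaping'.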

One thing to consider is that with such a big site, unless it has a lot of link strength, getting SEs to take internal pages seriously is enough of a challenge. So, trying to keep them indexed but restricting their page rank with nofollows will probably not be possible.

#5 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 10 December 2007 - 09:37 AM

You presumed correctly, nethy.

I'd move forward exactly as nethy laid out. Try to get as many pages as possible indexed and then spend some time tracking down which inner pages are getting SE traffic and try to optimize them for the longer tail terms that are going to bring you some traffic.

More qualified traffic is always better than trying to sculpt PR in my book. Because as mentioned above these longer tail terms typically make up a statistically significant portion of my traffic. No one phrase does, but when you add all of those together it's certainly worth the time investment. Especially when they're so darned easy to rank for!

I won't even mention that these more specific, longer tail terms tend to carry a several times higher conversion rate than my more generic phrases.

Edited by Randy, 10 December 2007 - 09:43 AM.


#6 Robmuller

Robmuller

    HR 3

  • Active Members
  • 50 posts

Posted 10 December 2007 - 02:02 PM

I'm thinking about a workaround that avoids 'bleeding page rank' while still getting the 'long tail' pages within my site indexed. I'm wondering what your opinion is about it.

My idea: put only links to the city pages on the homepage. Block the rest of the links on the homepage (contact info, FAQs) with JavaScript (onclick handlers), so that Google does not follow them. Do the same thing (onclicks) with links from the city pages (e.g. restaurants in Barcelona) to detail pages (e.g. a specific restaurant in Barcelona). In this structure Google cannot reach the detail pages through links, so no page rank is leaked to the detail pages.

To let Google index these detail pages anyway, I'm thinking of setting up a sitemap with links to them. I think this has another advantage: from the navigation bar on every detail page (e.g. home --> barcelona --> restaurants --> King Juan Carlos) you can create an internal link to the city page (in this example the word 'restaurants'). You can add a relevant alt tag to this link (in this case 'barcelona restaurants').

#7 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,325 posts

Posted 10 December 2007 - 10:59 PM

Why, why, oh why would you not want to index every page of your site? I will never understand why anyone would want to do that. Any reason anyone has ever given just makes no sense whatsoever.

Instead, create your site architecture in such a way that your important pages are where people can find them easily. Yeah I know...what a concept!

#8 Robmuller

Robmuller

    HR 3

  • Active Members
  • 50 posts

Posted 11 December 2007 - 01:51 AM

QUOTE(Jill @ Dec 10 2007, 10:59 PM)
Why, why, oh why would you not want to index every page of your site?


Hi Jill

Thank you for your reply. From my point of view, in the structure I came up with all pages are actually being indexed, while the page rank of the homepage is distributed over the most important pages (the city pages). By getting the detail pages indexed via the sitemap and linking them one way to the relevant city page, I think all page rank goes to the pages I'd really like to rank high in the searches.

From your response I understand you prefer to have a normal SE structure (home --> city page --> detail page). IMHO in that structure the city pages give away a part of their page rank to the detail pages. Disclaimer: I learned about the concept of avoiding leaking page rank to less important pages only last weekend. I'm still digesting all the views on it.

#9 chrishirst

chrishirst

    A not so moderate moderator.

  • Moderator
  • 5,886 posts
  • Location:Blackpool UK

Posted 11 December 2007 - 05:18 AM

QUOTE
I learned about the concept of avoiding leaking page rank to less important pages only last weekend. I'm still digesting all the views on it.


There is only one answer to it:

It's bull and will just distract you from doing something FAR more useful.

#10 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,325 posts

Posted 11 December 2007 - 09:24 AM

Yep, I completely agree with what Chris wrote.

#11 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 11 December 2007 - 11:50 AM

I hate to Me Too. But me too.

#12 just_me

just_me

    HR 1

  • Members
  • 2 posts

Posted 18 May 2008 - 10:51 AM

QUOTE(Jill @ Dec 10 2007, 10:59 PM)
Why, why, oh why would you not want to index every page of your site?

why does your robots file look like this?

QUOTE
User-agent: *
Disallow: /webtop.log
Disallow: /webtop/
Disallow: /stuff/contentmgr/templates/
Disallow: /forum/index.php?act
Disallow: /forum/index.php?showuser
Disallow: /forum/lofiversion/
Disallow: /pplmg
Disallow: /ngphp
Disallow: /dwgiwlzpn
Disallow: /atlbizchron
Disallow: /isalesatl
Disallow: /seminar/
Disallow: /nittyhra
Disallow: /register/
Disallow: /cc/
Disallow: /atlanta
Disallow: /may03-seo-seminar.htm?c1
Disallow: /gliadel/
Disallow: /currentstats/
Disallow: /msp/
Disallow: /webceo
Disallow: /newsletter/question2.php
Disallow: /forum/lofiversion/
Disallow: /march-training-class
Disallow: /april-seo-training

User-agent: Googlebot

Disallow: /webtop.log
Disallow: /webtop/
Disallow: /stuff/contentmgr/templates/
Disallow: /pplmg
Disallow: /ngphp
Disallow: /dwgiwlzpn
Disallow: /atlbizchron
Disallow: /isalesatl
Disallow: /forum/index.php?act
Disallow: /forum/index.php?showuser
Disallow: /seo-writing.htm?c1
Disallow: /advisor.htm?c1
Disallow: /seminar/
Disallow: /nittyhra
Disallow: /register/
Disallow: /cc/
Disallow: /atlanta
Disallow: /may03-seo-seminar.htm?c1
Disallow: /gliadel/
Disallow: /currentstats/
Disallow: /*getlastpost
Disallow: /*mode=linearplus
Disallow: /*mode=threaded
Disallow: /forum/lofiversion/
Disallow: /msp/
Disallow: /webceo/
Disallow: /newsletter/question2.php
Disallow: /march-training-class
Disallow: /april-seo-training


#13 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,325 posts

Posted 18 May 2008 - 11:49 AM

QUOTE
why does your robots file look like this?


Mostly to stop duplicate URLs from being indexed, as well as parts of my CMS that don't belong in the index.
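
For anyone wanting to check what a set of robots.txt rules actually blocks, Python's standard `urllib.robotparser` module can evaluate them. The sketch below uses a short excerpt of plain path rules from the file quoted above and a placeholder `www.example.com` host (an assumption), and the `is_allowed` helper is a made-up name. As far as I know this parser matches paths by prefix only, so wildcard rules such as `/*getlastpost` are left out of the example.

```python
from urllib.robotparser import RobotFileParser

# Excerpt of plain (non-wildcard) rules from the file quoted above.
RULES = """\
User-agent: *
Disallow: /seminar/
Disallow: /register/
Disallow: /cc/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(agent, path, host="http://www.example.com"):
    """Return True if `agent` may fetch `path` under the rules above.

    The host is a placeholder; only the path is matched against the rules.
    """
    return parser.can_fetch(agent, host + path)


for path in ("/register/", "/seminar/2008.htm", "/forum/"):
    print(path, is_allowed("Googlebot", path))
```

A check like this is a cheap way to confirm that only duplicates and CMS leftovers are excluded, and not pages that should be in the index.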

#14 just_me

just_me

    HR 1

  • Members
  • 2 posts

Posted 19 May 2008 - 12:03 AM

My site has a LOT of pages with long query strings, like category.php?id=1234&this=that&that=why&morejunk=bad&whyisthishere=idunno&didtheprogrammerneedtoadd this=NO

I've been trying for the past couple of days to figure out how to remove most of the string or rewrite it with mod_rewrite, but haven't been successful yet. Should I nofollow those links or block them in robots.txt?
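
If the extra parameters do not change what the page shows, one option is to generate links that carry only the parameters that matter. Below is a minimal sketch using the standard `urllib.parse` module, assuming `id` is the only meaningful parameter (an assumption; check it against your own script). The `canonical_url` helper name and the example.com URL are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only these query parameters affect the page content.
KEEP_PARAMS = {"id"}


def canonical_url(url, keep=KEEP_PARAMS):
    """Return `url` with every query parameter not in `keep` removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in keep
    ]
    # Rebuild the URL with the filtered query string and no fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(canonical_url(
    "http://www.example.com/category.php?id=1234&this=that&morejunk=bad"
))
```

Linking to one short form of each URL consistently also reduces the chance of the same page being indexed under several different addresses.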




#15 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 19 May 2008 - 12:19 AM

Why? Are they duplicates of other pages?

If you want them to stand any chance of getting indexed, you certainly don't want to start nofollowing them or excluding them via robots.txt.



