So if you've ruled out any duplicate content issues
I would really appreciate your opinion on the duplicate issue.
Google can see about 1,150 pages from the site (all of them, apart from the homepage, in the supplemental index). Those come up for site:siteur…
If I search for site:www.siteur… Google can see just the homepage and that is placed in the normal index (last updated 12 Feb).
For the supplemental listings I wondered why there were that many pages in Google’s index (since adding up all the product, review, category and static pages came way short of 1200), so I copied all the SERPs (100 at a time), pasted them into a Word doc and ran some searches.
I discovered that Google has cached every product once (about 230), every pop-up picture for each product (the ones that pop up when the "click to enlarge" link is pressed, another 230), a review for each product (another 230), all the static and category pages and a few “products_new” pages. Adding those up comes to roughly 750 pages, so there are roughly another 500 pages that Google can see and that I cannot account for.
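The tally above was done by hand in Word; a rough sketch of the same count in Python might look like the following. The sample URLs and the www.example.com host are placeholders I made up for illustration, not the real site, and the function name is hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def tally_by_script(urls):
    """Count URLs by the script (last path segment) they point at."""
    counts = Counter()
    for url in urls:
        path = urlsplit(url).path
        # An empty last segment means the URL points at the site root.
        script = path.rsplit("/", 1)[-1] or "(home)"
        counts[script] += 1
    return counts


# Placeholder URLs standing in for the list copied out of the SERPs.
sample = [
    "http://www.example.com/product_info.php?cPath=24_27&products_id=167",
    "http://www.example.com/product_info.php?products_id=167",
    "http://www.example.com/product_reviews.php?products_id=167",
    "http://www.example.com/",
]

counts = tally_by_script(sample)
for script, n in counts.most_common():
    print(script, n)
```

Feeding in the full list of indexed URLs should show at a glance which script accounts for more entries than there are products.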
Now those “products_new” pages come from the New Products category which currently contains the entire product range (c230). Once the user clicks on a product the product page loads up. However, I noticed that there is a slight difference here:
Here is how the url of product 167 looks if the user follows the normal path to get to the product page (i.e. category --> sub category --> product): siteurl/product_info.php?cPath=24_27&products_id=167 (let's call this page 167A)
And here is how the url of product 167 looks if the user gets to the product page via the New Products category (i.e. New Products --> product): siteurl/product_info.php?products_id=167 (let's call this page 167B)
Those pages are identical in everything but the url (one contains “cPath=##_##&” – where #=number - the other one doesn’t). Could these be considered duplicates?
The same issue is present with the review pages. If the user clicks on the review link from page 167A he is directed to page: siteurl/product_reviews.php?cPath=24_27&products_id=167
If the user clicks on the review link from page 167B he (or she) is directed to page siteurl/product_reviews.php?products_id=167
Again, those pages are identical in everything but the url (one contains “cPath=##_##&”, where # is a digit; the other one doesn’t).
These findings mean that, since each product page and each review page is indexed twice, I can now account for the extra 500 pages.
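To check the double counting rather than eyeball it, one option is to strip the cPath parameter from every URL and group the ones that collapse to the same address. This is only a sketch: it assumes cPath is the sole parameter that differs between the two versions, and the host and function names are placeholders of mine.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical(url, ignored=("cPath",)):
    """Return the URL with the ignored query parameters removed."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k not in ignored]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), "")
    )


def group_duplicates(urls):
    """Map each canonical URL to its variants, keeping only real duplicates."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}


# Placeholder URLs modelled on pages 167A and 167B and their review pages.
indexed = [
    "http://www.example.com/product_info.php?cPath=24_27&products_id=167",
    "http://www.example.com/product_info.php?products_id=167",
    "http://www.example.com/product_reviews.php?cPath=24_27&products_id=167",
    "http://www.example.com/product_reviews.php?products_id=167",
]

duplicates = group_duplicates(indexed)
for key, found in duplicates.items():
    print(key, len(found))
```

If the count of grouped URLs comes out near 2 x 230 for products and again for reviews, that would support the explanation above.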
This seems like a duplicate content issue; however, I looked at another osCommerce site where the same issue is present, but there are no problems with rankings/inclusion.
What is your opinion on the matter?
Do you think I could prevent Google from indexing the duplicate pages with a robots.txt file containing a disallow: siteurl/product_reviews.php?products_id
command (notice that “=167” is missing from those urls)?
Would it work like that, or would I have to list all the pages including the number (i.e. disallow:
If that would work I would probably block popup images in the same way.
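As far as I understand it, a Disallow value is matched as a prefix of the url path plus query string, and it is written as a path starting with "/" rather than with the site address in front. The sketch below models only that basic prefix rule (no Allow lines, no wildcards, no user-agent groups), so treat it as a way to reason about the question, not as how any particular crawler behaves. The host and function names are placeholders of mine.

```python
from urllib.parse import urlsplit


def path_and_query(url):
    """Return the part of the URL that robots.txt rules are compared against."""
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")


def is_disallowed(url, disallow_prefixes):
    """True if the URL starts with any non-empty Disallow prefix."""
    target = path_and_query(url)
    # An empty Disallow value blocks nothing, so skip empty prefixes.
    return any(target.startswith(p) for p in disallow_prefixes if p)


# The rule from the question, written as a path relative to the site root.
rules = ["/product_reviews.php?products_id"]

review_b = "http://www.example.com/product_reviews.php?products_id=167"
review_a = "http://www.example.com/product_reviews.php?cPath=24_27&products_id=167"
product_b = "http://www.example.com/product_info.php?products_id=167"

print(is_disallowed(review_b, rules))
print(is_disallowed(review_a, rules))
print(is_disallowed(product_b, rules))
```

Under this model the rule without "=167" catches every review page reached without cPath, leaves the cPath versions alone, and does not touch product_info.php, which would need its own line.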
Directories and press releases are not going to save a site no one wants to link to.
All I want the directories for at the moment is to get the site into Google. I will be looking for links elsewhere in the future. However, I am a bit confused by what has been posted here regarding directories. Every link building guide I have read (several of them from this site) mentions directories (there are even guides on how to find niche directories and directories providing free links).
Are you saying that directories are not important (at all) nowadays and that no one should bother?
I mentioned (and insisted upon) directories as a start. I mentioned paid-for and reciprocal links from directories because I had already tried getting free links, with meagre results.
I also noticed that a lot of directories nowadays mask the links they provide (link not showing at the bottom of the browser on mouse over, or link showing like: directory/various_stuff_here/as_seen_on_screen.htm). Do such links count at all?
Please accept my apologies for the length of this post.