Robots.txt And Duplicate Content
#1
Posted 12 June 2008 - 12:56 PM
And block www.example.com/category/page1 in the robots.txt file, and only pages under */category/* link to www.example.com/page1, will www.example.com/page1 be blocked from spiders?
This is pretty much what my cart software is doing with all products. I want the spiders to only access www.example.com/foo/page1
not
www.example.com/page1
or
www.example.com/category/page1
Hope that makes sense.
#2
Posted 12 June 2008 - 05:06 PM
You'd be better off excluding both of the pages you want excluded via your robots.txt. It's a trivial addition, and you never know when cart software might put the other URLs in something (even a feed), or when the spiders might discover a link to one of these other pages on someone else's site. Better safe than sorry.
If you wanted to exclude everything in the /category/ directory, you can also exclude just the directory. There's an implied wildcard at the end, so it would exclude every page inside the category subdirectory. Though it's not part of the robots.txt standard, you can do the same sort of thing with filenames too, if the ones you want to exclude at the root level have something unique in their page names. There you would use the * wildcard character.
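To illustrate the implied trailing wildcard on a directory rule, here is a minimal Python sketch using the standard library's `urllib.robotparser`, which does plain prefix matching. The rule set, the example.com URLs and the helper name `is_allowed` are made up for illustration, and real crawlers may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one directory rule, no wildcards.
ROBOTS_TXT = """\
User-agent: *
Disallow: /category/
"""

def is_allowed(path, robots_txt=ROBOTS_TXT, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "http://www.example.com" + path)

if __name__ == "__main__":
    for path in ("/category/page1", "/page1", "/foo/page1"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

The rule only covers URLs that start with /category/. The same page reached at /page1 stays fetchable unless it gets a rule of its own.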
#3
Posted 13 June 2008 - 05:26 AM
The cart in question is [removed to protect the innocent and guilty] but they have a real shitty duplicate content problem.
For example, see the three URLs to the same product below (this is from the demo store).
/ink-eater-krylon-bombear-destroyed-tee-1.html
/apparel/shirts/ink-eater-krylon-bombear-destroyed-tee-1.html
/catalog/product/view/id/120/s/ink-eater-krylon-bombear-destroyed-tee/category/18/
On my site, I block /catalog with robots.txt.
In this example I would also like to block all products in root, so /ink-eater-krylon-bombear-destroyed-tee-1.html
The problem is that there are too many products to block individually. The products in root are only linked to from /catalog/something/something/and_so_on
So I'm hoping (not a word I like!) that blocking /catalog will also block the root products. It sounds like that is the case from your last post.
(Actually there are other places which I have blocked too, but I'm trying to keep things simple here.)
I'm just hoping that there will be a fix for this soon. Magento is the best cart software that I have ever come across, but they need to sort out the duplicate products problem. They are only at version one, so they can be forgiven.
I was thinking of prefixing all products with a code in the URL, such as /hhdgf_aproduct, then blocking anything in root with hhdgf in the URL, but that seems OTT.
Would it seem that I am doing all I can for now or are there any other solutions?
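As a rough way to think through the prefix idea, here is a small Python sketch of how a wildcard-aware crawler might match Disallow patterns (`*` for any run of characters, `$` for end of URL). This is an approximation and not any search engine's actual matcher. The function names are hypothetical, and the /hhdgf_ paths simply reuse the made-up code from above.

```python
import re

def robots_pattern_to_regex(pattern):
    """Turn a Disallow pattern into a regex anchored at the start of the path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then let every '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_blocked(path, disallow_patterns):
    """Return True if any Disallow pattern matches the path."""
    return any(robots_pattern_to_regex(p).match(path) for p in disallow_patterns)

if __name__ == "__main__":
    prefix_rules = ["/catalog", "/hhdgf_"]   # plain prefixes, no wildcard needed
    wildcard_rules = ["/*hhdgf"]             # matches the code anywhere in the path
    for path in ("/hhdgf_aproduct",
                 "/apparel/shirts/hhdgf_aproduct",
                 "/catalog/product/view/id/120/"):
        print(path, is_blocked(path, prefix_rules), is_blocked(path, wildcard_rules))
```

Under these assumptions, a plain `/hhdgf_` prefix rule catches only the root-level copies, while a `/*hhdgf` wildcard rule would also catch the category-path copy that is meant to stay crawlable.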
Edited by Randy, 13 June 2008 - 08:34 AM.
#4
Posted 13 June 2008 - 08:39 AM
This is a case where it's going to be far more desirable to fix the root cause of the problems. Which means getting the software to stop linking to those pages from the category page, and possibly then tweaking the .htaccess so that calls to those pages at the root level produce a 404 or 301.
Have you tried working with the cart manufacturers to see if they have a solution? It really needs to come from there. robots.txt won't be enough for the situation you've laid out.
#5
Posted 14 June 2008 - 04:54 AM
Thanks for the info.
The cart manufacturers don't seem all that interested, going by the forums. Don't get me wrong, the <cart name removed> team are brilliant. They have created the best cart software I have ever seen, and I'm not the only one saying that. It's just the duplicate products that suck a bit. OK, a lot.
I think I'll just try to forget about the problem for now and optimize the hell out of the category pages.
Edited by Randy, 14 June 2008 - 06:46 AM.
#6
Posted 14 June 2008 - 06:45 AM
It's the same thing every other shopping cart out there has had to go through at one time or another, which is understandable since cart developers tend to be code jockeys and not SEOs. They all eventually either come around, or they simply disappear because nobody will pay for software that has such huge and easy-to-fix flaws.
#7
Posted 17 June 2008 - 10:55 AM
Now I'm wondering if products need indexing at all.
The shop in question sells greeting cards (real ones, not ecards), and searches tend to be more general than specific because people like to browse. For this reason I'm thinking that it may be better to just optimize the category pages and block the product pages with robots.txt. Is this a good or bad idea?
Edited by Randy, 17 June 2008 - 12:23 PM.
#8
Posted 17 June 2008 - 12:28 PM
For instance, if you provide a feed to Google Base/Froogle, I don't think you'd want to then block spiders from your product pages. If those product pages get direct traffic from the search engines, I don't think you'd want to start excluding. Etc, etc.
Have you looked at your web stats to see if traffic is landing directly on these product pages? I suspect the volumes will be quite small for each page, so you'll need to dig fairly deep. However if you take all of those 1's and 2's from the various product pages it can often add up to a significant portion of a site's traffic.
You'll want to be sure before you do something like this. It could have some pretty drastic negative effects if you end up cutting off a good traffic and revenue stream.
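One way to add up those 1's and 2's is sketched below in Python. The landing-page counts are invented sample numbers, not real stats, and the `/p/` product prefix and the function name `product_share` are assumptions for illustration. In practice you would feed in an export from your own stats package.

```python
def product_share(landing_hits, is_product):
    """Fraction of search landings that arrive directly on product pages."""
    total = sum(landing_hits.values())
    product = sum(n for path, n in landing_hits.items() if is_product(path))
    return product / total if total else 0.0

# Invented sample data: two busy category pages, fifteen low-traffic products.
SAMPLE_HITS = {"/cards/": 30, "/cards/birthday/": 40}
SAMPLE_HITS.update({"/p/card-%d.html" % i: 2 for i in range(15)})

if __name__ == "__main__":
    share = product_share(SAMPLE_HITS, lambda path: path.startswith("/p/"))
    print("Share of landings on product pages: {:.0%}".format(share))
```

With these made-up numbers, no single product page looks important, yet together they account for a sizeable slice of landings, which is the kind of thing to check before blocking them.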
#9
Posted 19 June 2008 - 08:48 AM