Our site is structured so that partners/affiliates have imaginary (virtual) folders, i.e. url/partner/page.htm, where our system automatically identifies the partner and sets various session variables.
What I need to know is whether the robots.txt file can be used to stop spiders recording links in these partner folders. Ultimately the page redirects to just url/page.htm, which could trigger duplicate content warnings because the partner page would have the same content as our standard page.
Would Disallow: /partner/ in the robots file work in this situation or only just for real folders?
Thanks
MadTies
Controlling Robots.txt To Ignore Imaginary Folders
Started by MADTIES, Mar 25 2004 09:21 AM
3 replies to this topic
#1
Posted 25 March 2004 - 09:21 AM
#2
Posted 25 March 2004 - 09:31 AM
"Would Disallow: /partner/ in the robots file work in this situation or only just for real folders?"
It would work in this situation, assuming that when you say "url/partner/page.htm" you mean "domain/partner/page.htm", i.e. that "partner" is a top-level directory.
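For what it's worth, robots.txt rules are matched against the URL path as a plain prefix, so a crawler never checks whether /partner/ exists on disk. Here is a minimal sketch that checks this with Python's standard urllib.robotparser. The host www.example.com and the ExampleBot user agent are placeholders, not anything from the site in question.

```python
from urllib.robotparser import RobotFileParser

# Assumption: "partner" is a top-level virtual folder on the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /partner/
"""


def is_allowed(robots_txt, url, agent="ExampleBot"):
    """Return True if a compliant crawler may fetch url under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/partner/page.htm",  # virtual partner URL
        "http://www.example.com/page.htm",          # standard page
    ):
        print(url, "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked")
```

The parser only compares strings, which is why the rule behaves the same for real and imaginary folders.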
#3
Posted 25 March 2004 - 12:48 PM
I think what he means is that "partner" would be replaced by the actual partner identifier, such that it's different for each affiliate.
Disallowing /url/ would work.
If you can't restrict that far up the directory structure, then you'll need to move your partner level down a level, i.e.
/url/partners/<partner>/page.htm
Then disallow /url/partners/ in your robots.txt
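As a rough illustration of why the common parent folder helps: one Disallow line then covers every partner identifier, however many there are. The sketch below assumes "url" stands for the domain, so the path starts at /partners/. The partner names and www.example.com are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the layout is /partners/<partner>/page.htm at the site root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /partners/
"""

# Hypothetical partner identifiers, for illustration only.
PARTNER_IDS = ["acme", "globex", "initech"]


def blocked_partner_urls(robots_txt, partner_ids, page="page.htm"):
    """Return the partner URLs a compliant crawler would refuse to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    urls = [
        f"http://www.example.com/partners/{pid}/{page}" for pid in partner_ids
    ]
    return [u for u in urls if not parser.can_fetch("ExampleBot", u)]


if __name__ == "__main__":
    for url in blocked_partner_urls(ROBOTS_TXT, PARTNER_IDS):
        print("blocked:", url)
```

Without the shared /partners/ prefix you would need a separate Disallow line for each partner identifier.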
#4
Posted 26 March 2004 - 03:09 AM
An alternative would be to have
url/external/partner/page
then disallow robots access to
url/external
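One detail worth checking with this layout is the trailing slash. Because matching is by prefix, "Disallow: /external" also catches any other path that merely starts with those characters, while "Disallow: /external/" only catches what is inside the folder. A small sketch, again using urllib.robotparser; the /external-links.htm page is a hypothetical example, as are the host and user agent.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(disallow_path, url_path):
    """Check one URL path against a single-rule robots.txt for all agents."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {disallow_path}"])
    return parser.can_fetch("ExampleBot", "http://www.example.com" + url_path)


if __name__ == "__main__":
    for rule in ("/external", "/external/"):
        for path in ("/external/partner/page", "/external-links.htm"):
            state = "allowed" if can_crawl(rule, path) else "blocked"
            print(f"Disallow: {rule:<11} {path:<24} {state}")
```

Both forms block the partner pages. The version with the trailing slash is the safer choice if other top-level pages might share the prefix.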








