
How To Block Robots From Indexing Sub-domain With Robots.txt?


3 replies to this topic

#1 SEMSEO

SEMSEO

    HR 3

  • Active Members
  • 73 posts

Posted 31 December 2008 - 12:53 AM

Hi,

I have a website that is framing content from a subdomain i.e.

example.com is framing content from sub.example.com.


What is the correct way to block robots from indexing these sub-domains?

For example, blocking files in directories with robots.txt is easy:

User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/

but how do you disallow a sub domain and various directories and files in this sub domain with robots.txt?

For your info, I am using Joomla and Drupal CMS.


Thanks for your help.

Happy New Year 2009 to all members!


#2 oneofthe3lions

oneofthe3lions

    Paz

  • Active Members
  • 702 posts
  • Location:Spain

Posted 31 December 2008 - 07:03 AM

Each robot looks for a robots.txt file at the root of each individual host. So you need a robots.txt in the root of the subdomain, not just in the domain root.

User-agent: Googlebot
Disallow: /
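
Note that the two lines above name only Googlebot, so other crawlers would still be allowed in. If the goal is to keep all well-behaved robots out of the subdomain, `User-agent: *` is the usual choice. Here is a minimal sketch that compares the two variants locally with Python's standard `urllib.robotparser`. The helper name `can_fetch`, the "Bingbot" agent string, and the sub.example.com URL are only illustrative assumptions, not anything from the original setup:

```python
from urllib.robotparser import RobotFileParser


def can_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in memory, no network access
    return parser.can_fetch(agent, url)


# Variant from the post above: only Googlebot is told to stay out.
GOOGLEBOT_ONLY = """\
User-agent: Googlebot
Disallow: /
"""

# Variant that addresses every robot that honours robots.txt.
ALL_ROBOTS = """\
User-agent: *
Disallow: /
"""

# Hypothetical page on the subdomain, used only for the check.
URL = "http://sub.example.com/page.html"

print(can_fetch(GOOGLEBOT_ONLY, "Googlebot", URL))  # False: Googlebot is blocked
print(can_fetch(GOOGLEBOT_ONLY, "Bingbot", URL))    # True: other robots are not
print(can_fetch(ALL_ROBOTS, "Bingbot", URL))        # False: everyone is blocked
```

Either file would have to be served as sub.example.com/robots.txt to have any effect on the subdomain.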

Happy New Year







#3 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 31 December 2008 - 08:47 AM

How is the subdomain set up? Is it part of the same web space as the main domain, where a request for sub.domain.com ends up pulling content from a /sub/ subdirectory tied to your main hosting space?

If that's the case you should be able to simply place a robots.txt file inside this subdirectory and have it apply only to the subdomain. As Oot3l's said, the search engine spiders will request a robots.txt file from each unique domain, and they see subdomains as unique domains.

The same would be true if the subdomain has its own unique hosting space. Simply place a robots.txt at the root level for the subdomain and the spiders will take care of the rest.

If you're doing any kind of 301 redirect for the subdomain you may need to do something a bit trickier. For example, if your server is set up to forward requests for sub.domain.com to sub.domain.com/sub/ via a 301 redirect, it can end up messing up robots.txt for the subdomain unless you give requests for robots.txt some special handling, or set up the redirect to be a transparent (internal) one.

It should be fairly easy to do in either case, but how to do it may differ depending upon your exact setup, what type of server you're on (*nix or IIS), etc. As a general rule all you need to do is get the robots.txt to sit at the root level of the subdomain address.
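
To make the "root level of the subdomain address" point concrete, here is a small sketch of how a crawler would work out which robots.txt applies to a given page: it keeps the scheme and host and swaps the path for /robots.txt. The function name `robots_url` and the example URLs are assumptions made up for illustration:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt location that governs `page_url`.

    Only the scheme and host are kept, so every distinct host
    (including each subdomain) maps to its own robots.txt.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical pages on the main domain and on the subdomain.
print(robots_url("http://example.com/index.php"))
# http://example.com/robots.txt
print(robots_url("http://sub.example.com/components/page.html"))
# http://sub.example.com/robots.txt
```

So whatever the server does behind the scenes, the file has to be reachable at that second address for the subdomain rules to be picked up.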

FWIW, this is one of those places where Google's Webmaster Tools can come in quite handy. Once you have the subdomain verified you can use WMT to verify your robots.txt is doing what you want it to do.
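
If you want a quick local sanity check before relying on WMT, something along these lines can list which paths a draft robots.txt would block. This is only a sketch using Python's standard `urllib.robotparser`. The `blocked_paths` helper, the shortened rule list, and the sample paths are assumptions for illustration, and real crawlers may interpret edge cases differently:

```python
from urllib.robotparser import RobotFileParser

# Draft rules intended for the subdomain root (a shortened, hypothetical list).
SUB_ROBOTS_TXT = """\
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /tmp/
"""


def blocked_paths(robots_txt, host, paths, agent="*"):
    """Return the subset of `paths` on `host` that `agent` may not fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in memory, no network access
    return [p for p in paths if not parser.can_fetch(agent, f"http://{host}{p}")]


# Hypothetical paths on the subdomain to test against the draft rules.
paths = ["/", "/administrator/index.php", "/cache/a.html", "/articles/post.html"]
print(blocked_paths(SUB_ROBOTS_TXT, "sub.example.com", paths))
# ['/administrator/index.php', '/cache/a.html']
```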

#4 buckmajor

buckmajor

    HR 1

  • Members
  • 2 posts

Posted 15 December 2010 - 11:29 PM

QUOTE(oneofthe3lions @ Dec 31 2008, 09:03 PM)
Each robot looks for a robots.txt file at the root of each individual host. So you need a robots.txt in the root of the subdomain, not just in the domain root.

User-agent: Googlebot
Disallow: /

Happy New Year

Thanks, that was exactly what I needed to know. In Google Webmaster Tools, do I need to add the site URL, verify the sub-domain, and then add the robots.txt file in the sub-folder?

Any help is much appreciated.
CHEERS :)




