
Robots.txt


5 replies to this topic

#1 bumble

Posted 14 April 2004 - 12:54 PM

Hi everyone

A little while ago I created a robots.txt file and put it at the root of my website to stop some pages from being indexed. However, those pages are still being indexed. Can you please confirm that I have the correct format:

User-agent: *

Disallow: /site_info/contact.htm
Disallow: /site_info/contact_confirmation.htm
Disallow: /site_info/privacy_policy.htm



Many thanks <--This is not in the robots.txt :o)

#2 SearchRank

Posted 14 April 2004 - 01:00 PM

If you are waiting for those pages to be removed from the index, that may take a while. Have you looked at your logs to see whether these pages are still being fetched? Have you compared the page info in the indices to see whether it matches what you currently have? What I am trying to get at is how you know that the engines are still fetching these pages.

#3 bumble

Posted 14 April 2004 - 01:06 PM

I had these pages in the robots.txt before I even put them online. I know Google is still indexing them: when I search for site:www.mydomain.com, it shows that the pages were updated only a couple of days ago.

#4 qwerty

Posted 14 April 2004 - 01:13 PM

Is that blank line after
User-agent: *
actually in the file, like you have it in your post? If so, you need to remove it. Quoting from http://www.robotstxt...sion-admin.html :

...you may not have blank lines in a record, as they are used to delimit multiple records.
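
To see the effect of that rule, here is a minimal sketch using Python's standard-library `urllib.robotparser`, which follows this strict reading of blank lines. The helper name `is_blocked` is made up for illustration, and other crawlers may parse the file more leniently, so treat this as a quick local check rather than a statement about how any particular search engine behaves.

```python
from urllib.robotparser import RobotFileParser

# The file as posted, including the blank line after the User-agent line.
POSTED = """\
User-agent: *

Disallow: /site_info/contact.htm
Disallow: /site_info/contact_confirmation.htm
Disallow: /site_info/privacy_policy.htm
"""

# The same record with the blank line removed.
FIXED = """\
User-agent: *
Disallow: /site_info/contact.htm
Disallow: /site_info/contact_confirmation.htm
Disallow: /site_info/privacy_policy.htm
"""


def is_blocked(robots_txt: str, path: str, agent: str = "*") -> bool:
    """Return True if this parser considers `path` disallowed for `agent`.

    Hypothetical helper for illustration; it only reflects how
    urllib.robotparser reads the text, not any real crawler.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, path)


if __name__ == "__main__":
    for label, text in (("posted", POSTED), ("fixed", FIXED)):
        blocked = is_blocked(text, "/site_info/contact.htm")
        print(f"{label}: blocked={blocked}")
```

With this parser, the blank line ends the record right after `User-agent: *`, so the `Disallow` lines that follow are not attached to any user agent and are ignored.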



#5 bumble

Posted 14 April 2004 - 01:38 PM

Thanks qwerty

I did have a blank line in there. That explains it.

What will happen now? Will Google remove those pages the next time it reads my robots.txt?

#6 qwerty

Posted 14 April 2004 - 01:49 PM

They should drop out over time, but I don't know how long it will take. If it's urgent, Google has the "automatic URL removal system". I've never used it, but you might want to give it a try.

I didn't find anything similar in Yahoo, except a way to remove a site from their directory, and you obviously don't want that.



