
Prevent SE Spider From Indexing A Website Page


7 replies to this topic

#1 blue

blue

    HR 3

  • Active Members
  • 77 posts

Posted 01 September 2003 - 01:37 AM

Hello,
I am looking to prevent SE spiders from crawling and indexing one of the pages on my web site. Does anyone know of anything I can put in my HTML to do this? I ran across this, <meta name="MSSmartTagsPreventParsing" content="TRUE">
Will this prevent an SE spider from crawling/indexing a page? Any input is appreciated.
-Blue

#2 Mel

Mel

    HR 5

  • Active Members
  • 353 posts

Posted 01 September 2003 - 02:51 AM

Hi Blue:

You can put a robots meta tag in the head section of the page you do not want to be spidered, like this:
<META NAME="ROBOTS" CONTENT="NOINDEX">

or there is a special tag that only works for Googlebot:
<META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">

#3 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,315 posts

Posted 01 September 2003 - 08:50 AM

Your best bet is to use the robots exclusion in your robots.txt file. Check out RobotsTxt.org for more details.

The meta robots tag is known to be ignored by some spiders.

Jill
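For anyone following along, here is a rough sketch of what a robots.txt exclusion for a single page might look like, checked with Python's standard urllib.robotparser module. The path /private-page.html, the host www.example.com and the agent name ExampleBot are placeholders, not details from this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; /private-page.html is a placeholder path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private-page.html
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())

# Ask whether a compliant robot (placeholder name) may fetch each URL.
may_fetch_private = rules.can_fetch(
    "ExampleBot", "http://www.example.com/private-page.html"
)
may_fetch_home = rules.can_fetch("ExampleBot", "http://www.example.com/index.html")

print("private page:", may_fetch_private)
print("home page:", may_fetch_home)
```

The file itself would live at the root of the site (for example http://www.example.com/robots.txt). It only has an effect on robots that choose to honor it.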

#4 Alan Perkins

Alan Perkins

    Token male admin

  • Admin
  • 1,559 posts
  • Location:UK

Posted 01 September 2003 - 08:56 AM

I ran across this, <meta name="MSSmartTagsPreventParsing" content="TRUE">
Will this prevent an SE spider from crawling/indexing a page?

No.

You should either use the robots meta tag (as described by Mel above) or the robots.txt protocol. Details of both can be found at http://www.robotstxt.org/

#5 blue

blue

    HR 3

  • Active Members
  • 77 posts

Posted 01 September 2003 - 03:49 PM

robotstxt.org was very helpful. This is what I am going to place in the head section of my page:

<META NAME="ROBOTS" CONTENT="NOINDEX">

<META NAME="ROBOTS" CONTENT="NOFOLLOW">

<META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">

Is it ok to use all three, or will it cause a problem?

#6 qwerty

qwerty

    HR 10

  • Moderator
  • 8,287 posts
  • Location:Somerville, MA

Posted 01 September 2003 - 04:17 PM

If you're going to use a robots meta tag, I believe the correct version is to combine your first two examples:

<meta name="robots" content="noindex, nofollow">
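To illustrate why one combined tag is enough, here is a minimal sketch of how a parser might split the CONTENT value into individual directives. The function name parse_robots_content is made up for this example, and the assumption that directives are comma-separated and case-insensitive reflects the common convention; real spiders may tokenize differently.

```python
def parse_robots_content(content: str) -> set[str]:
    """Split a robots meta CONTENT value into lowercase directives.

    Hypothetical helper: assumes comma-separated, case-insensitive values.
    """
    return {part.strip().lower() for part in content.split(",") if part.strip()}


lower_spaced = parse_robots_content("noindex, nofollow")
upper_tight = parse_robots_content("NOINDEX,NOFOLLOW")

print(sorted(lower_spaced))
print(lower_spaced == upper_tight)
```

Under those assumptions, the spacing and letter case in the tag make no difference to which directives are read.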

#7 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,315 posts

Posted 01 September 2003 - 04:41 PM

Be sure to use the robots.txt file as well.

Jill

#8 Alan Perkins

Alan Perkins

    Token male admin

  • Admin
  • 1,559 posts
  • Location:UK

Posted 01 September 2003 - 04:48 PM

robotstxt.org was very helpful. This is what I am going to place in the head section of my page:

<META NAME="ROBOTS" CONTENT="NOINDEX">

<META NAME="ROBOTS" CONTENT="NOFOLLOW">

<META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">

Is it ok to use all three, or will it cause a problem?

You just need this:

<META NAME="ROBOTS" CONTENT="NOINDEX,NOFOLLOW">

Forget the Googlebot tag. That does something slightly different: it allows indexing but prevents the cached copy of the page from being shown. You want to disallow indexing.

The meta robots tag does not prevent robots from accessing your page; it prevents compliant search engines from indexing it, which is a subtle but important difference. Use the robots.txt file to prevent compliant robots from accessing your pages.
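The difference between accessing and indexing can be sketched with a toy model of a compliant crawler, using only Python's standard library. Everything named here (RobotsMetaFinder, crawl_decision, ExampleBot, the example.com URL and the page markup) is hypothetical and only illustrates the order of the two checks; it is not how any particular search engine is implemented.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)  # HTMLParser lowercases tag and attribute names
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            for part in (attr.get("content") or "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def crawl_decision(robots_txt, url, html, agent="ExampleBot"):
    """Return (may_fetch, may_index) for a hypothetical compliant crawler."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(agent, url):
        # The page is never downloaded, so its meta tags are never read.
        return (False, None)
    finder = RobotsMetaFinder()
    finder.feed(html)
    return (True, "noindex" not in finder.directives)


PAGE = '<html><head><META NAME="ROBOTS" CONTENT="NOINDEX,NOFOLLOW"></head></html>'
URL = "http://www.example.com/private-page.html"

meta_only = crawl_decision("User-agent: *\nDisallow:\n", URL, PAGE)
robots_txt_block = crawl_decision(
    "User-agent: *\nDisallow: /private-page.html\n", URL, PAGE
)
print(meta_only)
print(robots_txt_block)
```

In this model the meta tag is only seen when robots.txt lets the page be fetched, so the two mechanisms answer different questions: robots.txt decides whether the page is requested at all, and the meta tag decides what happens to a page that was requested.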



