Robots.txt Page


11 replies to this topic

#1 extremecoupons

extremecoupons

    HR 2

  • Active Members
  • 45 posts

Posted 25 March 2004 - 11:18 PM

I have a robots.txt page on my server. Is this ok? There is nothing in the actual file, but I was told about 2 years ago to put one on my server or root directory. Thanks

#2 bobsledbob

bobsledbob

    HR 3

  • Active Members
  • 102 posts
  • Location:Ogden, Utah, USA

Posted 25 March 2004 - 11:28 PM

It doesn't matter either way. Having a blank robots.txt file is the same as not having one at all.

There was some speculation at one point that you should always have a robots.txt, but I believe this is bad information. If you had to have a robots.txt file before a search engine would index your site, there would be a whole lot fewer sites in the search engine databases.
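
To see the blank-file point in practice, here is a minimal Python sketch using the standard library's urllib.robotparser. The bot name ExampleBot and the example.com URL are made up for illustration. It only shows how one parser reads an empty file, not how any particular search engine behaves.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URL and bot name, used only for this illustration.
PAGE = "http://www.example.com/page.html"

# A blank robots.txt has no rules, so nothing is disallowed.
blank_allows = is_allowed("", "ExampleBot", PAGE)

# For contrast, a file that blocks every compliant robot from the whole site.
block_all = "User-agent: *\nDisallow: /\n"
blocked_allows = is_allowed(block_all, "ExampleBot", PAGE)

print(blank_allows, blocked_allows)
```

The blank file should report the page as allowed, while the "Disallow: /" file should report it as blocked, which matches the idea that an empty robots.txt restricts nothing.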

#3 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 26 March 2004 - 08:42 AM

Having or not having a robots.txt file will not have any bearing on whether your site gets crawled or not.

However from the server environment side of things it's something I always recommend you have, even if the file is blank. If for no other reason than having one keeps your server from having to cycle through its 404 Not Found routine. Not a huge deal by any stretch of the imagination, and won't affect your ranking one iota, but it's still the proper approach to take in my view.

#4 balz

balz

    HR 2

  • Active Members
  • 43 posts
  • Location:California

Posted 26 March 2004 - 03:19 PM

I went ahead and added a robots.txt and also a favicon, simply so I won't have to weed through all the 404 errors in my stats page that are generated by not having them.

Let me emphasize that these errors ONLY show up on my stats page, and the lack of a robots.txt or favicon does not affect the user/customer in any way when they browse the site.

b.

#5 flipper

flipper

    HR 2

  • Active Members
  • 45 posts
  • Location:UK

Posted 26 March 2004 - 07:13 PM

Would the above robots.txt file simply stop all the robots listed from crawling the site? Or do some of them still crawl anyway, like the email harvesters for example?
I recently looked at my logs and found one entry as: unknown robot identified as 'crawl'. It takes a lot of bandwidth each day. Does anyone know if this is just one hungry spider or a conglomerate of unidentified robots?

#6 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 26 March 2004 - 10:07 PM

Hard to say for sure without seeing the IP number the "spider" is coming from, Flipper. If your stats software simply doesn't know a spider's name, that might trigger a generic name like the one you've mentioned. Yahoo! Slurp and MSNbot are new spiders you definitely don't want to be blocking, and they might get reported this way.

The problem with the bad bots --email harvesters and such-- is that they don't normally obey a robots.txt exclusion anyway. So it's kind of a waste to throw them into your robots.txt file. They simply ignore it.

If you're so inclined, you can however totally block them from your site via their IP number. But you have to be a bit careful with that to make sure you're not blocking legitimate users or good bots.
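
As a rough illustration of the IP-blocking idea, here is a minimal Python sketch using the standard library's ipaddress module. The addresses are reserved documentation ranges chosen purely as placeholders, and BLOCKED_NETWORKS and is_blocked are hypothetical names. In practice this kind of block is usually set up in the web server or firewall configuration, so treat this as a sketch of the matching logic only.

```python
import ipaddress

# Placeholder blocklist: one whole range and one single address.
# These are documentation-only ranges, not real bad bots.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


for ip in ("203.0.113.45", "198.51.100.7", "198.51.100.8"):
    print(ip, "blocked" if is_blocked(ip) else "allowed")
```

Blocking a whole range (the /24 above) is where the risk of shutting out legitimate users or good bots comes in, so a single-address entry is the more cautious choice when you're unsure who else shares the range.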

#7 blackpool

blackpool

    blackpool

  • Active Members
  • 190 posts
  • Location:Blackpool England UK

Posted 27 March 2004 - 06:19 AM

I think not having a favicon is a good idea as the 404 error you receive tells you instantly that someone has saved your site, so it becomes a "nice" error and saves you having to remember to look in your stats to see if someone likes your site.

#8 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 27 March 2004 - 11:18 AM

You can see the same thing if you do have a favicon.ico file on your server, BP. Simply look at how many times it's been called in your web stats.

It's not an exact measurement of course, but that will get you as close as seeing how many times the file wasn't found.

#9 josh1r

josh1r

    HR 5

  • Active Members
  • 321 posts
  • Location:New York

Posted 31 March 2004 - 04:07 PM

I was just reading the Google Adsense FAQ and it says, "If you have a robots.txt file, you'll need to remove it or add the following two lines to your robots.txt to allow our content bot to crawl your site:

User-agent: Mediapartners-Google*
Disallow: "

Is that right? From my understanding, a spider is allowed unless you specifically block it in your robots.txt. So why would Google be saying that you specifically have to allow it? Or am I misreading something?

Thanks.

#10 qwerty

qwerty

    HR 10

  • Moderator
  • 8,287 posts
  • Location:Somerville, MA

Posted 31 March 2004 - 04:24 PM

Maybe they mean that if you have a directory that you've already disallowed all user agents from by using the * wildcard, you need an entry to specifically allow Mediapartners in.

Either that, or it's just an error on their part. I have an AdSense account on a domain with a fairly long robots.txt, with no mention of that bot at all, and I've had no problems with the spider getting in and serving the ads.
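
The wildcard scenario can be sketched in Python with the standard library's urllib.robotparser. The /private/ directory and the SomeOtherBot name are hypothetical. The bot name is written without the trailing asterisk from the FAQ quote, because as far as I know this particular parser compares agent names as plain text, so take that as an assumption of the sketch. It shows how one parser treats the rules, not how Google's own crawler does.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical file: every robot is kept out of /private/.
WILDCARD_ONLY = """\
User-agent: *
Disallow: /private/
"""

# Same file plus a group for the AdSense bot with an empty Disallow,
# which means "nothing is disallowed" for that bot.
WITH_ADSENSE_GROUP = WILDCARD_ONLY + """
User-agent: Mediapartners-Google
Disallow:
"""

PAGE = "/private/page.html"

before = is_allowed(WILDCARD_ONLY, "Mediapartners-Google", PAGE)
after = is_allowed(WITH_ADSENSE_GROUP, "Mediapartners-Google", PAGE)
other = is_allowed(WITH_ADSENSE_GROUP, "SomeOtherBot", PAGE)

print(before, after, other)
```

With only the wildcard group, the AdSense bot is shut out of the directory along with everyone else. Once it has its own group with an empty Disallow, it is let in while other robots stay blocked.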

#11 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,312 posts

Posted 31 March 2004 - 07:19 PM

Josh, could you please point us to where it says this at Google?

Thanks!

Jill

#12 qwerty

qwerty

    HR 10

  • Moderator
  • 8,287 posts
  • Location:Somerville, MA

Posted 31 March 2004 - 10:58 PM

It's at https://www.google.c...nse/faq#basics2



