Control Over Robots
Posted 24 August 2004 - 10:45 AM
I know a robots.txt file can keep a robot away from pages that you do not want spidered, but is there really any point to robots directives beyond that?
Can robots.txt be beneficial to the speed and depth of a crawl?
Posted 24 August 2004 - 11:01 AM
The one thing I can tell you it definitely does is decrease the number of lines in your error log. The spiders request robots.txt, and if they find it, even if it's a blank file, that's one less 404 error.
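For anyone wondering whether a blank file could accidentally block something, here is a minimal sketch using Python's standard urllib.robotparser module. The bot name and URL are made up for illustration. An empty robots.txt contains no rules, so a parser that follows the convention treats everything as allowed.

```python
from urllib.robotparser import RobotFileParser

# An empty robots.txt: the file exists (so no 404), but it holds no rules.
EMPTY_ROBOTS_TXT = ""

parser = RobotFileParser()
parser.parse(EMPTY_ROBOTS_TXT.splitlines())

# "ExampleBot" and the URL are hypothetical, for illustration only.
allowed = parser.can_fetch("ExampleBot", "http://www.example.com/any/page.html")
print(allowed)  # True: no rules means nothing is blocked
```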
Posted 24 August 2004 - 01:54 PM
<meta name="robots" content="index,follow">
If this is too big a question, can you recommend some reading?
Posted 24 August 2004 - 02:07 PM
The best reason to use the meta tag is when you do not want a file to be indexed but do want its links to be followed (for example, when you move a page from one directory to another). When I move a page, I create a page with the old name and put NOINDEX,FOLLOW in it, so that the next time a spider hits it, the spider sees the page, deletes it from the index, but follows the links to the new page.
This is also good when you do not want to exclude a whole directory, just some of the pages in it.
Another useful case is a test page that you are still working on and have linked to, but do not want spidered until it is completely "done."
After a while the noindex pages are dropped from the indexes, and then the pages can be deleted without causing a 404 error.
It sometimes takes up to a year for all the web indexes to update, after which you can safely drop the pages (assuming the spiders follow the rules).
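To make the NOINDEX,FOLLOW idea concrete, here is a rough sketch of how a well-behaved spider might read the tag, written in Python with the standard html.parser module. The placeholder page, its link, and the helper names (robots_directives, may_index, may_follow) are my own inventions for illustration, not how any particular search engine does it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for item in content.split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())


def robots_directives(html):
    """Return the set of robots directives found in the page, lowercased."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def may_index(directives):
    # With no tag at all, the default is to index the page.
    return not ({"noindex", "none"} & directives)


def may_follow(directives):
    # With no tag at all, the default is to follow the links.
    return not ({"nofollow", "none"} & directives)


# Hypothetical placeholder left at the old address after a move.
OLD_PAGE = """<html><head>
<meta name="robots" content="NOINDEX,FOLLOW">
</head><body><a href="/newdir/page.html">This page has moved</a></body></html>"""

found = robots_directives(OLD_PAGE)
print(may_index(found), may_follow(found))  # False True
```

The spider drops the old page from its index but still follows the link to the new location, which is the behavior described above.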
Posted 24 August 2004 - 09:39 PM
The difference is that the spiders (well-behaved ones, at least) are going to ask for the robots.txt file before they start to spider your site. If they don't find one, that will generate a 404 error in your logs. Enough of them and it starts to get annoying, cluttering up your logs for no good reason.
None of the spiders request META tags beforehand; the well-behaved ones read them when they find them, but they don't go seeking them out separately. So if the META tag isn't there, it won't make any difference in your error logs.
If you're not talking about a blank robots.txt, but rather one that has entries in it, you have a measure of control using the robots.txt file that's more cumbersome to achieve using the META tag. You can easily and quickly specify which spiders you want to allow, which you want to disallow, and which folders and/or individual files you want to allow them to spider or you want them to stay away from.
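As an illustration of that kind of control, here is a small sketch that feeds a sample robots.txt to Python's standard urllib.robotparser and asks what each spider may fetch. The bot names and paths are hypothetical; the file shuts one spider out completely and keeps everyone else away from one folder and one individual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "BadBot" is banned outright; all other spiders
# are kept out of /private/ and away from a single unfinished page.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
Disallow: /drafts/test.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

checks = [
    ("BadBot", "/index.html"),
    ("GoodBot", "/index.html"),
    ("GoodBot", "/private/notes.html"),
    ("GoodBot", "/drafts/test.html"),
    ("GoodBot", "/drafts/other.html"),
]
results = {(bot, path): parser.can_fetch(bot, path) for bot, path in checks}

for (bot, path), ok in results.items():
    print(bot, path, "allowed" if ok else "blocked")
```

Doing the same with META tags would mean editing every page involved, and the tag by itself has no way to single out one spider by name the way a User-agent line does.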