Robots.txt Not Working?
Posted 20 May 2004 - 06:29 AM
I did an allinurl: search this morning, and the site is indeed in Google's index, although the listing shows only a link to the URL, with no title or description for the page (which it does show for all my other pages that are not blocked in the robots.txt file).
Is this what is supposed to happen? Is Google essentially just recognizing that the page exists because it is linked from other pages, but not indexing the content or attributes of the page that would cause it to show up in search results? This is my first time blocking a file this way, so I'm not sure how Google handles it.
Posted 20 May 2004 - 07:53 AM
If you went the other way, using the robots meta tag, the spider would be permitted to go to the document in question, but when it got there it would be explicitly told not to index it. I expect that would keep it from showing up at all.
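To make the meta tag route concrete, here is a rough Python sketch of how a crawler-like script might notice a robots meta tag once it is allowed to fetch the page. The class name, function name, and sample page are made up for illustration, and this is not how Googlebot itself is implemented; it only shows that the directive lives in the page, so the page has to be fetched before the directive can be read.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                directive = part.strip().lower()
                if directive:
                    self.directives.add(directive)


def is_noindex(html):
    """Return True if the page carries a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page, invented for this example.
page = (
    '<html><head><meta name="robots" content="noindex, follow">'
    "<title>Private</title></head><body></body></html>"
)
print(is_noindex(page))
```

Note that this only works if the page is not also blocked in robots.txt; a spider that is told not to fetch the page never gets to see the tag.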
Posted 20 May 2004 - 03:15 PM
But you also have to give it time (or, more accurately, timing). When you exclude a directory in robots.txt and then create a new page in that directory, you are following a specific sequence. However, there is no guarantee that a spider will follow the same sequence, and indeed, it's highly unlikely that it will. Google doesn't check your robots.txt before grabbing every page, or even before every visit. The only way to truly know something is excluded is to add it to robots.txt and then wait until the spider gets a fresh copy of the file. Only then can you safely create a new page and know it won't be indexed.
I would guess you and Googlebot are simply out of sync with each other. Give it time and the spider should adjust.
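The out-of-sync situation can be sketched with Python's standard urllib.robotparser module. The two robots.txt bodies, the /private/ directory, and the example.com URL below are assumptions made up for this sketch; the point is only that a spider still holding an older copy of the file reaches a different answer than one holding the current copy.

```python
from urllib.robotparser import RobotFileParser


def make_parser(robots_txt):
    """Build a parser from robots.txt text, as if a spider had cached it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Copy the spider fetched before the new exclusion was added (hypothetical).
stale_copy = "User-agent: *\nDisallow: /cgi-bin/\n"
# Copy that is on the server now (hypothetical).
fresh_copy = "User-agent: *\nDisallow: /cgi-bin/\nDisallow: /private/\n"

url = "http://www.example.com/private/new-page.html"

stale = make_parser(stale_copy)
fresh = make_parser(fresh_copy)

# A spider working from the stale copy still thinks the page is fair game.
print("stale copy allows fetch:", stale.can_fetch("Googlebot", url))
# Once it has the fresh copy, the same URL is off limits.
print("fresh copy allows fetch:", fresh.can_fetch("Googlebot", url))
```

How long a real spider keeps its copy before refreshing is not something this sketch tries to model.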
Posted 21 May 2004 - 06:04 AM
What I sometimes do, if I don't want somebody finding a page that links to a document, is block both that specific page and the folder the document is in from being indexed. Some documents are not meant for the general public, who wouldn't know what the page topic is about. Some of my clients are very specialised, and only their own staff and direct clients have any use for some documents.