Frames + Robots.txt


7 replies to this topic

#1 marthijn

marthijn

    HR 2

  • Active Members
  • 25 posts
  • Location:near Rotterdam, Netherlands

Posted 23 February 2004 - 03:27 PM

Hi y'all,

Here is my question:

I made a copy of one of my old websites because the customer didn't want to spend too much money.

It is built with frames.

I also put a robots.txt file in the root that contains Disallow rules for every page except index.html.

I put quite a lot of information in the noframes tag and got very good rankings.

Now the weirdest thing has happened: Google has indexed two pages that should not be indexed, because my robots.txt disallows them.
I'm pretty sure no website links to those frames either.

What could be wrong?


Greetz,

marthijn

#2 Alan Perkins

Alan Perkins

    Token male admin

  • Admin
  • 1,559 posts
  • Location:UK

Posted 23 February 2004 - 04:14 PM

This is probably nothing to do with frames.

Sometimes Google indexes a URL without indexing the content at the URL. It sees a link to a URL and indexes it. You can tell pages indexed like this because there is...

a) no title
b) no snippet
c) no cache
d) no page size and
e) no date indexed

...available on the page's listing on a SERP. Only the URL is listed.

When Googlebot attempts to read the page in order to index the content, it will see that the page is protected by robots.txt and remove the URL from its index.

#3 marthijn

marthijn

    HR 2

  • Active Members
  • 25 posts
  • Location:near Rotterdam, Netherlands

Posted 24 February 2004 - 12:42 PM

Alan Perkins wrote: "When Googlebot attempts to read the page in order to index the content, it will see that the page is protected by robots.txt and remove the URL from its index."


The robots.txt file was there from the beginning, because I know those things happen.

They got indexed anyway!!

#4 Alan Perkins

Alan Perkins

    Token male admin

  • Admin
  • 1,559 posts
  • Location:UK

Posted 24 February 2004 - 01:07 PM

I've lost you. You will need to post the URL, I think.

#5 marthijn

marthijn

    HR 2

  • Active Members
  • 25 posts
  • Location:near Rotterdam, Netherlands

Posted 25 February 2004 - 05:09 AM

OK, I'll try to make things clear:


I created a website using frames.
I made a robots.txt file so that no frame pages get indexed. I only want index.html indexed, and nothing else.

So I wrote rules like these in the robots.txt:

Disallow: /basis.html
Disallow: /bottom.html
Disallow: /bottom1.html
Disallow: /center1.html
Disallow: /frames1.html
Disallow: /gerard.html
Disallow: /hoe.html
Disallow: /info.html
Disallow: /knoppen.html
Disallow: /links.html
Disallow: /links1.html
Disallow: /linx.html
Disallow: /main.html
Disallow: /rechts.html
Disallow: /rechts1.html
Disallow: /sendmail.html
Disallow: /top.html
Disallow: /top1.html
Disallow: /waar.php
Disallow: /wat.html
Disallow: /wie.html
Disallow: /vloeren.html
Disallow: /images.html
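
Rules like these can be checked locally before relying on them. The following is only a minimal sketch using Python's standard `urllib.robotparser`. It assumes the file opens with a `User-agent: *` line (the post shows only the Disallow lines) and repeats just three of the rules; the helper name `is_allowed` is made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the real file starts with "User-agent: *". The post shows
# only the Disallow lines, and only three of them are repeated here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /links.html
Disallow: /main.html
Disallow: /rechts.html
"""


def is_allowed(path: str, agent: str = "Googlebot") -> bool:
    """Return True if the rules above let `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


# Report what a well-behaved crawler may fetch.
for path in ("/index.html", "/main.html", "/rechts.html", "/links.html"):
    print(path, "allowed" if is_allowed(path) else "disallowed")
```

This only shows what a crawler that obeys the file is permitted to fetch. It says nothing about whether a search engine lists a URL it has merely seen in a link.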

It went well at first, but later Google indexed:

rechts.html
links.html
main.html

I get this result now:

http://www.google.nl...=UTF-8&filter=0

This really kills my rankings!!

The robots file was there from the beginning.

The URL is [http://www.vloerenleggersbedrijf.nl]

I hope you can help me!

Thanks in advance.

#6 bkernst

bkernst

    HR 5

  • Active Members
  • 385 posts
  • Location:Cape Town, South Africa

Posted 25 February 2004 - 05:39 AM

Maybe you should consider adding the robots meta tag to all the pages that should not be indexed. Whatever has already been indexed will take a while to be dropped.
The webmasters section of the Google website describes exactly how to stop Googlebot specifically. It will mean updating every page once, but it should work. I suggest that you use robots.txt only for the folders you do not want to have indexed.

Bernhard
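
For anyone trying this, here is a rough sketch of how one might verify that a page actually carries the tag, using Python's standard `html.parser`. The sample page, the class `RobotsMetaParser`, and the helper `has_noindex` are all made up for illustration. Keep in mind that a crawler can only read this tag on a page it is allowed to fetch, so a page blocked in robots.txt would never show its meta tag to the crawler.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, nofollow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def has_noindex(html: str) -> bool:
    """Return True if the page asks robots not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical frame page carrying the tag.
FRAME_PAGE = """<html><head>
<meta name="robots" content="noindex, nofollow">
<title>main</title></head><body>...</body></html>"""

print(has_noindex(FRAME_PAGE))
```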

#7 Alan Perkins

Alan Perkins

    Token male admin

  • Admin
  • 1,559 posts
  • Location:UK

Posted 25 February 2004 - 05:45 AM

These are the four pages Google has indexed:

www.vloerenleggersbedrijf.nl/main.html
Supplemental Result - Similar pages

www.vloerenleggersbedrijf.nl/rechts.html
Supplemental Result - Similar pages

www.vloerenleggersbedrijf.nl/links.html
Supplemental Result - Similar pages

Parket schuren, parket onderhoud. - Vloerenleggersbedrijf Gerard
Vloerenleggersbedrijf Gerard. Wie we zijn en wat we doen. Niet voor
niets hebben wij achter ons logo staan: verstand van vloeren. ...
www.vloerenleggersbedrijf.nl/ - 10k - Cached - Similar pages


Note how the first three results have

a) no title
b) no snippet
c) no cache
d) no page size and
e) no date indexed

As I indicated earlier, this content is not indexed. Googlebot has not crawled those URLs and never should. Only the links to them have been seen.
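
The check described above can be written down as a small sketch. The `Listing` type and the `is_url_only` helper are invented for this example. The data is the four results quoted in this post. None of them shows an indexing date, so the sketch looks only at title, snippet, page size and cache link.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Listing:
    """One search result as it appears on the results page."""
    url: str
    title: Optional[str] = None
    snippet: Optional[str] = None
    size: Optional[str] = None
    cached: bool = False


def is_url_only(listing: Listing) -> bool:
    """True when only the URL is listed, i.e. the content was not indexed."""
    return not (listing.title or listing.snippet or listing.size or listing.cached)


LISTINGS = [
    Listing("www.vloerenleggersbedrijf.nl/main.html"),
    Listing("www.vloerenleggersbedrijf.nl/rechts.html"),
    Listing("www.vloerenleggersbedrijf.nl/links.html"),
    Listing(
        "www.vloerenleggersbedrijf.nl/",
        title="Parket schuren, parket onderhoud. - Vloerenleggersbedrijf Gerard",
        snippet="Vloerenleggersbedrijf Gerard. Wie we zijn en wat we doen. Niet voor "
                "niets hebben wij achter ons logo staan: verstand van vloeren. ...",
        size="10k",
        cached=True,
    ),
]

# URLs that were seen in a link but whose content was never crawled.
url_only = [listing.url for listing in LISTINGS if is_url_only(listing)]
print(url_only)
```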

marthijn wrote: "This really kills my rankings!!"


I can assure you, it's not this that's killing your rankings. It must be something else.

#8 marthijn

marthijn

    HR 2

  • Active Members
  • 25 posts
  • Location:near Rotterdam, Netherlands

Posted 25 February 2004 - 12:07 PM

Ah, I get it now.

Thanks for the replies.

My search goes on then... :)



