
Googlebot Problem


23 replies to this topic

#16 qwerty

qwerty

    HR 10

  • Moderator
  • 8,287 posts
  • Location:Somerville, MA

Posted 27 December 2003 - 07:54 PM

Actually, that's the case with this forum as well. There isn't a ton of JavaScript, but there is an inline style sheet, and it's fairly long. I know Scottie's done a lot of hacking into this forum, but she hasn't changed that.

<added> I take back part of that. There is rather a lot of scripting in the source code. I just looked at the code behind this page, with just the one post in it, and you have to get halfway down before you hit the content.

#17 Grumpus

Grumpus

    HR 6

  • Active Members
  • 786 posts

Posted 27 December 2003 - 08:02 PM

Yup yup. I never looked at the page source here before. You'll notice that there are only 6,000 pages from these forums in Google. That's a relatively small number considering that I'd assume the base pagerank here is about 5 or 6.

Do these forums get much search traffic on the pages that are in there?

G.

#18 Ron Carnell

Ron Carnell

    HR 6

  • Moderator
  • 959 posts
  • Location:Michigan USA

Posted 27 December 2003 - 11:13 PM

Grumpus, do you have any real evidence that extraneous "bloat" affects spiderability? A lot of people seem to assume a spider likes text towards the beginning of a file, but I've never seen any reason to think that was true. That's a human limitation, not a software one.

As long as the page is less than 100K and all the tags are properly closed, it really shouldn't matter to a software program. I'm not a Google programmer (don't even play one on TV), but it takes just about one heartbeat to throw away everything between <script> and </script> or between <style> and </style>. Whoosh, whoosh, and the only thing remaining is indexable content.

When possible, JavaScript and CSS should be moved to external files for the visitor's benefit. I see no reason to think it should matter to a spider, though.
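
For what it's worth, the "throw away everything between the tags" step described above can be sketched in a few lines of Python. This is only an illustration built on the standard library's html.parser module; the names IndexableText and indexable_text are made up for the example, and nothing here claims to reflect how Googlebot actually works.

```python
from html.parser import HTMLParser


class IndexableText(HTMLParser):
    """Collects text that sits outside <script> and <style> elements."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside a skipped element
        self.chunks = []       # pieces of visible text, in page order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text found outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def indexable_text(html):
    """Return the page text with script and style content discarded."""
    parser = IndexableText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


sample = (
    "<html><head><style>body{color:red}</style>"
    "<script>var a = 1 < 2;</script></head>"
    "<body><p>Hello</p> <p>world</p></body></html>"
)
print(indexable_text(sample))
```

However long the inline style sheet or script is, what comes out the other end is just the text, which is the point being made: position in the file is cheap for software to ignore.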

#19 Grumpus

Grumpus

    HR 6

  • Active Members
  • 786 posts

Posted 28 December 2003 - 09:08 AM

The evidence I have may be a bit outdated. And, even then, the evidence may have been selective in that while it proved the case for Site A, Site B could have a similar situation and the spider(s) would be fine with it.

For example, some HTML editors (I forget which) tend to add a blank line at the top of a page. Edit the page 5 times and there are 5 blank lines up top. 10 times and there are 10 lines. In several instances, we couldn't find anything else wrong with the page/site, so we removed the lines and the spider was there and gobbling it up within the week. Then again, I've seen pages with lots of blank lines up top that are crawled just fine. What was the difference? I have some guesses, but I'm not sure.

On other sites, the complaint has been that the bot won't crawl - or when it does, it only hits a few pages. Or, it'd take the pages (at a lower rate than expected) but they wouldn't rank well. We cleaned out the crap and there was a marked improvement in ranking and most notably, in depth of crawl.

I haven't really done anything in this realm for some time, so it's likely that my information is dated. Spider capabilities - especially with Google - improve on an almost daily basis. That's why I was asking about the traffic you folks get from Google - in an effort to update my understanding to what's going on today.

G.

#20 qwerty

qwerty

    HR 10

  • Moderator
  • 8,287 posts
  • Location:Somerville, MA

Posted 28 December 2003 - 09:13 AM

I think only an admin could give you that information, Grumpus. And I think they're all away for a few days.

#21 SearchRank

SearchRank

    HR 7

  • Active Members
  • 2,333 posts
  • Location:Phoenix, AZ

Posted 28 December 2003 - 09:18 AM

Ron Carnell wrote: "When possible, JavaScript and CSS should be moved to external files for the visitor's benefit. I see no reason to think it should matter to a spider, though."

It is also beneficial for the webmaster because it makes modifying these elements a lot easier than doing it page to page to page.

Links are also an issue. Google has stated that pages should have no more than 100 links on them, and I have heard that they actually prefer pages with links in the 70s or fewer, so I would assume that having more than this would affect spiderability.
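
If anyone wants to check a page against that figure, here is a rough sketch in Python using only the standard library. The 100-link limit is simply the number quoted above, not something verified here, and the names LinkCounter, count_links and over_link_limit are invented for the example.

```python
from html.parser import HTMLParser

LINK_LIMIT = 100  # figure quoted in this thread; treat it as an assumption


class LinkCounter(HTMLParser):
    """Counts <a> tags that carry an href attribute."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # Named anchors (<a name="...">) are not links, so require href.
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.count += 1


def count_links(html):
    """Return the number of hyperlinks found in the HTML string."""
    parser = LinkCounter()
    parser.feed(html)
    parser.close()
    return parser.count


def over_link_limit(html, limit=LINK_LIMIT):
    """True when the page has more links than the given limit."""
    return count_links(html) > limit


page = '<a href="/a">a</a><a name="top">anchor</a><a href="/b">b</a>'
print(count_links(page), over_link_limit(page))
```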

#22 AaronC

AaronC

    HR 2

  • Active Members
  • 20 posts

Posted 30 December 2003 - 08:27 AM

OK, I will try to get that sorted, but looking at my logs, Googlebot hasn't visited the site since the 21st of December. What's wrong there? It was on non-stop for about 2 days and then just never came back.

I'm slowly being overtaken in the search engine. Can you estimate when Googlebot will reappear? There is lots of new stuff for it to index, but it's just not there to do it!
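
For anyone doing the same check by hand, a small Python sketch like the one below can pull the most recent Googlebot hit out of an access log. It assumes the log is in the common Apache "combined" format with the timestamp in square brackets and English month abbreviations, which may not match every server; the function name last_googlebot_visit and the sample lines are made up for illustration.

```python
import re
from datetime import datetime

# Matches the bracketed timestamp, e.g. [21/Dec/2003:10:15:32 +0000]
TIMESTAMP = re.compile(r"\[([^\]]+)\]")


def last_googlebot_visit(lines):
    """Return the latest timestamp of a line mentioning Googlebot, or None."""
    latest = None
    for line in lines:
        if "googlebot" not in line.lower():
            continue
        match = TIMESTAMP.search(line)
        if not match:
            continue
        try:
            seen = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        except ValueError:
            continue  # timestamp not in the assumed format; skip the line
        if latest is None or seen > latest:
            latest = seen
    return latest


# Hypothetical log lines, for illustration only.
sample_log = [
    '1.2.3.4 - - [20/Dec/2003:08:00:01 +0000] "GET / HTTP/1.0" 200 512 "-" "Googlebot/2.1"',
    '1.2.3.4 - - [21/Dec/2003:10:15:32 +0000] "GET /a.html HTTP/1.0" 200 900 "-" "Googlebot/2.1"',
    '5.6.7.8 - - [29/Dec/2003:12:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/4.0"',
]
print(last_googlebot_visit(sample_log))
```

To run it against a real file, pass the open file object in place of the list, since it only needs something that yields lines.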

#23 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,312 posts

Posted 30 December 2003 - 12:59 PM

Grumpus, yes, we get tons and tons of Google traffic to the forum. There's so much traffic here, however, that my logs and stats become a bit unwieldy, so I don't go through them very often. It's just too much trouble.

But I do find that a good amount of traffic finds us through keyword searches at Google.

Jill

#24 Grumpus

Grumpus

    HR 6

  • Active Members
  • 786 posts

Posted 30 December 2003 - 03:46 PM

Nice. Then we can add "Googlebot can now identify and overcome style bloating" to our list of "Things Google Can Do."

Thanks for the info, Jill.

G.



