Any Thoughts On Google Sitemap?


46 replies to this topic

#1 ewc21

ewc21

    Hong Kong SEO

  • Active Members
  • 910 posts
  • Location:Hong Kong, China

Posted 03 June 2005 - 03:04 AM

Has anyone had a look at Google Sitemaps?

www.google.com/webmasters/sitemaps

Probably a good idea so we can better monitor our sites on whether they are crawled by the search engines successfully or not.

Any thoughts?

#2 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 03 June 2005 - 07:29 AM

It's a pretty cool idea, especially for dynamic sites that are having difficulty getting some of their inner pages crawled, isn't it?

Never tried it myself, but I may when I release my next new site in a few weeks just to see if it makes a difference in how quickly some of those deeper pages get spidered.

#3 Shane

Shane

    HR 6

  • Active Members
  • 850 posts
  • Location:Atlanta, GA

Posted 03 June 2005 - 07:52 AM

Wow, that's interesting. Any idea how long it's been around?

#4 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,325 posts

Posted 03 June 2005 - 07:53 AM

This is VERY interesting:

QUOTE
7. I have no easy way to extract URLs from my database of dynamic URLs. How can I generate a Sitemap?

You can use any reasonably large access log (i.e., an Apache log) to submit your URLs. The Sitemap Generator allows you to generate a Sitemap from a list of URLs, from your access logs, or by pointing to a directory path hosting static files corresponding to URLs.


From Google Sitemap Help
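
For anyone curious what the access-log route might look like, here is a minimal sketch. It is not Google's Sitemap Generator. It assumes Common Log Format lines and uses the namespace of the later public sitemaps.org protocol, which may differ from the schema Google expects at launch. The function names and the example.com base URL are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: namespace of the public sitemaps.org 0.9 protocol, not
# necessarily the schema the original Google tool emitted.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Picks the request path and status code out of a Common Log Format line.
LOG_LINE = re.compile(r'"GET (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')


def urls_from_access_log(lines, base_url):
    """Return unique URLs for successful GET requests, in first-seen order."""
    seen = []
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "200":
            url = base_url.rstrip("/") + match.group("path")
            if url not in seen:
                seen.append(url)
    return seen


def build_sitemap(urls):
    """Serialize a list of URLs as a sitemap XML string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        # ElementTree escapes characters such as '&' in the URL text.
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    sample_log = [
        '127.0.0.1 - - [03/Jun/2005:03:04:00 +0000] '
        '"GET /page.php?cat=1&item=2 HTTP/1.1" 200 512',
        '127.0.0.1 - - [03/Jun/2005:03:04:01 +0000] '
        '"GET /missing HTTP/1.1" 404 0',
    ]
    print(build_sitemap(urls_from_access_log(sample_log, "http://www.example.com/")))
```

Only requests that returned 200 are kept, so broken links and error pages in the log do not end up in the Sitemap.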

#5 qwerty

qwerty

    HR 10

  • Moderator
  • 8,295 posts
  • Location:Somerville, MA

Posted 03 June 2005 - 08:38 AM

Would anyone not say that this is at least a step in the direction of Google starting to accept trusted feeds?

#6 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,325 posts

Posted 03 June 2005 - 09:36 AM

It absolutely IS, Bob.

They claim they will always be free though, so it isn't paid-inclusion. But this will go right along with the NPR stuff, etc.

I think it's a great thing, and a great way to get stuff indexed that would otherwise be part of the invisible web. Google is smart to want to index whatever they can get their hands on.


#7 Bernard

Bernard

    HR 5

  • Active Members
  • 302 posts
  • Location:Friendswood, TX

Posted 03 June 2005 - 10:00 AM

QUOTE(Jill @ Jun 3 2005, 08:36 AM)
I think it's a great thing, and a great way to get stuff indexed that would otherwise be part of the invisible web.


I always understood the term 'invisible web' to mean content that was password protected or disallowed from spiders via robots.txt. I didn't see where this XML sitemap info indicated that Google would crawl or index content that was password protected or disallowed by robots.txt.

#8 Matt B

Matt B

    The modem is the message.

  • Active Members
  • 558 posts
  • Location:Canton, OH

Posted 03 June 2005 - 10:01 AM

This is a huge step forward - yes, I believe it is essentially a trusted feed if they are allowing an Apache log to generate URL paths. This opens up thousands, if not millions, more pages to the Google index.
Of course, it does rely on web managers knowing this exists.

Are we going to see another round of "submitting my site" type questions as a result of this?

#9 don1

don1

    HR 4

  • Active Members
  • 173 posts
  • Location:Marlborough, MA

Posted 03 June 2005 - 10:28 AM

Here it is in the news: http://news.com.com/...30744&subj=news Just got it off an RSS feed.

#10 qwerty

qwerty

    HR 10

  • Moderator
  • 8,295 posts
  • Location:Somerville, MA

Posted 03 June 2005 - 10:33 AM

I wonder how common it is for a server to have Python installed, since it doesn't look like this will work without it.

#11 OldWelshGuy

OldWelshGuy

    Work is Fun

  • Moderator
  • 4,713 posts
  • Location:Neath, South Wales, UK

Posted 03 June 2005 - 11:02 AM

The snake is back, then.

This is very good news indeed, and is yet another step forward for Google. This means that all those great content sites on forums and other dynamic sites can now carry Google AdSense ads. It also means that G has upped the ante with regard to session IDs, etc. All great stuff.
Interesting to note (as an aside) that GoogleGuy has now out and out said that the use of '&ID' in any URL effectively renders it useless.

#12 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,325 posts

Posted 03 June 2005 - 11:09 AM

QUOTE
I always understood the term 'invisible web' to mean content that was password protected or disallowed from spiders via robots.txt.


No, although that may be part of it (but I don't even think it is), the invisible web is often content that the engines simply can't index for one reason or another, very often because it's all contained in a database. I don't think that's the only stuff, but that's a big part of the invisible web.

I'm pretty sure that Chris Sherman has written a lot on this subject, and in fact I believe one of his books is about the invisible web.

#13 SpeedyPin

SpeedyPin

    HR 3

  • Active Members
  • 64 posts
  • Location:San Diego, California

Posted 03 June 2005 - 11:37 AM

I've been trying to get in all morning to have a look. Not a chance! LOL

#14 chrishirst

chrishirst

    A not so moderate moderator.

  • Moderator
  • 5,886 posts
  • Location:Blackpool UK

Posted 03 June 2005 - 12:01 PM

The "don't use &id" advice has been added to the guidelines as well.

See this thread
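
As a rough illustration, this is one way to flag URLs that carry an `id` parameter before putting them in a Sitemap. The thread does not say exactly how Google matches the parameter, so the case-insensitive match on the name `id` is an assumption, and `has_id_parameter` is a made-up helper name.

```python
from urllib.parse import parse_qsl, urlsplit


def has_id_parameter(url):
    """Return True if any query parameter is named 'id' (any letter case).

    Assumption: an exact, case-insensitive name match; the real rule the
    search engine applies is not stated in this thread.
    """
    query = urlsplit(url).query
    names = [name for name, _ in parse_qsl(query, keep_blank_values=True)]
    return any(name.lower() == "id" for name in names)


if __name__ == "__main__":
    for candidate in (
        "http://www.example.com/page.php?cat=3&ID=42",
        "http://www.example.com/page.php?cat=3&video=1",
    ):
        print(candidate, has_id_parameter(candidate))
```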

#15 Bernard

Bernard

    HR 5

  • Active Members
  • 302 posts
  • Location:Friendswood, TX

Posted 03 June 2005 - 02:58 PM

QUOTE(Jill @ Jun 3 2005, 10:09 AM)
No, ... the invisible web is often content that the engines simply can't index for one reason or another, ...


That's what I meant - content that they can't index (because it is restricted in some way) as opposed to content they just won't index or haven't found. Your original post made it seem like restricted content might be indexed under this new scheme, but I read elsewhere earlier today that a Google engineer confirmed that Google will respect robots.txt over the XML. I just wanted to clarify that.
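
If robots.txt does win over the Sitemap, it makes sense to leave disallowed URLs out of the Sitemap in the first place. Here is a small sketch of that check using Python's standard `urllib.robotparser`. The `allowed_urls` helper, the "Googlebot" user-agent string, and the example.com URLs are assumptions for illustration, not anything taken from Google's tool.

```python
from urllib.robotparser import RobotFileParser


def allowed_urls(urls, robots_txt, user_agent="Googlebot"):
    """Return only the URLs that the given robots.txt text lets the agent fetch."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /private/\n"
    candidates = [
        "http://www.example.com/index.html",
        "http://www.example.com/private/report.html",
    ]
    print(allowed_urls(candidates, robots))
```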



