
Does Google Index At End Of Month?


14 replies to this topic

#1 Desktop

Desktop

    HR 2

  • Active Members
  • 16 posts

Posted 19 December 2003 - 05:26 PM

I read somewhere that Google does its indexing regularly and at the end of the month. True?

I just finished a few changes to my site -- when could I expect to see those changes reflected in my Google listing?

#2 OldWelshGuy

OldWelshGuy

    Work is Fun

  • Moderator
  • 4,713 posts
  • Location:Neath, South Wales, UK

Posted 19 December 2003 - 05:30 PM

As far as I know and have experienced, Google can visit at almost any time, and it no longer stores files for weeks on end before indexing them.

A few weeks ago, I posted a new site on a Friday at about 10:30 PM, and it was spidered and indexed by early Sunday morning.

So it really is anyone's guess.

#3 qwerty

qwerty

    HR 10

  • Moderator
  • 8,287 posts
  • Location:Somerville, MA

Posted 19 December 2003 - 05:41 PM

They used to have the Dance every four or five weeks, during which all the new and updated information would be shared across the data centres, but nowadays they're just in a constant state of flux, so you never know.

#4 seoRank

seoRank

    HR 2

  • Active Members
  • 46 posts

Posted 23 December 2003 - 09:55 AM

I believe they have two kinds of crawls: one regular crawl as before (once a month, towards the end of the month), and another that constantly crawls news sites and sites with ever-changing 'fresh' content. The constant-crawl freshbot is also called everflux.

I believe sites above PR7 qualify to be patronized by everflux.

#5 Scottie

Scottie

    Psycho Mom

  • Admin
  • 6,293 posts
  • Location:Columbia, SC

Posted 23 December 2003 - 11:08 AM

That information is outdated. Google has moved to a rolling update for the most part.

And everflux seemed to favor sites with content that changed frequently, not PR7 and above sites.

SEORank, you seem to have quite a bit of outdated information, based on a number of your posts here. You may want to read up a little more and get up to speed on the latest knowledge. You may have been reading older information elsewhere. :thumbup:

#6 Grumpus

Grumpus

    HR 6

  • Active Members
  • 786 posts

Posted 23 December 2003 - 11:41 AM

Funny you should ask. I just posted this a little while ago in answer to why a site vanished. The Google History will also answer this question:
http://www.highranki...t=0

Thanks for keeping my word "Everflux" alive, Scottie. Since the constant crawl/updates started, there's been no real need to use it anymore. Nice to hear it one last time before it heads off to the mock-terminology graveyard. :thumbup:

G.

#7 seoRank

seoRank

    HR 2

  • Active Members
  • 46 posts

Posted 25 December 2003 - 03:41 AM

Thank you scottiecl for your advice. I respect it. I'll surely make an effort to read up more on SEO. I do realize that it takes me forever to keep up with the pace of information, considering that I tend to analyse and 'validate' each theory before I believe it, rather than accepting it just because a hundred other SEOs say it is true. You know how the secrecy around the algo leads to so much speculation.

However, I wasn't too sure which of my posts you are referring to. If you can draw my attention to them, I'd be delighted to give you the detailed reasoning behind the logic presented. At the very least, I would stand corrected.

Getting back to the thread.

Consider this -

The way Google's server architecture is built, it has (about) 10,000 'Pentium' servers spread across the world in various IDCs. Each time a fresh crawl happens, some 80GB of data is synchronized with these servers. 80GB is no small amount of data to push to these '10,000' servers. This takes several days, and that is when you notice the Google Dance.

Everflux on the other hand crawls and syncs a much smaller amount of data making the update sync feasible. The very purpose of everflux is to offer mint-fresh content to the users.
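
Taking the figures in the post above at face value (80 GB per update, about 10,000 servers), the scale of that sync can be sketched with some quick arithmetic. This is only a back-of-the-envelope illustration: the aggregate throughput of 1 GB/s is an assumed number that does not come from the thread, the function name is made up for the example, and later replies dispute this picture of how Google updates.

```python
SECONDS_PER_DAY = 86_400


def sync_days(update_gb, servers, aggregate_gb_per_s):
    """Return (total GB pushed, days needed) if every server receives the full update."""
    total_gb = update_gb * servers
    seconds = total_gb / aggregate_gb_per_s
    return total_gb, seconds / SECONDS_PER_DAY


# 80 GB and 10,000 servers are the figures quoted in the post.
# 1.0 GB/s aggregate throughput is a hypothetical value chosen for illustration.
total_gb, days = sync_days(80, 10_000, 1.0)
print(f"{total_gb:,} GB in total, about {days:.1f} days")
```

With a different assumed throughput the number of days changes proportionally, so the result only shows that "several days" is plausible under these inputs, not that it is what actually happened.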

So what are the everflux criteria?

We develop/maintain/SEO several hundred websites. A lot of these have 'daily' changing content, while their PR ranges from PR0 to PR8. I have noticed that content from PR7+ sites invariably gets picked up in a few days, while the others have to wait several weeks, although most sites are already deeply indexed. I do not think this is just coincidence.

I personally feel that the issue with the information is not how outdated it is, but how correct it is.

#8 powerofeyes

powerofeyes

    HR 7

  • Active Members
  • 1,123 posts
  • Location:INDIA

Posted 25 December 2003 - 05:48 AM

QUOTE
I believe sites above PR7 qualify to be patronized by everflux.

Hello SEORank,
There is no criterion like this when it comes to Google. Most of the websites we handle are PR4, PR5, or PR6, and most of them get crawled on every visit by Googlebot. In fact, on one e-commerce site Googlebot spends more than 20 minutes on every visit (almost every day or every other day), and I have many sites with examples like this.
One site has about 70 pages, and almost 50 of them get crawled on every visit by Googlebot. That site is only PR5, and we have not changed any content on it for more than 3 months.
In fact, 90% of the sites we handle see a deep crawl, with all updated pages in the index within 2 weeks, and most of those sites have a PR of only 3 or 4.

QUOTE
Everflux on the other hand crawls and syncs a much smaller amount of data making the update sync feasible. The very purpose of everflux is to offer mint-fresh content to the users.

This is not true. In fact, more than 30 of the sites we handle get deep crawled on every visit by Googlebot, and for most of them the content is as much as 2 years old.
It all depends on the linking structure and the sites which link to you, the quality of those sites, etc., and never depends on PageRank. Oh, and I forgot to mention one site we took on recently: it has PR0 and 26 pages, and after getting quality links from other sites it also gets deep crawled on every visit by Googlebot. It should have a good PR by now, but we will only see that after next month's update.
The nice thing is that we see fresh dates in the SERPs after every Googlebot visit for all the pages on the site (along with the fresh changes we make to pages), and the site came to us only 3 weeks ago. So what are the criteria for everflux? I cannot define them, but there is something Googlebot sees on a site that makes it like that site very much.

VIJAY,

#9 Grumpus

Grumpus

    HR 6

  • Active Members
  • 786 posts

Posted 25 December 2003 - 06:46 AM

Vijay's mostly right. The confusion on his end, as is often the case, is in the terminology. Google never really tells you what it's doing, so it usually comes down to a bunch of folks like us chatting it up and making up words (like "Everflux" and "freshbot").

When Vijay says that he's "deep crawled" every day, he means that his site is crawled "deeply" or, in other words, Googlebot comes by nearly every day and grabs a lot of pages from his site. In reality, though, there is no such thing as the "deep" and "fresh" crawls anymore. (Though, there has been another new change in crawling behavior of late - not sure what it is though).

seoRank - if this were May or June of 2002, you'd be dead-nuts on with your assessment of how Google works. Back then, Google was doing a fresh crawl of a very limited number of PR7 (I believe they started with 8 or 9, actually) sites. In June or July of 2002 (I forget which, but it was early summer) Google lowered its requirements and fresh crawled just about everyone with a halfway decent site who updated with at least some frequency.

Again, I'd like to direct you to the link I posted right before your last post. It gives a sort of timeline of the past couple of years and the terminology we used at the time. "Everflux", as we used it originally, no longer exists. Deep and Fresh Crawls are no more - there's just a spider out there crawling all the time. All crawls are fairly consistently updated, and we haven't seen the old monthly updates (dances) since much earlier this past year.

Whether this recent update was just a reshuffling of data for the new funk Google is doing or whether we'll start to see regular updates again remains to be seen. We're in the middle of another major evolution at Google. The major changes, in the past, have all freaked everyone out for about a month and then things go back to normal until the next major change.

Have a look at that post I linked to and keep in mind that places like this archive things for years and years. HighRankings (the forum) hasn't been around all that long, but there is still some old information here. I suspect you got your information (based upon some of the words you use) from some of the older discussions at Webmaster World. That's cool and it's good to understand the history and evolution of it all to keep it in perspective, but keep an eye on the dates (every post on every forum has a date on it). In SEO, if the post is over a month or two old, it may or may not have some accuracy. If it's a year old, you might as well take the information and put it in that box in the back of your closet with your Atari Game System.

G.

#10 powerofeyes

powerofeyes

    HR 7

  • Active Members
  • 1,123 posts
  • Location:INDIA

Posted 25 December 2003 - 07:24 AM

QUOTE
When Vijay says that he's "deep crawled" every day, he means that his site is crawled "deeply"

Yes, Grumpus, that is what I meant. I never tried to define "deep"; all I meant was "crawling deep into the site", that's all.
I agree with everything in your post. Good post.

VIJAY,

#11 seoRank

seoRank

    HR 2

  • Active Members
  • 46 posts

Posted 25 December 2003 - 08:25 AM

QUOTE
There is never a criteria like this when it comes to Google, Most of the websites we handle are PR4,PR5,PR6. And most of them gets crawled on every visit by googlebot, Infact one Ecom Site Googlebot crawler spends more than 20 minutes on every visit(almost everyday or alternate days)



It's nice to know that most of your sites get crawled by Google every day. However, I have had different experiences. We created an experimental site about 6 months ago, submitted it to Google, and also linked to it from prominent sites. In the first month (of Google finally recognizing the site), it only indexed the home page. That was 2 months back. Last month, it indexed 48 links and 'learned' 23 other links to internal pages. It has now indexed those 71 pages. The site has over 6,000 pages, and we have yet to see Google get to the rest, with the 3rd month now running. BTW, AltaVista had already indexed 3269 of these pages 2 months back. Whatever your definition/experience of the daily/alternate crawl is, it doesn't seem to work for this site.

And the fact that Googlebot is spending 20 mins on your ecom site is probably because your site may be database driven. Googlebot does not want to hit your database too hard and bring your server down. Between hits, it's probably crawling other parts of the web, 'cause if it spent all of 20 minutes on your site alone, it would be some years before it crawled the rest of the 3 Billion pages.

QUOTE
This is not true, Infact more than 30 of the sites we handle gets deep crawled on every visit by googlebot and most of them the content is as old as 2 years

Seems like there is a contradiction with what scottiecl and Grumpus are saying.


QUOTE
It all depends on the linking structure and the sites which link to you, the quality of those sites etc



Doesn't Google call this PageRank?

QUOTE
All crawls are all fairly consistently updated and we haven't seen the old monthly updates (dances) since much earlier this past year.

There seems to be a contradiction here with the other post you refer to, Grumpus. You seem to say that Google has thousands and thousands of times more data than the 312 MB game. How come you agree with the daily/alternate 'total' crawl and update theory at the same time? Or have I missed something here?

Besides, the issue is not just about the 'ability' to crawl the 3 Billion pages. It's also about analyzing the 60 billion links within those, their anchor text and 100 other page ranking criteria, not all of which (I believe) are done on the fly.

#12 powerofeyes

powerofeyes

    HR 7

  • Active Members
  • 1,123 posts
  • Location:INDIA

Posted 25 December 2003 - 09:03 AM

QUOTE
if it spends all of 20 minutes on your site, it will be some years before it crawls the rest of the 3 Billion pages.

I think you are a bit ignorant on this. Have you ever analyzed how many crawlers Google or Inktomi has? In fact, I cannot count them. On most of the sites we monitor we use Deep Matrix live stats, and we get a live view of what Googlebot, Inktomi and other robots are doing on our client sites.
And for your information, Google and Inktomi have an uncountable number of crawlers (sorry, I don't know the exact number) spidering the web. Every time we monitor, we see a different crawler. For Google it looks something like this:

Crawler10.googlebot.com, crawler2.googlebot.com etc etc, Google WAP search : proxy.google.com

For INKTOMI search it looks like this,
lj1139.inktomisearch.com, lj1118.inktomisearch.com etc etc,
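
For anyone who wants to count distinct crawler hosts in their own logs, a minimal sketch along these lines could group hostnames like the ones listed above by search engine domain. It assumes the hostnames have already been extracted from the server log (for example via reverse DNS); the function name and the sample list are made up for the example and are not tied to any particular stats package.

```python
from collections import defaultdict


def group_crawlers(hostnames):
    """Group crawler hostnames by their last two labels (e.g. googlebot.com)."""
    groups = defaultdict(set)
    for host in hostnames:
        parts = host.strip().lower().split(".")
        if len(parts) < 3:
            continue  # skip anything without a host label in front of the domain
        domain = ".".join(parts[-2:])
        groups[domain].add(".".join(parts[:-2]))
    return {domain: sorted(names) for domain, names in groups.items()}


# Sample hostnames taken from the post above.
seen = [
    "Crawler10.googlebot.com",
    "crawler2.googlebot.com",
    "lj1139.inktomisearch.com",
    "lj1118.inktomisearch.com",
]

for domain, names in group_crawlers(seen).items():
    print(domain, len(names), names)
```

Counting distinct hostnames only gives a lower bound on how many crawler machines visited a site; it says nothing about how many a search engine runs in total.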


QUOTE 
It all depends on the linking structure and the sites which link to you, the quality of those sites etc



Doesn't Google call this PageRank?


PageRank is a rank given to your page depending upon the linking structure and the inbound links from other sites. Grumpus referred to a PageRank document in another thread; you will get a good understanding if you read it.

QUOTE
Besides the issue is not just about the 'ability' to crawl the 3 Billion pages. It's also about analyzing the 60 billion links within those, their anchor text and 100 other page ranking criteria, not all of which (I believe) are done on the fly.

Believe it or not, this is how Inktomi (Slurp), Google, FAST, and Scooter work. They don't need months to analyze a site's data in order to rank it; all they need is possibly a week or two. They have sophisticated technology to handle all of this.

VIJAY,

Edited by powerofeyes, 25 December 2003 - 09:10 AM.


#13 Grumpus

Grumpus

    HR 6

  • Active Members
  • 786 posts

Posted 25 December 2003 - 09:15 AM

seoRank - PR is critical to get a big site crawled deeply. Since you say you have inbound links though, it leads me to believe there is something mechanically wrong with your site. I'm pretty good at spotting what's wrong in this area. If you give me the URL of the site, I'll be happy to go have a look. Most of the time it's one minor thing that's fouling the crawl.

G.

P.S. I prefer to do things like this publicly because others can learn from our findings.

#14 qwerty

qwerty

    HR 10

  • Moderator
  • 8,287 posts
  • Location:Somerville, MA

Posted 25 December 2003 - 09:18 AM

Vijay, the question really isn't about how many spiders a given search engine has. As I understand it, they are doing you a favor by intentionally spidering dynamic sites more slowly, because they don't want to overload your server with requests.

They base this on the structure of your URLs. If they've got parameters, the assumption is made that the site is database-driven, so they slow down the rate at which files are requested.
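
The heuristic described above can be sketched roughly as follows. This is only an illustration of the idea as stated in the post, not how any search engine actually implements it: the delay values and function names are hypothetical, and "has parameters" is taken to mean "the URL has a query string".

```python
from urllib.parse import urlsplit

# Hypothetical delays in seconds between requests; not real crawler settings.
STATIC_DELAY_S = 1.0
DYNAMIC_DELAY_S = 10.0


def looks_dynamic(url):
    """Treat any URL with a query string as database-driven."""
    return bool(urlsplit(url).query)


def crawl_delay(url):
    """Pick a slower request rate for URLs that look dynamic."""
    return DYNAMIC_DELAY_S if looks_dynamic(url) else STATIC_DELAY_S


for url in ("http://example.com/about.html", "http://example.com/item.php?id=42"):
    print(url, crawl_delay(url))
```

Under this rule a static-looking URL is fetched at the normal rate, while anything with parameters is spaced out, which would explain long Googlebot visits on a database-driven shop.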

#15 seoRank

seoRank

    HR 2

  • Active Members
  • 46 posts

Posted 25 December 2003 - 02:46 PM

QUOTE
I think you are a bit ignorant on this, Have you ever analyzed how many crawlers google or Inktomi Has, infact I cannot count,



Last I heard, GoogleGuy said they had 40 crawlers. But you seem to miss the point. If you believe that Googlebot is spending 20 dedicated minutes on your site, then do the math for the rest of the web & you'll know what I'm trying to say.
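
That math can be sketched as below, using the numbers mentioned in this thread (3 billion pages, 20 minutes per visit, 40 crawlers). The average of 100 pages per site is an assumption made purely for illustration, as is the simplification that each crawler works on one site at a time; the function name is made up for the example.

```python
MINUTES_PER_YEAR = 60 * 24 * 365


def years_to_crawl(total_pages, pages_per_site, minutes_per_site, crawlers):
    """Years needed if each crawler spends minutes_per_site on one site at a time."""
    sites = total_pages / pages_per_site
    total_minutes = sites * minutes_per_site / crawlers
    return total_minutes / MINUTES_PER_YEAR


# 3 billion pages, 20 minutes and 40 crawlers come from the thread.
# pages_per_site=100 is an assumed average, not a figure from the thread.
estimate = years_to_crawl(3_000_000_000, 100, 20, 40)
print(f"{estimate:.1f} years")
```

Under these assumptions a single pass takes decades, which is the point being made: the 20 minutes are almost certainly not dedicated time.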

QUOTE
It all depends on the linking structure and the sites which link to you, the quality of those sites etc, And never depends on the PageRank



Vijay, you missed my point again. Read your above sentence carefully and you'll know what I was trying to say. (BTW, I know what PageRank is)

Grumpus, I'll PM you the URL as I'm not sure I want to list it here yet. But I can tell you these are 6,000 simple HTML pages, nicely linked from the home page and from the pages the home page links to.

qwerty, you are dead right on. :)



