
New Google Patent Application


30 replies to this topic

#16 laura

laura

    sfgirl

  • Active Members
  • 261 posts
  • Location:Beautiful San Francisco

Posted 01 July 2005 - 05:12 PM

Google can probably use info collected from people who have their toolbar installed to find out who is bookmarking.

So then, would bookmarking by clicking a "bookmark this page" link on a page be counted too (rather than bookmarking through your browser menu)?

#17 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,312 posts

Posted 01 July 2005 - 06:52 PM

Google has a new bookmark tool where they would get that info. I'm not sure if they could actually collect bookmark info from your browser just by having the toolbar installed.

At least I hope not...but maybe...?

#18 xan

xan

    HR 3

  • Active Members
  • 61 posts

Posted 02 July 2005 - 02:38 PM

I blogged about the patent, if anyone's interested. I just don't think it's a big deal or that it undermines any sites either. It's also just an application for now.

http://spaces.msn.com/members/search-scien...g!262.entry

Edited by xan, 02 July 2005 - 02:54 PM.


#19 redsonia!

redsonia!

    HR 5

  • Active Members
  • 470 posts
  • Location:Minnesota

Posted 02 July 2005 - 05:26 PM

Welcome, smakyyy! And you're right. There is a lot of info in this forum. Sit back and try to absorb it all!

#20 Bill Slawski

Bill Slawski

    HR 4

  • Active Members
  • 117 posts
  • Location:Newark, Delaware, USA

Posted 02 July 2005 - 05:47 PM

Hi Xan,

Might as well include the update to the patent, from this last Thursday, in your blog post:

Systems and methods for determining document freshness

And, as for the bookmarks and favorites, and how Google would get that information, it would possibly be through the bookmark manager described in this followup patent application from June 16th:

Methods and systems for personalized network searching

The Freshrank application above is very short, and informative. The Bookmark manager application comes out a little invasive to me.

#21 BobetteKyle

BobetteKyle

    HR 6

  • Active Members
  • 889 posts
  • Location:Near St. Louis, Missouri

Posted 03 July 2005 - 12:10 AM

QUOTE
The Freshrank application above is very short, and informative.
And "lawyered."

So, basically Freshrank would be determined by some combination of the last time the page was updated and/or the freshness of linking documents (the freshness of which, in turn, would be determined in this same way), possibly factoring in the time each inbound link was established? And that would then be incorporated into the ranking algo.
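As a rough illustration of the recursion described above (and not the method in the patent application itself), here is a minimal Python sketch. It assumes a page's freshness is a blend of its own recency and the average freshness of the pages that link to it. The function name `freshness_scores`, the 30-day scale, the `damping` weight, and the fixed number of rounds are all made up for illustration, and the time each inbound link was established is left out.

```python
from datetime import date


def freshness_scores(last_updated, inbound_links, today, damping=0.5, rounds=20):
    """Blend each page's own recency with the mean freshness of its linking pages.

    last_updated: dict mapping page -> date of last update
    inbound_links: dict mapping page -> list of pages that link to it
    All weights and scales here are illustrative assumptions.
    """
    # Own recency: 1.0 for a page updated today, shrinking toward 0 as it ages.
    own = {p: 1.0 / (1.0 + (today - d).days / 30.0) for p, d in last_updated.items()}
    scores = dict(own)
    for _ in range(rounds):
        updated = {}
        for page in own:
            linkers = [src for src in inbound_links.get(page, []) if src in scores]
            if linkers:
                # Freshness of linking pages is itself computed the same way.
                link_part = sum(scores[src] for src in linkers) / len(linkers)
            else:
                link_part = own[page]
            updated[page] = (1 - damping) * own[page] + damping * link_part
        scores = updated
    return scores


# Example: "b" and "c" are equally old, but "b" is linked from the fresh page "a".
pages = {"a": date(2005, 7, 1), "b": date(2004, 7, 1), "c": date(2004, 7, 1)}
print(freshness_scores(pages, {"b": ["a"]}, today=date(2005, 7, 1)))
```

In this toy setup an old page that is linked from a recently updated page ends up with a higher score than an equally old page with no inbound links, which is the behaviour the summary above describes.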

#22 Bill Slawski

Bill Slawski

    HR 4

  • Active Members
  • 117 posts
  • Location:Newark, Delaware, USA

Posted 03 July 2005 - 03:16 AM

True, Bobette. Enough can't be said for the patent application's remarkable sleep-inducing qualities.

And looking at the pictures does help in a lot of these patents. (At least it helps me.)

This followup from Monika Henzinger cuts out all of the personalization language, the spam fighting talk, considerations of domain registration information, popularity trends in searches and links, matches between anchor text and the subject of the content of the page, whether the links are from trusted or authority sites, frequency of changes over time in links and content, and many other issues that were raised in the historical data patent application.

I think going through it carefully and understanding it makes it a good stepping stone to understanding the longer and more complex application. The longer application has a lot of authors, and this application from Dr. Henzinger reads somewhat as if it might be part of her contribution to that application if it were condensed out of the longer document. Some of the ideas in her stand-alone invention appear to be expanded upon a lot in the longer piece.

The bookmark manager application is co-authored by another one of the inventors listed in the Historical Data application, Steve Lawrence. If a bookmark management program like the one described there is developed, it would make the "User Maintained/Generated Data" section of the Historical Data patent application more likely to become reality.

#23 robwatts

robwatts

    HR 5

  • Active Members
  • 308 posts
  • Location:London - Hertfordshire

Posted 03 July 2005 - 05:31 AM

I think that reading these patents can give some valuable insights into IR as a whole and certainly provoke a thought or two as to other things that we should be considering when we work with a domain.

That said, many of these patents (and they seem to pop up almost daily these days) tend to cover so many bases that if you took too much on board and believed everything that was written, you'd be forgiven for getting a little confused as to what was doing what, where and when in any algo process.

What is apparent, in a stating-the-obvious kinda way, is that technology is progressing, the SEs are certainly making use of these improvements, and greater use will most certainly be made of user behaviour data. How this is used, of course, is the million dollar Q. Just because a patent says x, y or z doesn't mean that the answer is laid out for all to see; it's simply an indicator of what they might be doing with something and how it could be applied. I do smile at the various posts that seem to imply that because it's been written (in a patent) it must be what they will be using soon, when the fact is that they could be using any number of things, and are certainly not going to be telling us about it! I think it might just be a disinformation thing: throw in more stuff, muddy the waters a little more, marginalise SEO. Freshrank, Trustrank, Pagerank, Linkrank, HITS ...

Am I a little thick in my belief that the purpose of a patent is to protect some intellectual idea/design? The funny thing with regard to an SE algo type patent is that, realistically, how could anyone really know, without gaining access to the algo itself, whether competitor new-search-engine.com was using the method described in the patent or not? In other words, what is the point in filing the patent application when, for all intents and purposes, it's more or less impossible to police? What am I missing here?

Is it the case that individual aspects of the patent, e.g. the mining of bookmark data, would effectively mean that no other company could ever use this particular method in any algo ranking score? If so, then this seems a little severe to me, if not downright cheeky even, and would certainly relegate my disinformation suspicions.

#24 xan

xan

    HR 3

  • Active Members
  • 61 posts

Posted 03 July 2005 - 07:17 AM

QUOTE(bragadocchio @ Jul 2 2005, 11:47 PM)
Hi Xan,

Might as well include the update to the patent, from this last Thursday, in your blog post:

Systems and methods for determining document freshness

And, as for the bookmarks and favorites, and how Google would get that information, it would possibly be through the bookmark manager described in this followup patent application from June 16th:

Methods and systems for personalized network searching

The Freshrank application above is very short, and informative.  The Bookmark manager application comes out a little invasive to me.


Hi there,

I am aware of new developments, but I hardly ever include anything that hasn't been implemented and tested, or peer reviewed (i.e. computer scientists at conferences checking prototypes first hand and so on). A lot of my readers are computer scientists, and writing about patent applications and things which have not yet been verified wouldn't be in line with expectations, and I would quickly get lots of mail about it! It's fine to think about what this could mean for the future of search and so on, but the thing is that it is hearsay and fabrication until checked first hand.

I also had a read and thought that it was again nothing new. It shows that there is work going on in page freshness and personalization, but I am sure that you already knew that, so it doesn't change much as far as new information goes really.

It does say "The present application is a continuation-in-part of U.S. application Ser. No. 10/748,664 (Attorney Docket No. 0026-0058), entitled "Information Retrieval Based on Historical Data" and filed Dec. 31, 2003", so it clearly points back to the historical data application.
Corpus freshness is an issue which is not new, and has been worked on a great deal. The bigger the corpus, the less fresh it is. The reason this method has been worked on is that different documents have different freshness scores. For example, 6-month-old news is too old, but a 6-month-old document on new physics experiments would be considered recent. There's also the issue of things that were considered old news becoming relevant again, amongst other things.
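To make the "same age, different freshness" point concrete, here is a small Python sketch. It assumes freshness decays exponentially with a half-life that depends on the kind of document. The category names and half-life values are invented for illustration and do not come from any search engine or from the patent application.

```python
# Hypothetical half-lives in days: how long until a document of this kind
# is considered half as fresh as when it was published.
HALF_LIFE_DAYS = {
    "news": 7,
    "physics_experiment": 365,
}


def freshness(age_days, category):
    """Exponential decay: 1.0 when new, 0.5 after one half-life, and so on."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS[category])


# Two documents of the same age (about six months) score very differently.
print(freshness(180, "news"))
print(freshness(180, "physics_experiment"))
```

With these made-up numbers the six-month-old news item scores close to zero while the six-month-old physics write-up still scores well above one half. A real system would also need some way to handle old material becoming relevant again, which this sketch ignores.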

A workshop from 2002 gives a good overview of the main problems in IR, including this problem of freshness. It is written in a simple way, but was drafted by some pretty good people in the field:
http://sigir.org/for...challenges2.pdf

The bookmark thing is being done by a lot of different companies and you'll see a lot more of these being used for different things. Personally, I am not too concerned about them collecting my info. As long as it makes things easier for me to do, cool.

IBM was granted 3,277 patents in 2004 alone. I could be wrong, but if I remember rightly Google had 7. MSN filed 2000 in 2004, and expects to file 3000 in 2005 (check shareholder reports). My point is that many other companies are investing in mass innovation, and MSN will pull some nice things out of the bag. Some of the things in the Google applications are not really ground breaking, but sensible. It does give you an idea of where things are heading for sure. To me the other things that Google have released are much more impressive and interesting. I do understand however that they don't have anything to do with SEO. My concern is simply that people forget about the rest of the industry and this means that they might not be prepared for changes, through focusing on smaller things. No doubt things will change with Google IR methods, and I would hope so, or they would quickly go out of business!

I won't always blog about these things, but it doesn't mean that I won't discuss them in forums. The blog isn't especially SEO oriented or for Uni students or computer scientists either, but just for anyone interested in any aspect of IR. I don't make a special effort to cater for any demographic group in particular. This is because I think that if you are in the business of IR, you should embrace all aspects, they all intertwine!

It is interesting to me to read your responses to changes like that and also to discuss them with you, so thanks! I enjoy contributing to the forum when I can, and appreciate what I learn here.

#25 Randy

Randy

    Convert Me!

  • Moderator
  • 17,540 posts

Posted 03 July 2005 - 07:56 AM

Very well said Rob.

I too wonder how much of these patents are purposely designed to be part of a disinformation campaign. The search engines surely realize that they do not live in a vacuum, and know people read those patent applications trying to discern what is happening today and may be happening tomorrow.

Honestly, much of the overly general stuff included in some of those patent apps seems silly to me. Almost as bad as other absurd patent apps claiming some right to control license to methods of swinging on a swing.

#26 Jill

Jill

    High Rankings Advisor

  • Admin
  • 32,312 posts

Posted 03 July 2005 - 08:51 AM

Well, I've never been a big fan of reading patents and other technical documents, however, when I read that last one which I wrote a bit about a few weeks ago, I enjoyed it because there was a lot of common sense in it as far as what Google were currently doing. Stuff that we had been saying in the forum for years was basically verified by stuff that they could possibly be doing according to the patent.

That was cool!

Next up will be for me to write about the TrustRank paper. That was another really interesting one with cool implications for site developers and SEOs.

#27 sonnyyu

sonnyyu

    HR 4

  • Active Members
  • 157 posts

Posted 03 July 2005 - 09:48 AM

Hi Xan;-

It is a very informative post and a very nice blog. Thanks a lot for shedding some light on the inside of SEs.

Happy holiday!



#28 Scottie

Scottie

    Psycho Mom

  • Admin
  • 6,293 posts
  • Location:Columbia, SC

Posted 03 July 2005 - 03:01 PM

Ultimately, the engines aren't trying to come up with the best algo... they are trying to find mathematical ways to determine what HUMANS would think is a good, relevant site.

As interesting as all these papers are... the practical side of working with a site still dictates that you try to create a "real", useful site that is worth linking to AND visiting and you should naturally do well with any algo changes that are enacted.

It seems a lot of these little nuances are most interesting to people who are trying to find a way to achieve the rankings without really coming up with a great site... cause that's hard work and it takes a long time.

I'm not saying that it's not interesting to follow these potential changes, just that in putting them into practical use, great sites will naturally get bookmarked often and visited a lot as well as linked to by relevant industry authority sites... so ... does it really change how you work?

#29 xan

xan

    HR 3

  • Active Members
  • 61 posts

Posted 03 July 2005 - 04:10 PM

Scottie,

I couldn't agree more.

#30 Bill Slawski

Bill Slawski

    HR 4

  • Active Members
  • 117 posts
  • Location:Newark, Delaware, USA

Posted 03 July 2005 - 04:44 PM

QUOTE
It seems a lot of these little nuances are most interesting to people who are trying to find a way to achieve the rankings without really coming up with a great site... cause that's hard work and it takes a long time.


Great content, and a great site are the best way to achieve rankings. I have no argument there, Scottie.

But, I do find myself very interested in materials that come directly from the search engines, and I think that it is important to pay attention to them. In many ways, a patent application from Google, or a white paper on the Google Labs site, or in the Stanford database is one of the best ways to see what the folks at the search engines are thinking about, and what the future might bring us.

Chances are very good that an excellent web site, that doesn't have any spidering issues or technical challenges to overcome will continue to rank well in search engines regardless of what changes come about.

Xan mentions the value of peer review in Information Retrieval studies. The information that comes from forums, blogs, and articles in some of the online publications that discuss search engines doesn't stand up to that type of rigorous scrutiny.

I agree with Jill that the historical data patent application was fun because it discussed a lot of things about the ways a search engine could work from a practical stance. It was also informative in that it gave us some insight into what the scientists who work on search engines are thinking about. Paying attention to this type of stuff doesn't have to mean that we are looking for ways to game the system. What it can mean, and does for me, is that we are performing some level of due diligence, risk assessment, and taking a proactive approach to what we do when we do SEO.

For instance, some of the lessons that I got out of the historical data patent application, are grounded in good common sense, which is played out in the application:

1. Look at the content on your pages, and if it needs to be updated, make those changes. Provide material that isn't just helpful and informative, but is also fresh and up to date.

2. Check the links on your pages, and if the content on the other end has changed, make sure that you reflect those changes in the link text you use, and the text around those links. Fix broken links when you can, and if your link is redirected via a permanent redirect, change it so that it points to the right place.

3. Provide material on your page that is valuable enough that authoritative and trusted sites would consider linking to it.

There are more, and frankly, the above are things that I've been trying to do anyway, but it's good to see that Google has indicated that it is considering stuff like this.
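The link check in point 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `audit_links` is a made-up helper, and `fetch_status` stands in for whatever you use to request a URL without following redirects and get back its HTTP status code and `Location` header. Here it is stubbed with a dictionary so the example runs without network access.

```python
def audit_links(links, fetch_status):
    """Suggest an action for each outbound link.

    links: iterable of URLs found on your pages
    fetch_status: callable taking a URL and returning (status_code, location),
                  where location is the redirect target or None (assumed interface)
    """
    actions = {}
    for url in links:
        status, location = fetch_status(url)
        if status in (301, 308) and location:
            # Permanent redirect: point the link straight at the new address.
            actions[url] = ("update", location)
        elif status >= 400:
            # Broken link: fix it or remove it.
            actions[url] = ("fix", None)
        else:
            actions[url] = ("ok", None)
    return actions


# Stubbed responses in place of real HTTP requests (example.com URLs are placeholders).
responses = {
    "http://example.com/page": (200, None),
    "http://example.com/old": (301, "http://example.com/new"),
    "http://example.com/gone": (404, None),
}
print(audit_links(responses, responses.__getitem__))
```

A check like this only covers the mechanical part. Whether the content at the other end has changed, and whether your link text still describes it, still needs a person to look.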

I think that it's more important, when looking at one of these applications, not to think about how the search engine is using it, or might use it, but rather to think about why they might include what they do in the application. Thinking critically about them could even include understanding that some parts might be intentional disinformation, not just for the sake of the folks who work in the SEO field, but also for others who build search engines.

It's great to see someone like Xan here, adding the perspective of a person who works with Information Retrieval. What I like about patent applications, Xan, is that they aren't "hearsay" if we look at them as an indication of what the search engines have found important enough to attempt to claim as intellectual property that they can exclude others from using, or that they are willing to tell others about. Those things may not be what the search engine is actually doing, but in many ways they are something to consider seriously, and pay attention to.

They can provide some insights into what search engines may find important. They can help us possibly understand some of the directions that the search engines are taking. I understand your concern about people possibly missing the bigger picture by focusing on smaller things, but the bigger picture is made of all of the smaller things. By looking at the parts, and trying to see how they might fit into the whole, it's easier to get that glimpse. While I think that it's important to look at something like these patents, I'd also stress that they shouldn't be used as a roadmap on how to build pages and sites. They are potential roadmaps that the search engines might follow. But the chances exist that better ideas come along during the process of trying to implement some of the ideas presented in one of these patent applications. Or even that they are making others think that they are following one direction when they could be following another.

QUOTE
Is it the case that individual aspects of the patent, e.g. the mining of bookmark data, would effectively mean that no other company could ever use this particular method in any algo ranking score? If so, then this seems a little severe to me, if not downright cheeky even, and would certainly relegate my disinformation suspicions.


It is possible to apply for a patent before you are capable of using the invention described within it. And once you do file a patent, it doesn't grant you the right to use the described invention, but rather the ability to exclude others from using it.

QUOTE
I'm not saying that it's not interesting to follow these potential changes, just that in putting them into practical use, great sites will naturally get bookmarked often and visited a lot as well as linked to by relevant industry authority sites... so ... does it really change how you work?


Since I started paying more attention to patent applications, and some of the white papers that seem to go along with them, I've seen a number of things that I could do, and have done, that do change how I work. Often a patent application is a good starting point towards doing more research with white papers, articles, forum threads, technical specifications, and hands-on experimentation.



