New Google Patent Application
#16
Posted 01 July 2005 - 05:12 PM
So then, would bookmarking by clicking a "bookmark this page" link on a page be counted too (rather than bookmarking through your browser menu)?
#17
Posted 01 July 2005 - 06:52 PM
At least I hope not...but maybe...?
#18
Posted 02 July 2005 - 02:38 PM
http://spaces.msn.com/members/search-scien...g!262.entry
Edited by xan, 02 July 2005 - 02:54 PM.
#19
Posted 02 July 2005 - 05:26 PM
#20
Posted 02 July 2005 - 05:47 PM
Might as well include the update to the patent, from this last Thursday, in your blog post:
Systems and methods for determining document freshness
And, as for the bookmarks and favorites, and how Google would get that information, it would possibly be through the bookmark manager described in this followup patent application from June 16th:
Methods and systems for personalized network searching
The Freshrank application above is very short and informative. The bookmark manager application comes across as a little invasive to me.
#21
Posted 03 July 2005 - 12:10 AM
So, basically Freshrank would be determined by some combination of the last time the page was updated and/or the freshness of linking documents (the freshness of which, in turn, would be determined in this same way), possibly factoring in the time each inbound link was established? And that would then be incorporated into the ranking algo.
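To make that reading concrete, here is a minimal Python sketch of the recursive idea. It is only an illustration under my own assumptions, not the formula from the patent application: the `Page` structure, the 30-day half-life, the 50/50 weighting between a page's own age and its inbound links, and the fixed number of iterations are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    days_since_update: float
    # Inbound links as {source page name: age of that link in days}.
    # Every source name is assumed to be a key in the same pages dict.
    inbound: dict = field(default_factory=dict)


def decay(age_days, half_life=30.0):
    """Map an age in days to a value in (0, 1]; 1.0 means brand new."""
    return 0.5 ** (age_days / half_life)


def fresh_scores(pages, own_weight=0.5, iterations=50):
    """Iteratively blend each page's own freshness with the freshness of
    the pages linking to it, discounting older links (hypothetical scheme)."""
    scores = {name: decay(p.days_since_update) for name, p in pages.items()}
    for _ in range(iterations):
        updated = {}
        for name, page in pages.items():
            own = decay(page.days_since_update)
            if page.inbound:
                link_part = sum(
                    scores[src] * decay(link_age)
                    for src, link_age in page.inbound.items()
                ) / len(page.inbound)
                updated[name] = own_weight * own + (1 - own_weight) * link_part
            else:
                # No inbound links: fall back to the page's own freshness.
                updated[name] = own
        scores = updated
    return scores


pages = {
    "news": Page(days_since_update=1),
    "stale": Page(days_since_update=365, inbound={"news": 2}),
    "orphan": Page(days_since_update=365),
}
scores = fresh_scores(pages)
# "stale" and "orphan" are equally old, but "stale" scores higher
# because a fresh page linked to it recently.
```

The loop is there because each page's score depends on the scores of the pages linking to it, so the values are recomputed until they settle, in the same spirit as a PageRank-style iteration.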
#22
Posted 03 July 2005 - 03:16 AM
And looking at the pictures does help in a lot of these patents. (At least it helps me.)
This followup from Monika Henzinger cuts out all of the personalization language, the spam fighting talk, considerations of domain registration information, popularity trends in searches and links, matches between anchor text and the subject of the content of the page, whether the links are from trusted or authority sites, frequency of changes over time in links and content, and many other issues that were raised in the historical data patent application.
I think that going through it carefully and understanding it makes a good stepping stone to understanding the longer and more complex application. The longer application has a lot of authors, and this application from Dr. Henzinger reads somewhat as if it might be her contribution to that application, condensed out of the longer document. Some of the ideas in her stand-alone invention appear to be expanded upon a lot in the longer piece.
The bookmark manager application is co-authored by another one of the inventors listed in the Historical Data application, Steve Lawrence. If a bookmark management program like the one described there is developed, it would make the "User Maintained/Generated Data" section of the Historical Data patent application more likely to become reality.
#23
Posted 03 July 2005 - 05:31 AM
That said, many of these patents (and they seem to pop up almost daily these days) tend to cover so many bases that if you took too much on board and believed everything that was written, you'd be forgiven for starting to get a little confused as to what was doing what, where and when in any algo process.
What is apparent, in a stating-the-obvious kinda way, is that technology is progressing, the SEs are certainly making use of these improvements, and greater use will most certainly be made of user behaviour data. How this is used, of course, is the million dollar Q. Just because a patent says x, y or z doesn't mean that the answer is laid out for all to see; it's simply an indicator of what they might be doing with something and how it could be applied. I do smile at the various posts that seem to imply that because it's been written (in a patent) it must be what they will be using soon, when the fact is that they could be using any number of things, and are certainly not going to be telling us about it! I think it might just be a disinformation thing: throw in more stuff, muddy the waters a little more, marginalise SEO. Freshrank, Trustrank, Pagerank, Linkrank, HITS ...
Am I a little thick in my belief that the purpose of a patent is to protect some intellectual idea/design? The funny thing with regard to an SE algo type patent is that, realistically, how could anyone really know, without gaining access to the algo itself, whether competitor new-search-engine.com was using the method described in the patent or not? In other words, what is the point in filing the patent application when, for all intents and purposes, it's more or less impossible to police? What am I missing here?
Is it the case that individual aspects of the patent, e.g. the mining of bookmark data, would effectively mean that no other company could ever use this particular method in any algo ranking score? If so, then this seems a little severe to me, if not downright cheeky even, and would certainly relegate my disinformation suspicions.
#24
Posted 03 July 2005 - 07:17 AM
Hi there,
I am aware of new developments, but I hardly ever include anything that hasn't been implemented and tested, or peer reviewed (i.e. computer scientists at conferences, checking prototypes first hand and so on). A lot of my readers are computer scientists, and writing about patent applications and things which have not yet been verified wouldn't be in line with expectations, and I would quickly get lots of mail about it! It's fine to think about what this could mean for the future of search and so on, but the thing is that it is hearsay and fabrication until checked first hand.
I also had a read and thought that it was again nothing new. It shows that there is work going on in page freshness and personalization, but I am sure that you already knew that, so it doesn't change much as far as new information goes really.
It does say "The present application is a continuation-in-part of U.S. application Ser. No. 10/748,664 (Attorney Docket No. 0026-0058), entitled "Information Retrieval Based on Historical Data" and filed Dec. 31, 2003", so it clearly points back to that application.
Corpus freshness is an issue which is not new, and has been worked on a great deal. The bigger the corpus, the less fresh it is. The reason this method has been worked on is that different documents are fresh on different time scales. For example, six-month-old news is too old, but a six-month-old document on new physics experiments would be considered recent. There's also the issue of things that were considered old news becoming relevant again, amongst other things.
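As a rough illustration of the news versus physics example, here is a small Python sketch in which each document type has its own freshness half-life. The document types and the half-life values are assumptions picked for the example, not numbers from any paper or patent application.

```python
# Hypothetical half-lives: how many days it takes a document of each
# type to lose half of its freshness. The values are illustrative only.
HALF_LIFE_DAYS = {
    "news": 7.0,
    "research": 365.0,
}


def freshness(age_days, doc_type):
    """Return a freshness value in (0, 1] that decays with age,
    at a rate that depends on the type of document."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS[doc_type])


six_months = 180
news_score = freshness(six_months, "news")          # close to zero
research_score = freshness(six_months, "research")  # still fairly fresh
```

With these assumed half-lives, the same age of six months leaves a news item essentially stale while a research document keeps most of its freshness, which is the distinction described above.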
A workshop report from 2002 gives a good overview of the main problems in IR, including this problem of freshness. It is written in a simple way, but was drafted by some pretty good people in the field:
http://sigir.org/for...challenges2.pdf
The bookmark thing is being done by a lot of different companies and you'll see a lot more of these being used for different things. Personally, I am not too concerned about them collecting my info. As long as it makes things easier for me to do, cool.
IBM was granted 3,277 patents in 2004 alone. I could be wrong, but if I remember rightly Google had 7. MSN filed 2,000 in 2004, and expects to file 3,000 in 2005 (check the shareholder reports). My point is that many other companies are investing in mass innovation, and MSN will pull some nice things out of the bag. Some of the things in the Google applications are not really ground breaking, but sensible. It does give you an idea of where things are heading for sure. To me the other things that Google have released are much more impressive and interesting. I do understand however that they don't have anything to do with SEO. My concern is simply that people forget about the rest of the industry, and this means that they might not be prepared for changes because they are focusing on smaller things. No doubt things will change with Google IR methods, and I would hope so, or they would quickly go out of business!
I won't always blog about these things, but it doesn't mean that I won't discuss them in forums. The blog isn't especially SEO oriented or for Uni students or computer scientists either, but just for anyone interested in any aspect of IR. I don't make a special effort to cater for any demographic group in particular. This is because I think that if you are in the business of IR, you should embrace all aspects, they all intertwine!
It is interesting to me to read your responses to changes like that and also to discuss them with you, so thanks! I enjoy contributing to the forum when I can, and appreciate what I learn here.
#25
Posted 03 July 2005 - 07:56 AM
I too wonder how many of these patents are purposely designed to be part of a disinformation campaign. The search engines surely realize that they do not live in a vacuum, and know people read those patent applications trying to discern what is happening today and may be happening tomorrow.
Honestly, much of the overly general stuff included in some of those patent apps seems silly to me. Almost as bad as other absurd patent apps claiming some right to license methods of swinging on a swing.
#26
Posted 03 July 2005 - 08:51 AM
That was cool!
Next up will be for me to write about the TrustRank paper. That was another really interesting one with cool implications for site developers and SEOs.
#27
Posted 03 July 2005 - 09:48 AM
It is a very informative post and a very nice blog. Thanks a lot for shedding light on the inside of SEs.
Happy holiday!
#28
Posted 03 July 2005 - 03:01 PM
As interesting as all these papers are... the practical side of working with a site still dictates that you try to create a "real", useful site that is worth linking to AND visiting and you should naturally do well with any algo changes that are enacted.
It seems a lot of these little nuances are most interesting to people who are trying to find a way to achieve the rankings without really coming up with a great site... 'cause that's hard work and it takes a long time.
I'm not saying that it's not interesting to follow these potential changes, just that in putting them into practical use, great sites will naturally get bookmarked often and visited a lot as well as linked to by relevant industry authority sites... so ... does it really change how you work?
#29
Posted 03 July 2005 - 04:10 PM
I couldn't agree more.
#30
Posted 03 July 2005 - 04:44 PM
Great content, and a great site are the best way to achieve rankings. I have no argument there, Scottie.
But, I do find myself very interested in materials that come directly from the search engines, and I think that it is important to pay attention to them. In many ways, a patent application from Google, or a white paper on the Google Labs site, or in the Stanford database is one of the best ways to see what the folks at the search engines are thinking about, and what the future might bring us.
Chances are very good that an excellent web site that doesn't have any spidering issues or technical challenges to overcome will continue to rank well in search engines regardless of what changes come about.
Xan mentions the value of peer review in Information Retrieval studies. The information that comes from forums, blogs, and articles in some of the online publications that discuss search engines doesn't stand up to that type of rigorous scrutiny.
I agree with Jill that the historical data patent application was fun because it discussed a lot of things about the ways a search engine could work from a practical stance. It was also informative in that it gave us some insight into what the scientists who work on search engines are thinking about. Paying attention to this type of stuff doesn't have to mean that we are looking for ways to game the system. What it can mean, and does for me, is that we are performing some level of due diligence, risk assessment, and taking a proactive approach to what we do when we do SEO.
For instance, some of the lessons that I got out of the historical data patent application, are grounded in good common sense, which is played out in the application:
1. Look at the content on your pages, and if it needs to be updated, make those changes. Provide material that isn't just helpful and informative, but is also fresh and up to date.
2. Check the links on your pages, and if the content on the other end has changed, make sure that you reflect those changes in the link text you use, and the text around those links. Fix broken links when you can, and if your link is redirected via a permanent redirect, change it so that it points to the right place.
3. Provide material on your page that is valuable enough that authoritative and trusted sites would consider linking to it.
There are more, and frankly, the above are things that I've been trying to do anyway, but it's good to see that Google has indicated that it is considering stuff like this.
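For the link checking in point 2, something like the following Python sketch could sort the results of a crawl of your own outbound links into follow-up actions. The `link_action` helper and its wording are hypothetical; it assumes you already have the HTTP status code (and the `Location` header for redirects) from whatever link checker you use.

```python
def link_action(status, location=None):
    """Suggest what to do with an outbound link, given the HTTP status
    code (and Location header, if any) returned when it was fetched."""
    if status in (301, 308):
        # Permanent redirect: point the link straight at the new address.
        if location:
            return f"update link to {location}"
        return "find new target"
    if status in (404, 410):
        # Not Found or Gone: the link is broken.
        return "fix or remove broken link"
    if 200 <= status < 300:
        # The page loads, but its content may have changed since you linked.
        return "ok, but re-read target and check link text"
    # Temporary redirects, server errors, and anything else.
    return "recheck later"


report = {
    url: link_action(status, location)
    for url, status, location in [
        ("http://example.com/old", 301, "http://example.com/new"),
        ("http://example.com/missing", 404, None),
        ("http://example.com/page", 200, None),
    ]
}
```

A status code can only tell you that a link is broken or permanently redirected. Checking whether the content on the other end has changed still takes a human read.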
I think that when looking at one of these applications, it's more important to think about why they might include what they do in the application than about how the search engine is using it, or might use it. Thinking critically about them could even include understanding that some parts might be intentional disinformation, not just for the sake of the folks who work in the SEO field, but also for others who build search engines.
It's great to see someone like Xan here, adding the perspective of a person who works with Information Retrieval. What I like about patent applications, Xan, is that they aren't "hearsay" if we look at them as an indication of what the search engines have found important enough to attempt to claim as intellectual property that they can exclude others from using, or that they are willing to tell others about. Those things may not be what the search engine is actually doing, but in many ways they are something to consider seriously, and pay attention to.
They can provide some insights into what search engines may find important. They can help us possibly understand some of the directions that the search engines are taking. I understand your concern about people possibly missing the bigger picture by focusing on smaller things, but the bigger picture is made of all of the smaller things. By looking at the parts, and trying to see how they might fit into the whole, it's easier to get that glimpse. While I think that it's important to look at something like these patents, I'd also stress that they shouldn't be used as a roadmap on how to build pages and sites. They are potential roadmaps that the search engines might follow. But the chances exist that better ideas come along during the process of trying to implement some of the ideas presented in one of these patent applications. Or even that they are making others think that they are following one direction when they could be following another.
It is possible to apply for a patent before you are capable of using the invention described within it. And once you do file a patent, it doesn't grant you the right to use the described invention, but rather the ability to exclude others from using it.
Since I started paying more attention to patent applications, and some of the white papers that seem to go along with them, I've seen a number of things that I could do, and have done, that do change how I work. Often a patent application is a good starting point towards doing more research with white papers, articles, forum threads, technical specifications, and hands-on experimentation.