126.96.36.199 MyDomain.com - [11/Apr/2004:20:38:08 -0400] "GET /robots.txt HTTP/1.0" 200 97 "-" "http://www.almaden.ibm.com/cs/crawler [c01]" "-"
I don't recall ever seeing this crawler before. Almaden is evidently a research and development arm of IBM. It does have its own publicly available internal search engine, but I didn't find anything from outside the IBM site listed in it.
I trawled through their site this morning and, according to the research papers I found, they appear to be doing some fairly sophisticated work on search ranking, pattern matching, text analysis, link structures, etc.
So the question I have is: does this mean IBM is developing a search engine to jump into the market too? Or is it working for someone else? And if not, if it's only for their internal use, why are they sending a crawler out to one of my sites?
FWIW, the crawler only seemed to pick up the robots.txt file and main index page of my site so far. I'll be keeping an eye out for it in the future though.
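For anyone else who wants to watch for it, here is a rough sketch of how the access log could be scanned for this user agent. It assumes the log layout matches the single entry quoted above (IP, host, user, timestamp, request, status, size, referer, user agent); the names LOG_PATTERN and crawler_hits are made up for illustration, and the pattern may need adjusting for a different log format.

```python
import re

# Assumed field layout, inferred from the one log entry quoted in this post:
# ip host user [timestamp] "request" status size "referer" "user-agent" ...
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) (?P<host>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def crawler_hits(lines, needle="almaden.ibm.com/cs/crawler"):
    """Return (timestamp, request) pairs for lines whose user agent contains needle."""
    hits = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        # Lines that do not fit the assumed layout are skipped silently.
        if match and needle in match.group("agent"):
            hits.append((match.group("time"), match.group("request")))
    return hits


# The entry from this post, written as a single line.
sample = [
    '126.96.36.199 MyDomain.com - [11/Apr/2004:20:38:08 -0400] '
    '"GET /robots.txt HTTP/1.0" 200 97 "-" '
    '"http://www.almaden.ibm.com/cs/crawler [c01]" "-"'
]

for when, request in crawler_hits(sample):
    print(when, request)
```

To run it against a real log, pass an open file object in place of sample; the function only needs something that yields one log line at a time.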