Archive for July, 2008

I liked it before it was Cuil

It was late 1998. A colleague waved me over to his desk. “Look at this,” he said, “it’s a new search engine.” He was pointing to a page that was mostly white background, with an ugly multicoloured logo, a stupid-sounding name, and a single text box. I was sceptical. This was totally different to the information-dense ‘portal’ style pages then in use by Alta Vista, Excite, Infoseek, Yahoo and others. “That’s ugly,” I said. “Yes, but it delivers really good search results,” he replied. And that was that. I started using Google. It was a better mousetrap. Not a fancier, frillier, more trendy mousetrap, just a better one. This is how disruptive technology happens. Sure, marketing is important, but only if it’s promoting a product that is actually better.

Someone just recently introduced me to Cuil. It’s a new search engine, founded by one of the architects of Google’s large search index, and a computer scientist from Stanford. Their staff are well qualified, and they’ve got good investor backing. Could they be the new Google?

I don’t think so. Here’s the reason. It’s got nothing to do with Google’s $4B annual profit, or the more than $1B they invest in R&D every year. It’s a question of philosophy. Not the “treat your employees really well” philosophy, but the “how to make search work well” philosophy.

Here’s Cuil’s claim about their point of difference:

“Rather than rely on superficial popularity metrics, Cuil searches for and ranks pages based on their content and relevance. When we find a page with your keywords, we stay on that page and analyze the rest of its content, its concepts, their inter-relationships and the page’s coherency.”

What they’re saying, if I understand correctly, is that semantic analysis is more powerful than distributed cognition. One of the key things that powered Google’s success was that they based part of their relevance testing on the number of links that pointed to a particular page. They analysed the Web, as well as the content in each page. They figured, if lots of people link to this page, it’s probably useful.
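To make the link-counting idea concrete, here is a minimal sketch. It is not Google’s actual algorithm, which is far more involved; it only illustrates ranking matching pages by how many other pages link to them, with the keyword count as a tie-breaker. The tiny web in PAGES, and the names inbound_link_counts and rank, are all made up for illustration.

```python
from collections import Counter

# Hypothetical toy web: page name -> (page text, pages it links to).
# All of this data is invented purely for illustration.
PAGES = {
    "a": ("mousetrap reviews and mousetrap plans", ["b"]),
    "b": ("build a better mousetrap", []),
    "c": ("gardening tips", ["b", "a"]),
    "d": ("mousetrap mousetrap mousetrap", []),
}


def inbound_link_counts(pages):
    """Count how many distinct other pages link to each page."""
    counts = Counter()
    for source, (_text, targets) in pages.items():
        for target in set(targets):
            # Ignore self-links and links to pages we do not know about.
            if target != source and target in pages:
                counts[target] += 1
    return counts


def rank(pages, keyword):
    """Return pages containing the keyword, most linked-to first.

    Ties are broken by how often the keyword appears, then by name.
    """
    links = inbound_link_counts(pages)
    results = []
    for name, (text, _targets) in pages.items():
        hits = text.lower().split().count(keyword.lower())
        if hits == 0:
            continue
        results.append((links[name], hits, name))
    results.sort(key=lambda r: (-r[0], -r[1], r[2]))
    return [name for _links, _hits, name in results]


print(rank(PAGES, "mousetrap"))
```

In this toy web, page "d" repeats the keyword most often but nobody links to it, so it ends up last, while page "b", which two other pages point to, comes first. A ranking based purely on the page’s own content would have to find some other way to tell the two apart.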

Don’t get me wrong, I love semantic analysis. I’m an etymology geek. I get fascinated by morphemes, taxonomy, and topic maps. It’s just that I agree with James Surowiecki when he argues that (to paraphrase the Wikipedia article on ‘The Wisdom of Crowds’) “Market judgment can be much faster, more reliable, and less subject to political forces than the deliberations of experts, or expert committees”. Distributed cognition just seems to scale better than semantic analysis. The problem with semantic analysis is that someone has to write the algorithms that do the analysis. That person has to make judgments about how meaning is constructed in language, and in so doing they arbitrarily close off parts of the probability space. To me, within an information domain as diverse as the Web, there’s just too much variation between signifier, signified, and referent, as used by the billion or so authors and searchers, to enable semantic analysis to work as an exclusive technique.

I’ve heard Dave Snowden say many times [1] that semantic analysis has its place, but like Newtonian physics, only within boundaries. I wonder whether Google’s success is because they do exactly this, harnessing the power of both approaches.

I’ve also tried Cuil on a number of different searches, and it’s just not even close to being as good as Google at finding what I was after. Regardless of which theory of language/cognition resonates with you, in the end, it’s results that count.

[1] See this article and various podcasts on the Cognitive Edge site.
