Google Beats Cuil Hands Down In Size And Relevance, But That Isn’t The Whole Story
by Michael Arrington on July 27, 2008

Search engine Cuil launched earlier this evening, claiming a bigger index size (120 billion web pages) than Google or any other search engine. The pedigree of the founders and execs, which includes three ex senior Googlers, means the service will be compared to Google from day one. And the way they will be compared is index size and, more importantly, relevance/ranking of results.

We’ve been testing the engine for the last hour. Based on our test queries Cuil is an excellent search engine, particularly since it is all of an hour old. But it doesn’t appear to have the depth of results that Google has, despite their claims. And the results are not nearly as relevant.

A search for Dog returns 280 million results on Cuil and 498 million on Google. Judging relevance of results is subjective, but Google returns Wikipedia as the first result, then dog.com. Cuil returns Dog.com; Wikipedia isn’t listed on the first page of results. Both are meaningful results, but Google is better.

More searches, Cuil v. Google: Apple (83 m v. 571 m) – neither mentions the fruit. France (102 m v. 1.5 billion) – Cuil’s category refinement makes their results better for this query. Stonehenge (800k v. 8.5 m). Silicon Valley (3.2 m v. 24 m). Techcrunch (600k v. 6.5 m).

It seems pretty clear that Google’s index of web pages is significantly larger than Cuil’s unless we’re randomly choosing the wrong queries. Based on the queries above, Google is averaging nearly 10x the number of results of Cuil.
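
The “nearly 10x” figure can be checked with a little arithmetic. The sketch below is an illustration rather than part of the original test: it uses only the self-reported result counts quoted above (in millions), and the names `counts` and `mean_ratio` are made up for the example. It assumes the average is a plain mean of per-query Google-to-Cuil ratios, which is one reasonable reading of the claim.

```python
# Self-reported result counts quoted in the post, in millions: (Cuil, Google).
counts = {
    "Dog": (280, 498),
    "Apple": (83, 571),
    "France": (102, 1500),
    "Stonehenge": (0.8, 8.5),
    "Silicon Valley": (3.2, 24),
    "Techcrunch": (0.6, 6.5),
}


def mean_ratio(queries):
    """Mean of Google/Cuil result-count ratios over the given query names."""
    ratios = [counts[q][1] / counts[q][0] for q in queries]
    return sum(ratios) / len(ratios)


# All six queries, including the initial Dog search.
all_six = mean_ratio(counts)
# Only the five follow-up queries listed under "More searches".
last_five = mean_ratio([q for q in counts if q != "Dog"])

print(f"all six queries: {all_six:.1f}x")
print(f"five follow-up queries: {last_five:.1f}x")
```

Under that assumption the five follow-up queries average roughly 10x and all six average somewhat less, because the Dog counts are much closer together. Either way the gap is large, with the caveat that these are counts reported by each engine rather than measured index sizes.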

And Cuil’s ranking isn’t as good as Google’s based on the pure results returned from both queries. Where Cuil excels is with the related categories, which return results that are extremely relevant. With Google, we’ve all gotten used to trying a slightly different search to get the refined results we need. Cuil does a good job of guessing what we’ll want next and presents that in the top right widget. That means Cuil saves time for more research based queries.

And I want to reemphasize that Cuil is only an hour old at this point, while Google has had a decade to perfect their search engine.

Responses

  • good summary. I like their UI as well, except the black front page. too obvious I am doing a search in the office.

  • “Google returns Wikipedia as the first result, then dog.com. Cuil returns Dog.com, wikipedia isn’t listed on the first page of results”

    Good! Google is completely overrun with Wikipedia results. They are an epidemic!

    Thanks Cuil, sounds like I wasn’t the only one tired of having to skip over Google’s token Wikimedia results.

    • When I search, the first thing i do is Ctrl+f and enter wiki, so that I go right away to wikipedia results if any. (I dont enter wiki in the keyword or do a wikipedia site search because I dont want to go to them unless they are in the most relevant 50 results)

    • I would argue that if there is a page in Wikipedia for the query term, that it most likely should be listed somewhere on the first page. What better way to learn more information about a search term than to view an encyclopedia entry?

  • What’s the point in having many more search results than Cuil? Are you or anyone going to the 129288382382 page to find the 1232342342343 result? Relevance is more important than pure # of results I believe

    And oh BTW try going beyond result (or page) 1000 on the google search results – nope, you cannot do that. Google’s search results wont let you do that.

    • Yes, it’s irrelevant for popular terms, only the first few pages of results matter. But for more obscure terms size of index matters a lot – it means some results might be returned for a query instead of nothing.

      • If the index is too large it is bound to include low quality documents which kills relevancy. In fact most of the “tail” queries I attempted on cuil came back with no results or irrelevant results.

        still very very early but smells like they released too early and hyped expectations too much…

      • >But for more obscure terms size of index matters a lot

        OK, but then you should have tested those search terms. A comparison of the billions from self-reported result numbers doesn’t show which search helps me to find what I am looking for.

      • 1. Yes, for more obscure terms I did get better results.

        –> “prashant2228” returns 119 in Google and 318 in Cuil.

        2. Cuil seems to undermine the matching *term* in url.
        –> prashant2228.com was first result in Google for the above search, whilst it appeared nowhere in Cuil.

        3. Cuil is a lie.
        –> It boasts of 318 results on the top-right but allows browsing only 3 pages with meagre 18 results.
        –> Well, unless you are willing to manually type pi=7 or pi=8 in the url-string.

      • Cuil only lets you view the first 23 pages anyway.

  • This is sad:

    http://www.cuil...m/search?q=cuil

    We didn’t find any results for “cuil”

    Some reasons might be…

    * a typo. Please check your spelling.
    * your search includes a term that is very rare. Try to find a more common substitute.
    * too many search terms. Please try fewer terms.

    Finally, try to think of different words to describe your search.

    About Cuil | Your Privacy | Add Cuil to Firefox

  • It’s not just that the indexes are old. They’re MONTHS old. I blogged about it this past hour but must return to my code editor.

    It’s not twice as fast crawling as advertised.

    http://tinyurl....om/CUILNEWSLIVE

    • I agree…Many months old. We changed our website layout back at the beginning of the year…cuil is indexing the old layout pages! I’m also not happy with the images shown next to search results. In searching our company, the results have images that don’t have anything to do with our site. For example, we are a web optimization company, but the image that appeared with our listing was that of a nearby college logo. Another search provided an image of a tree. WHAT??

  • to me them having a new search interface is not a plus. A new user wants to find what he wants, he doesn’t want to learn entire new user interface to do something simple.

    My prediction, one of the big companies will buy them for 200-300 mil

    • Interesting prediction but we rarely find any search engines competing with Google acquired (by Google or anyone else). The last example was Microsoft – Powerset but in that case it was obvious that Microsoft wanted to use the technology they already have in place after years of research and development. Do you really think anyone will need Cuil’s technology? I strongly doubt that – the keyword-rich approach does not prove to work all that good. Only if they manage to prove that Google needs their scalability algorithm for itself – but the company was definitely launched to compete with Google and probably they will continue that way for some time, at least until they run out of money.

  • for “perez” perezhilton.com isnt even on the first page == fail

  • Hi I have tried searching “wipro techno centre” and no results were found. But google gave me the webpage.
    We didn’t find any results for “wipro techno centre singapore”

    Some reasons might be…

    * a typo. Please check your spelling.
    * your search includes a term that is very rare. Try to find a more common substitute.
    * too many search terms. Please try fewer terms.

    Finally, try to think of different words to describe your search.

    About Cuil | Your Privacy |

  • Cuil would just come and go.. Google would be the basic and static one for everyone!

  • they seem to be reindexing constantly based on queries that are coming in. for example a post above says “press release” returned no results and now it does.

  • The difference is much more pronounced with niche searches.
    I picked a random sentence from a page I got from a previous search on Cuil, and searched for it on Google and Cuil.
    The sentence was: “Just ask a group of teen internet entrepreneurs”
    Cuil did not find any results, and Google found 283, including of course the page I took it from.

  • The results I get for inprnt (my site) are a bit odd. It comes up with tons of proxy sites like concealme and antisurfer, then a ton of what seem to be either fake results (spam) or expired domains… Not very useful. While the excerpts clearly reference my site, the content is completely different.

  • I’m impressed for such a young product, particularly with the UI and related categories as you mentioned. Relevancy will improve over time.

    Congrats on a great launch! Search is one of the hardest things to launch just based on the diverse set of queries people will “test you out” with.

    • surely you jest? can’t announce a release w/all that hype, then come so short of even the meekest expectations and suggest that they did a great job. as for relevance over time, they claimed this as their initial expertise and the background of their team also claims an understanding of spam issues, yet my most basic queries have yielded spam pages. sorry, but this assessment can only come from a friend of the company’s ;)

      these are not a group of rookies making a go of it. monier alone should know more about this and certainly claims to understand relevance. don’t know the rest, but reading their bios suggests that they should be much further along for this release.

  • This is interesting. I have not heard anything about it. Have they described the PPC model in any detail? How is the traffic thus far?

  • How do you know the “results in index” are accurately reported by each search engine and that the numbers are meaningful?

  • You know, I found 121,578 results for “cuil”. From my (somewhat expert) opinion, it looks like they’ve focused on the search index, but their frontend is not behaving properly.

  • My experience has been that the results seem to filter out a lot of the crap that I get with google. And I like the results page more.

    The actual results are a mixed bag but I will use this site a lot.

  • I guess if people really start using this engine, they might get better.

    http://blabtech.blogspot.com

  • Google admittedly says the size of the returned results are inflated… it’s more of a guess than an accurate number.

  • It does show wikipedia results!

    http://www.cuil...on3&sl=long

    One of the first results, too.

  • For my site http://www.cuil...earch?q=panedia Cuil offers all sorts of links in the first 10 pages including scraped content sites and spam sites, but not my homepage panedia.com. I think the algorithm needs a little more work.

  • google seems to lie

    http://www.goog...=N&filter=0

    i cant get even close to the billionth search result. anyone know how?

  • I rely daily on Google’s index of tech forums for quick answers to common problems. A search this morning for ["allow anonymous property queries"] gives me page after page of people with the same problem as me, and a very quick fix. Cuil has nothing useful. It even seems that “advanced” search operators (such as quotes around a string) are ignored.

    Maybe the tech market aren’t who they’re aiming for here, but I’ll be sticking with Google for a while yet.

  • @anon,

    You don’t know how search engines work. The results are compiled by a unit of the search engine and are cached to mirrors. What I blogged about is that Cuil does not have JIT collating, whereas Google and even really, really cheap search engines do.

    Only the top 10-100 pages will be fed to the caching servers.

    A search engine is separated into many, many units. While 100 pages and the count will be forwarded to the front end unit, it doesn’t mean that the rest of the results produced by the pagerank sorting are not there.

    Read the book, Google Pagerank and beyond for more info on that.

    http://tinyurl....om/CUILNEWSLIVE

    • i dont claim to know how search engines work. im a user that sees 1,570,000,000 results and i wanted to see the last 100 and i couldnt. i think ill email google a bug report.

  • Maybe people are forgetting that many db engines don’t return accurate results / row count numbers… the “we found x results” is most likely a highly inaccurate count of results returned by the database.

  • “The pedigree of the founders and execs, which includes three ex senior Googlers, means the service will be compared to Google from day”

    This is obviously incorrect Mike. If this was the cause, there would be no senior Googlers.

  • All that work for nothing. “Google” is permanently stuck in people’s minds.

    “CUIL” means nothing and is a perfect example of a horrible web 2.0 name. No one will remember it.

    Oh, and the searches I did turned up completely random results and there wasn’t any cache pages, image search, etc.

    It’s just search…and the whole column thing seems unfair to people who aren’t on that very tiny first landing page.

    Thumbs down. I’ll stick with Google & Live Search.

  • I just read your earlier article announcing the engine and was excited about it. So, I did some testing of my own and immediately came to some of the same conclusions. Although, I think we can give Cuil a bit of a break seeing that they are only a few hours old. Either way, this seems like a very promising project. Thanks.

  • it’s terrible for long tail sites… I searched for some terms that gave me like 400 correct site oriented results on google and zero on CUIL. So much for that. I suppose if you want the fat part of the tail, CUIL may sport some advantage…. but past the top 5000 sites or whatever, it may be hurting.

  • Based on my testing the relevancy of the top results for cuil is terrible.

  • The test I use in the SEO classes I teach is “chocolate tasting,” since Google returns my page (http://troubado....org/chocolate/) first and Yahoo returns a completely useless page ( a link to a long-defunct tasting). Now, if someone is searching for that particular term, they probably either want to do a chocolate tasting or want to see what chocolates score well. Google hits it, Cuil gives basically useless results: an odd assortment of chocolate tasting events and commercial sites. Furthermore, similar and same results are repeated as you drill down through the listings.

    Also the two-dimensional layout is poor: it is hard to go through in an organized fashion (especially since the last ones in the column hang below the fold) and makes it harder to understand which ones are more relevant: the top of the second column or the bottom of the first?

    • Google lists your personal page http://troubado....org/chocolate/ on the first result? man, you must do a lot of SEO to boost it, your site is rather useless to me, another reason I am using Google less and less. They just have too many spammers like you.

      • Actually, I have done basically no SEO until three months ago when I improved the internal links as part of a graphical face lift. It was #1 before then anyway.

        If you bothered to look at it, you would see that the site is basically a posting of ten years’ worth of data chocolate plus a description of how to do a chocolate tasting. As I mentioned above, I can see how that would be useful to someone searching for that particular keyword, and might even be the most useful out of the other top-ranked results.

        This is, of course, the real test of a search engine: its ability to give the result people are looking for. And, if I were to try to game the system, I would do so by improving the content of my page to better serve the desires of people searching for that particular keyword (actually, I’m planning on rewriting the page to give better instructions and be more interesting.)

        Seeing as how I explained this briefly in my original post, I find your reply, replete with grammatical mistakes, rather inane.

  • Of course google is better. It’s the cold start problem — you need to track which links people click in order to boost their results.

    cool doesn’t have any of that yet.

  • Hmm, searching for cuil doesn’t even return cuil.com on the first page. That’s rather odd for a search engine that is meant to be relevant.

    So let’s look at it this way. If you search for stuff that google won’t give you, use cuil :)

  • I still think Live Search has a better UI. Especially when you search for stock symbols, which I tend to do quite often. I like the categories of cuil, but certainly don’t like the UI.

  • I think cuil still needs to go a long way. Search is not the size of the indexed pages.

  • Mike,

    It doesn’t matter whether Cuil is 1 hour old or not. The fact of the matter is that people arent going to use the service if its not turning up relevant results. Of course its going to be compared to Google because Google is by far the best.

    Cuil should not have gone so public until they did more testing. The results aren’t relevant – and really in terms of competitive edge – Google could just implement the “categories” selector (which is useful) at any time – so its hardly awesome at this point.

    I searched for a whole bunch of stuff and Cuil just didnt return what I was looking for.

  • complete and utter failure.. it doesn’t find shit! who cares if they’ve got a trillion pages when they bring up completely irrelevant search results for even pretty common terms.

    • Ever tried building even a 1 page bug free website on your own, huh?

      • yes i have actually, and it has nothing to do with my comment. bugs aside, their search engine just doesn’t bring up relevant results, full stop. unless you want to call their whole site one giant bug in that case it’s a pretty big/expensive one. :P

  • I tried “bulgaria” and it returned no results. Then I tried “Bulgaria” and it returned a bit, but pages 2+ didn’t work.

    I think Cuil is just buggy at the moment. They rushed to launch it and either the spike in traffic is breaking their engines or it’s something else, but it’s not usable at this point.

  • Your last point is very important. Give them some time, and appreciate they were willing to take on Google, and hope like hell they don’t sell out to them. We need Google to have some competition, and Google needs some competition as well.

  • Site is very buggy.
    Search results from homepage search textbox return results for “ZA”. Search results from second search page for “ZA” return nothing.

    Auto-linking web search text from the “Explore” bar works terribly and often returns zero results.

    Click the 4,5,6 etc links on the bottom of the page and get a “no results” error.

    No results are not location friendly, not even Country-friendly? I can’t read French…doesn’t my IP tell you I’m in the US?

    This search site sux and is a total failure.

  • I have been testing cuil myself for the last hour or so, and the results are fairly relevant. I do not care about either google/cuil returning millions of results, I only care about the first 15 pages.

    I tried the search term “cuil” and a couple of times it showed an error, and when it did eventually return results, none of them showed what “cuil” itself is.

    I am sure they are working on their algorithms every day the results will get better. It is too early to pass judgment on something of this big size and importance and I wish Cuil good luck.

  • No one was expecting a flawless bug-free launch. But the failures are too many to have any amount of confidence in this site. If 80% of these “glitches” are not fixed in the next 7 days, good riddance.

    Stupid greed and arrogance of ex-Google employees. Bad Luck.

  • Bizarre results.

    7,707,000 results for john mccain jokes

    We didn’t find any results for “barack obama jokes”

  • FYI, the founder of the site wrote this article years back:
    http://www.acmq...age&pid=143

    i recap one of the paragraphs:

    “Don’t do page rank initially. Actually don’t do it at all. For this observation I risk being inundated with hate mail, but nonetheless don’t do page rank. If you four guys in your garage can’t get something decent-looking up without page rank, you’re not going to get anything decent up with page rank”

    ………………..
    that tells us why the results are so crap

    and one more paragraph:

    “NO ROOM FOR ERROR – When you look at all these steps and all the complications, this process is rife with things that can go wrong. The hardest part about writing a search engine is that you’re going to process billions of URLS and serve millions, if not billions, of queries. This does not leave a lot of room for error. One super-linear algorithm applied over the wrong-sized list of items and you are sunk. One lock inside another lock and you are sunk. There will be no code paths not explored. All of those comments in your code, which print out errors like “This will never happen,” will happen”

    …………….
    and they just doin it

  • My brain can’t handle scanning the results both vertically and horizontally. I definitely prefer the presentation of Google/Yahoo/MSN/Ask results… hopefully Cuil can follow suit, or provide a setting for it.

  • Cuil needs a lot of work before it can be considered mature. I dont see them having any feature that will upset Google.
    - Cache? No.
    - True Relevancy? No.
    - Related Widget? Yes, but Google could implement that tomorrow and leave it in beta for another 2 years or more.
    - News Search? No
    - Image Search? No
    - Video Search? No
    - Search History? I think Google has this.
    - Customer Loyalty? Definitely not leaving Google, at least not yet.

    Even a direct search for my domain (including the .com) name does not return a link to my site. BS

  • Even the snippets of results Google provides is more useful:
    http://www.cuil...per&sl=long
    first result is spam, the rest – although vaguely useful-looking – are not the best results possible as shown by:
    http://www.goog...en&safe=off

  • I live in Sweden, so naturally I searched Cuil for HV71, the Swedish ice hockey champions 2008. Cuil did not find a single page. Google? 1 800 000 hits. (Starting with the official web site.) Clearly Cuil has some indexing to do. The UI is pretty nice, though, so let’s not leave them for dead just yet.

  • searching my name on google gives my personal website…while cuil ( not cool exactly ) gives links to inappropriate(porn) , broken links…

    these all (immature) search engines from MSN,YAHOO to cuil,powerset adds real value to Google’s search technology and index size…

    but i appreciate all second hand search engine companies as they at least tried….. making monopolized market to discrete
