May 11, 2008

Powerset Launches Showcase For User Search Experience

Michael Arrington


Today marks another milestone for San Francisco-based contextual search engine Powerset. They’ve launched a showcase for their user search experience - effectively the search engine minus the web crawl. For now, Powerset queries only Wikipedia and augments results with data from Freebase. The product launch comes just a day after reports that the company is being shopped to potential buyers by investment bank Allen & Co.

I have been able to test Powerset via their labs site for the last few weeks. I wrote about it last month, and the version that just launched is very similar.

There is no way to look at Powerset today and determine if it can be as disruptive to search as Google was when it launched almost a decade ago. That’s because it only queries Wikipedia, and so there is little need for proper ranking algorithms to sort the good from the bad results.

But what users can see is how effective a way it is to gather information quickly. For someone doing research, Powerset effectively removes a number of steps towards getting to the final information. It is particularly effective when the information needed is on many different web pages.

For example, a query on Powerset of “when did earthquakes hit tokyo” yields stunning results. Try this query at Google or even Wikipedia to compare - instead of just picking out keywords that are in your query and on a web page, Powerset is actually making some sense of the content included in the Wikipedia pages:

The way that Powerset returns queries means that answers are often found in the result snips, as above. They are also structuring a lot of the Wikipedia (and already structured Freebase) data and inserting it into results. So a search for “Bill Clinton” shows results, but also shows Freebase structured data along with additional query refinements to get to more information. The important thing below isn’t the structured data in the results, it’s the fact that you can click on the action words and drill down into very specific queries (to find, for example, what bills he signed, or which Supreme Court justices he nominated, or who he slept with).

Powerset is indexing web pages much differently than normal search engines, which generally just record content to match against keyword queries. Instead, Powerset is trying to understand the content on the page so that it can be matched meaningfully to queries later, even queries that don’t use matching words.

Indexing the web is expensive, though, and Powerset’s way of doing it requires even more time and computing power dedicated to a web page. That’s why they say they aren’t indexing the entire web yet - the company has raised just $12.5 million (plus another $8 million or so in bridge loans from investors). To index the web will require a new round of financing (see the first paragraph above about their sale/financing efforts).

Powerset has taken a lot of criticism for their goal of trying to redefine how people search the web (including from us). But their lofty goals are what makes Silicon Valley so great - succeed or fail, Powerset is trying to do something pretty spectacular.

The company has also created a demo overview video - see below.


Comments

This looks great… can’t wait to try it!

 

Any idea when a test drive will be forthcoming?
Thanks

 
 

Looks promising…

My first search was extremely slow though. Maybe just a little techcrunch effect.

 

hmmm I’m not too impressed. They need to open it up to other sources to become useful…

 

Nice UI and the results are presented very nicely but I get pretty much identical results if I do a site search through Google. Nice to see that they got rid of that weird logo.

 

It’s better to come out with a strong showing for specific use, than a weak one for a lot of different uses.

 

site is very slow and did not impress me. i like a clean look so this is poorly designed for my taste

 

just test “where is Paris” and “What is Madonna first name”…not impressed.

 

Is this what they have done after raising $12.5 million (or more?)?

 

I tried this query: “which presidents were assassinated?”. Result is useless compared to Google results.

 


This actually made me laugh even though I was in a pretty serious mood.

If all it took was more funding Google would have had the solution ready last year.

Powerset doesn’t have a monopoly on genius. No one has an absolute upper edge here.

What it takes is the eventual/gradual move to a more structured Web, but I agree that Powerset is better positioned than Google (less legacy code and better timed entry) to take advantage of the more structured Web that will emerge.

More funding is only going to dilute the owners. I say sell.

 

Google beater… dream on. Try this simple test:
Who is the prime minister of Canada
http://www.powerset.com/explor.....=0&y=0

now try on google for a really stunning result:

http://www.google.ca/search?q=.....=firefox-a

 

@10, 11

Try using normal keyword-based query

I have no idea why these guys keep trying to show that they have AI Complete NLP for queries. No one does. No one will for a long time. I’ve never used Powerset, but if you try keyword-based queries then it should work better than a Google search of Wikipedia.

Else, the $12.5M would have been better spent on Falafel

 

First feature lacking [which I thought should work, or be available in theory]:

Results, which have date information, should be sorted accordingly.
e.g. a query, such as “films starring edward norton by date,” should automatically bring up the freebase results, sorted by date, not sorted arbitrarily.

In my opinion that’s pretty basic. And I am sure I can come up with countless other examples.

 

I can’t resist a comparison to Dipsie.

Dipsie launches: http://www.siliconbeat.com/ent.....nches.html

Dipsie folds: http://www.siliconbeat.com/ent.....rsity.html

OK, not entirely fair, since I think Powerset has licensed interesting technology from PARC. But I use the example to make the point that “lofty goals” and “trying to do something pretty spectacular” are a weak substitute for creating value.

Powerset has had access to great money and talent. Maybe something great will come out of it. But so far I’m still waiting.

 

I tried “Who killed Boba Fett ?” and got some garbage. So much for NLP hype.

 

This is pathetic. Even “How big is a soccer field?” works better on Google, and that’s exactly the type of query a semantic search engine should excel at.

 

Michael,

Can you provide some example queries to type in so that we too can have that “AHA moment” !

 

It’s unanimous - it sucks

 

This site doesn’t make sense to me. Is the idea here you are going to make sense of my searches when I type in a sentence? How often do I do that? I can’t see how I get anything out of this site I wouldn’t get out of Wikipedia or Google.

Its easy to claim you have the next wave. Not so easy to actually have it. Powerset sure does not have it, as far as I can tell. Then again, Facebook is the #8 site on the net….

 

this google search:

tokyo earthquake site:wikipedia.org

seems just as good as this crapset search:

when did earthquakes hit tokyo

=== stunning results? what is your infatuation with these guys?

 

I see the gap between what people expect and what Powerset actually has.

They have something valuable but they’re doing such a shitty job trying to manage expectations. In fact if they had said absolutely nothing and gave no hint whatsoever and just put it out there they would have done far better.

Try using a keyword-based query. Forget NLP on the query side. It’s hopeless and it will be hopeless for a few generations of consumers.

Use a keyword-based query. The results should be consistently better than a “Google search of Wikipedia” .. again: {Better than a Google search of Wikipedia} and NOT better than Google search of the Web.

If my assertion above is wrong then the $12.5M is a total waste.

 

It didn’t find me in wikipedia; am I so deeply buried that $12M doesn’t surface my remains?

 

OK sorry, one more:

The question posed to Powerset: what is the best way to make money on the internet?

#1 Answer: Independent music.

Wow. I mean seriously. Even ‘house cleaning’ would be a better answer.

http://www.powerset.com/explor.....submit.y=0

 

Interestingly, they seem to be shying away from the one feature I liked in the beta–namely, trying to help users formulate structured queries as they typed them. Sure, their implementation was erratic, but it was the most promising direction I saw them pursue.

In any case, Wikipedia is just a bad set to use for testing question answering (even though I know Powerset insists it is not doing question answering). Almost every top-of-mind question whose answer is available via Wikipedia is answered by a single Wikipedia page devoted to answering that question.

 

Powerset isn’t ready to advertise on this blog. This PR move shows how desperate they are. $12.5 million, and all they have is this to show for? Even Allen & Co. or Techcrunch can’t save you.

 

hate these ads. stop it michael

 

Pathetic, I thought it’s some student summer project. Applying some semantic model to a limited, high-quality, semi-structured set of documents in Wikipedia is not that hard. 20M down the drain, could’ve been spent better buying food for hungry children in Africa.

 

actually tried it. 100% off track.

please stop show scripted results. even something like “what is the address of techcrunch” get 100% rubbish. o my god. won’t go back.

 

“It is better to be talked about than not be talked about at all.” –Oscar Wilde

Seeing all this reaction they can always publish a $12.5M book on how NOT to screw up your messaging or even sell the rights to the Powerset Rangers line of NLP enabled toys.

 

My comment was published, then deleted????

HYPEWARE. I am a cognitive scientist, and somewhat offended by the mere idea that this thing would be ready for prime time. Nothing but ELIZA EFFECT here.

So here are my three queries:

==
1. “WHO IS THE TEXAN IN WASHINGTON?”

#
Texas Revolution
Declaring himself as the only person who could bring about peace, Santa Anna was sent to Washington, D.C., by the Texan government to meet President Jackson in order to guarantee independence of the new republic.
# close

Texas
“Texas Executes 400th Inmate”, The Washington Post, 2007-08-22. … | Demonym | Texan |
# close

Texan schooner Austin
In Galveston, Austin was at anchor while Commodore Moore met with President Houston and Secretary of War and Marine George Washington Hockley to make plans for the Texan fleet.
# close

The Texan
The Texan (TV series), starring Rory Calhoun
# close

The Texan (fictional character)
Yossarian and Dunbar meet the Texan on his admission to their hospital ward, in the first chapter of the novel.
# close

The Texan (TV series)
The Texan was a Western television series starring popular B movie star Rory Calhoun.
# close

Texan schooner Independence2
Texan schooner Independence
Lithograph of the Texan schooner Independence as flagship of the Texas Navy … In June, 1836, the schooner bore commissioners Peter William Grayson and James W. Collinsworth to New Orleans on the first leg of their trip to Washington, D.C. to negotiate the recognition of Texas by the United States.
# close

The Daily Texan
A number of comic strips that began in the Texan went on to have commercial success. … The Daily Texan
# close

Battle of San Jacinto
In 18 minutes of combat, the Texan army had won, killing about 630 Mexican soldiers, wounding 208 and taking 730 prisoners. … However, the safe passage never materialized; Santa Anna was held for six months as a prisoner of war (during which time his government disowned him and any agreement he might enter into) and finally taken to Washington, D.C.
# close

German Texan
These first immigrants settled in Austin, Colorado, Fayette, and Washington counties. … ↑ The German-Texan Heritage Society

===========================
2. “Is Iran an Apple product?”

#

List of products discontinued by Apple Inc.
A Quadra 700, part of Apple’s high-end desktop computer range of the early-1990s. … ↑ Apple - Product Support (Language)
# close

Apple TV
Apple TV Sales Will Stall at 1 Million. … Apple TV Support – official product support
# close

Apple
The apple as symbol of sexual seduction has been used to imply sexuality between men, possibly in an ironic vein. … | Iran | 689,328 | C | 2,400,000 | F |
# close

Apple IIGS
The IIGS was also the first Apple product to bear the new brand-unifying color scheme, a warm gray color Apple dubbed “Platinum”.
# close

Apple Keyboard
This was the first major redesign of the Apple keyboard, featuring more fluid, curving lines to match the look of the new Apple product style.
# close

Apple Newton
Product Details … Palm Computing was co-founded by ex-Apple employee Donna Dubinsky.
# close

Apple Inc. litigation
The suit settled in 1981 with an undisclosed amount being paid to Apple Corps. … ↑ Apple - Product Support (Language)
# close

Apple III
The Apple III was the first Apple product that allowed the user to choose both a screen font and a keyboard layout:either QWERTY or Dvorak.
# close

Apple IIc
A third party company would later introduced a work-alike LCD screen called the C-Vue, which looked and functioned very much like Apple’s product, albeit with a reportedly slight improvement in viewability.
# close

Iran
| | Iran Portal | … See also: Military history of Iran

=========================================
3. “Is Hillary a vampire?”

#
Karin (manga)
Voiced by: Mikako Takahashi (Japanese), Hillary Blazer-Doyle (English) … A vampire hunter descended from a long line of hunters.
# close

Political positions of Hillary Rodham Clinton
http://thatsmycongress.com/ind.....sions-act/ Where is Hillary Clinton on the Military Commissions Act?, Thatsmycongress.com, May 21, 2007
# close

Edmund Hillary
Hillary considered pulling out, but both Hunt and Shipton talked him into remaining.
# close

Hillary Rodham Clinton
From mid-1978 to mid-1980 she served as the chair of that board, the first woman to do so. … ↑ Hillary Rodham Clinton.
# close

Hillary Clinton presidential campaign, 2008
A week after the debate, Clinton said, “I wasn’t at my best the other night. … The focus has got to get back on Hillary.
# close

List of books about Hillary Rodham Clinton
By herself … Condi vs. Hillary : The Next Great Presidential Race. HarperCollins, 2005.
# close

Hillary Rodham Clinton presidential campaign, 2008
A week after the debate, Clinton said, “I wasn’t at my best the other night. … The focus has got to get back on Hillary.
# close

Characters of Sluggy Freelance
Sam was turned into a vampire in an early adventure, and he leaves the main group shortly after due to Riff’s mistrust. … Actually Hillary Clinton.
# close

List of Hillary Rodham Clinton presidential campaign endorsements
↑ GayCityUSA Daily: Elizabeth Taylor Supports Hillary Clinton’s Run For President
# close

Senate career of Hillary Rodham Clinton
She subsequently voted against three of the nominees, but all were confirmed by the Senate. … ↑ Hillary Rodham Clinton.

 

I won’t use it… I keep it Google.

 

I accidentally typed in “who taugh Robert Johnson to play guitar”…instead of taught I misspelled it…”taugh”. PS asked me, “did you mean ‘tough’”? Maybe I am missing it, but if it doesn’t understand what I meant in the context of the question asked, how is it going to ‘naturally’ answer my question?

p.s. once I retyped…it still didn’t answer my question in the first 3 pages, let alone first three results.

 

@35

Both google, yahoo and even msft live seem to correct it and also answer it in the first query. :)

 

I tried using the product but the results are not relevant. Team needs to do a lot more work on the “relevancy” part. In current state, they can not compete against any basic search engine…forget Google!

 

Michael: On behalf of fellow “comment providers”, would you let us know what search keywords gave you the “AHA” moment?

I have not seen such a unanimous response from readers in any TC post for a long time!

 

Very underwhelming…a Google Search for “Golden Gate Bridge Name” vs whatever they were trying to demonstrate in that video, says it all

http://mbb.tumblr.com/post/34498545

 

Maybe Michael was searching

“Is Powerset going into the deadpool?”
“Should Techcrunch sell a plug to Powerset?”

 

You can’t compare Google the Giant with a start up. The searches in Google will obviously be better because they index faster and better. Instead of comparing the results, we should try and understand the technology or usefulness behind it. They are doing great work for a start up. Just add more servers guys and send those spiders to more and more sites.

 

These comments are excellent (although quite critical of the original post) and they illustrate what an amazing resource TechCrunch has become. I know it has been asked in some of the other PS posts, but I haven’t seen a response:

Michael, do you have any financial interest in Powerset?

 

Michael and Techcrunch posted a couple of stories about my company (which I won’t name here) and they did it without being paid and without even talking to me…although they tried. The only reason I mention this is because I see a lot of people complaining that news on powerset is being posted because they paid Techcrunch. That may be true, but it may not.

 

Powerset must have some crazy foliage in its office. Those carnivorous plants can get nasty.

http://www.powerset.com/explor.....submit.y=0

 

it’s funny, there’s a twitter discussion about payoffs for posts going on twitter right now (around Scoble). it’s so ridiculously offensive to see comments accusing people of payoffs every time any blogger writes that they like a service. i usually delete these, but I’m leaving some of them up here as well.

Powerset is interesting. People who don’t see it as interesting (1) have a different opinion, which is fine, (2) are comparing it to google and its full web index, which is sort of dumb, or (3) are so entrenched in search speak that they forget real language can be a better interface.

Disagreement is fine. But if every time we post that we like a service you are accusing us of payoffs, just go hang out on another blog. It’s just trolling.

 

@Techcrunch Reader - of course the vast majority of companies TC covers are ones where an author has no interest - just like yours. The stuff today that brought Scoble here on the twitter thread is just stupid, and I hope no one seriously believes TC is paid for story placement.

I think that MA can be fiercely objective, but I asked the question in #41 about his relationship to Powerset because I have seen it asked enough times here and never answered. A denial should end the matter.

While I am here, check this search: who slept with Marilyn Monroe

Powerset tells me: Professor Frink, Art Buchwald and probably “not Elvis”

The second Google results tells me: James Dougherty, Joe DiMaggio, Arthur Miller, RFK, JFK, Joan Crawford (the last three being “unproven”)

 

try “when did earthquakes hit bay area” or “when did earthquakes hit san jose” or even “when did tsunami hit indonesia”, I was not impressed …

 

Maybe you should only allow video comments… That would shut a lot of those cowards up. I can’t believe you have to deal with this shit every time you write a post. I wouldn’t have the patience.

 

Actually, I don’t get it. Mike, honestly, this post is way too ‘drunk’ - bereft of any intellectual content. You say you have no financial interest in the company; but, this post and the previous ones taste and smell like PR plugs. I used to read TC religiously; but, you guys are seriously trying hard to shoo me away. Some objective, thought provoking coverage, please!!

 

Mike, you should have spent the time you spent writing this article on actually doing a few searches. I think you would have stopped covering this fail of a company after you tried it for 10 minutes.

Powerset is not bad, it’s atrocious.

 

Sorry, it did not live up to the hype.

This is supposed to be good for real questions, and will kill Google because it “reads” wikipedia, right? Wrong, it is a friggin demo and not as smart as a 5th grader or Google. You can try it yourself or see my test. http://www.fox.com/areyousmarter/features/

What Is the only US capital located south of Miami Florida? GOOGLE TAKES IT
Google – Honolulu (via Yahoo Answers)
Powerset – Article on Miami Florida

Is more water vapor held in cool air than warm air? GOOGLE AGAIN
Google – Warm air (wiki Answers) in Excerpt
Powerset – Article on water

Who is the only person to become vice president then president without being elected to office? GOOGLE AGAIN
Google – Ford in excerpt #1
Powerset – Article on Vice President

On April 18, 1775, Paul Revere took his famous “midnight ride” across the territory of what present day state? SLIGHT ADVANTAGE GOOGLE
Google – Mass in excerpt #1
Powerset – Paul Revere Article

Does Tennessee border Missouri? TIE HMMM NOT BAD, LETS GIVE IT ONE MORE TRY
Google – Yes in excerpt#1
Powerset – Yes in article

What was the middle name of President Nixon? GOOGLE
Google – Milhous in the excerpt #1
Powerset – Article on Nixon

 

@44 and Michael. I just saw your comment and I completely agree with you (as I had just written in 45). The payoff for posts comments are both ridiculous and offensive.

In general, I think people are confused about which companies have offered you advisory positions. Having just found your disclosure statement here (http://www.techcrunch.com/about-techcrunch/), I apologize for asking the question about Powerset. Clearly you have no financial relationship with them.

 
 

The problem with the “it’s unfair comparison” defense “Google has indexed the whole Web, Powerset only has Wikipedia/Freebase” is that Google returns better search results when restricted to Wikipedia than Powerset. On test queries I’ve done, Google always beat Powerset in returning more relevant links to Wikipedia articles.

Now, you look at Powerset’s “factz” data, where it appears to have extracted semantic assertions from text (Luke battles Vader, Leia awards Medals), and you’d think that searching for facts would return better results, but Google still wins (although it does a poor job).

Try this query, for which Powerset has actual “factz” extracted: “Who awarded medals in Star Wars”. The result is zero, nada!

 

Google will only take one day to make this kind of site. The site is very slow and the search result is very poor . That $12.5m should be given to Nargis Cyclone victim rather than building this “super-A.I” search engine.

 

@50: Excellent analysis! This is exactly what NLP is supposed to solve and excel at. Google is clearly the smarter search engine and it probably does something like NLP in the back end. Great idea for a search engine test!

As for user experience, the tests clearly show that you can use Google just like you’d use Powerset. What exactly Powerset brings to the table is beyond me.

PS: Has anyone found any queries that it performs better than Google?

 

and the only time this kind of hit job is done in the comments is when a person, or group of people, have some issue with a company. I’m pretty clear on what i like about powerset, and what its challenges will be, in the post above. like i said, i often delete offensive comments, but sometimes it is interesting to leave them in and digest the whole thing later.

powerset is trying to do something really big and really new. the kinds of people who respect that are the people I want reading this blog. The kinds of people who have such a sad life that they spend it tearing down anything that other people try to build aren’t welcome here. honest disagreement is welcome. but saying really dumb stuff like this could be built in a day…pathetic.

http://www.techcrunch.com/2007.....the-arena/

 

to quote Hyman Roth in Godfather II:

There was this kid I grew up with; he was younger than me. Sorta looked up to me, you know. We did our first work together, worked our way out of the street. Things were good, we made the most of it. During Prohibition, we ran molasses into Canada… made a fortune, your father, too. As much as anyone, I loved him and trusted him. Later on he had an idea to build a city out of a desert stop-over for GI’s on the way to the West Coast. That kid’s name was Moe Greene, and the city he invented was Las Vegas. This was a great man, a man of vision and guts. And there isn’t even a plaque, or a signpost or a statue of him in that town! Someone put a bullet through his eye. No one knows who gave the order. When I heard it, I wasn’t angry; I knew Moe, I knew he was head-strong, talking loud, saying stupid things. So when he turned up dead, I let it go. And I said to myself, this is the business we’ve chosen; I didn’t ask who gave the order, because it had nothing to do with business!

Mike, this is the business you have chosen. Anytime you like something, especially if it includes writing frequently about very small companies who just happened to appear at the TC 40, they’ll stone you. Is it bullshit? Who cares? It’s just human nature, man.

You have to ask yourself is there anything you care to change about the way you’re going about things. If the answer is no, which I expect it will be (and think it should be), just ignore it, forgive them for they know not what they do, etc., because the more you succeed the worse it will get.

 

@53. Indexing Wikipedia is 1000 times easier because it is a structured website and only contains informative articles. When you index the whole web you face a lot of SEO pages and doorway pages, plus thousands more “malicious” pages. So there is no excuse when they can’t give good results for a Wikipedia search.

 

Whoever criticized Powerset for their design and presentation of results, I couldn’t disagree more strongly. The problem of presenting more data in a useful and intelligent way in search results is one that a lot have tried to overcome and all have failed at (including google with searchmash). I think what Powerset have done is simply stunning.

I’m really interested to see where they go with this, it’s been a long time since there was anything to be excited about in the search space.

 

I really like the UI. Who’s the creative guy behind it… i’m impressed!

 

Hope this is the beginning….Bcoz they need to go a long way to become prominent ones in ’searching services sector’(Because it will always be compared with sites like Google,Yahoo,Live,AOL).As of now,system is a bit slow and results are quite average(not well directed).

 

Robert - exactly. The comments have never affected our writing policy. Too many blogs go sterile trying to please everyone. I write what i think.

 

Mike its nice and all when companies want to be the next Google and want to give us superior search. But this isn’t it. So it sounds to everyone that you are either covering them for $$$ or because they have money. There are probably a hundred other startups trying to come out with new search…yet this company gets covered weekly.

I can see you sackriding them if they actually had good results..but they don’t. The search results are crap….and when you take into account that all they are doing is basically searching Wikipedia and that their product will have a very hard time scaling…you’ll realize that this product will never get anywhere. Especially when Wikipedia’s own search returns more relevant results…and don’t get me started at how much better Google’s results are

 

Kudos to Powerset for trying to do something new and solve hard problems. The issue isn’t that, I’d be willing to give them the benefit of the doubt. It’s the endless hype on TC and other blogs that causes the problem. When people start talking about killing Google, they are going to be more harshly criticized if their beta is underwhelming.

When there is constant banter about a startup being the second coming, it better damn well usher in the end of days. The seemingly weekly anti-Google “startup X is going to kill Google” comments don’t help.

Quite frankly, Powerset has been hoisted by their own petard. They may actually have some useful technology, but if they had been more humble about it, their launch would have been perceived better.

“Rumors” about huge acquisitions don’t help. If this was AintItCoolNews, it would bring immediate yells of “PLANT!” or desperate propaganda to boost valuation.

 

@Robert and 57. great quote and great analysis. I’d add Powerset’s bold “we are going to take on Google” mission and the efficacy of their PR spend adds to commenters’ “interest.”

That said, I take Mike’s point in #56. I think Google is good, but of course I want better search and I don’t care who provides it. Best of luck to everyone in this space.

 

To solve such a daunting problem, we must first understand how humans process information; how we structure concepts. Only after that will we be able to answer, meaningfully, whether Flugly is a good name for a Hollywood actress.

This is why I believe in the long, arduous, slow approach. If we solve Bongard problems, then we are on to something. Take a look at this paper in the journal “Artificial Intelligence”: http://tinyurl.com/6duzfm

 

Hey mike, I never thought you could write such a comment. You think you’re on the top of the world because of this blog? If you say yes you’re wrong, because I came here because of visitors’ comments. You’re clearly poor in making judgement of anything. It is not necessarily that I mean it when I write such a thing. It is just like any other comment. I just give my idea no matter whether it is dumb or not.

 

No good. Tried what is the line separating asia from europe

took 5 minutes to get results back and not a single relevant answer on the first page

there are at least a dozen articles in wikipedia that answers this specific question, e.g., “To the east, Europe is generally divided from Asia by the water divide of the Ural Mountains, ….”

 

This is junk. They’ll be in the deadpool soon. I’m left thinking: Is this IT?

 

“[PowerSet] is a complete shift. You see this and you want to experience all content in this way,”

“And, as an introduction, it will drive huge investment in semantic and linguistic technology, just as investments were made in information retrieval and scalable databases in the past. People working in this space will be very marketable.”

Barney Pell, co-founder and CTO of Powerset.

These lunatic expectations are what will throw them in the DeadPool in less than a year.

 

PowerSet is one of many startups who have tried to implement natural language search over the years. These have never got very far, and I suspect PowerSet will suffer the same fate.

From first principles there’s nothing wrong with what PowerSet are attempting. It would sound like a great idea even to a non-naïve investor, and the results for some searches I entered were not half bad.

But the key question is how much natural language search can improve upon keyword search. Time and time again, the answer seems to be not much. Understanding sentences gives only a minimal “information gain” over more basic keyword matching. In other words, by looking for pages where the search terms appear in close proximity, you get a good approximation of looking for pages which answer the question. Throw in stemming (i.e. recognizing different forms of verbs and nouns) and you’re 98% of the way there.
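The proximity-plus-stemming baseline described above can be sketched in a few lines. This is only an illustration under stated assumptions: the stop-word list, the `crude_stem` suffix stripper, and the sample documents are all made up for the example, and none of it reflects how Powerset or Google actually rank results.

```python
import re

# Assumption: a tiny hand-picked stop-word list, just enough for the example.
STOPWORDS = {"a", "an", "the", "in", "of", "is", "when", "did", "what", "who"}

def crude_stem(word):
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stop words, then stem."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [crude_stem(w) for w in words if w not in STOPWORDS]

def smallest_window(doc_tokens, query_terms):
    """Length of the shortest token span containing every query term, or None."""
    needed = set(query_terms)
    if not needed:
        return None
    counts = {}
    best = None
    left = 0
    for right, tok in enumerate(doc_tokens):
        if tok in needed:
            counts[tok] = counts.get(tok, 0) + 1
        # Shrink from the left while the window still covers every term.
        while len(counts) == len(needed):
            span = right - left + 1
            if best is None or span < best:
                best = span
            left_tok = doc_tokens[left]
            if left_tok in needed:
                counts[left_tok] -= 1
                if counts[left_tok] == 0:
                    del counts[left_tok]
            left += 1
    return best

def rank(docs, query):
    """Return names of documents containing all query terms, tightest span first."""
    terms = tokenize(query)
    scored = []
    for name, text in docs.items():
        span = smallest_window(tokenize(text), terms)
        if span is not None:
            scored.append((span, name))
    return [name for span, name in sorted(scored)]

# Hypothetical sample documents.
docs = {
    "a": "A major earthquake hit Tokyo in 1923.",
    "b": "Tokyo is the capital of Japan. Earthquakes are common there, "
         "and one hit the city last year.",
    "c": "Tokyo has many restaurants.",
}
print(rank(docs, "when did earthquakes hit tokyo"))
```

Nothing here parses the question. The sentence that answers it simply happens to be the one where the stemmed query terms sit closest together, which is the approximation being argued for.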

For the natural language search startup, this is the eternal dilemma. The only way to transcend it is to go much deeper into semantics. For example: try to assimilate general knowledge (e.g. Cyc) into analyzing the content and question, look for synonymous terms and phrases, recombine content from multiple sources into the answer.

But these are exactly the sort of challenges that AI, after 50 years of research, has barely scratched. It’s a commonplace in machine learning: no matter which methodology you try, you hit a limit on what can be inferred algorithmically. Every data set has a fundamental signal-to-noise ratio, which it seems only wet software (read: the brain) can beat.

In the absence of this kind of breakthrough, PowerSet will always be disadvantaged compared to straightforward keyword search. The latter can index and retrieve content much faster, since it has to do much less processing per document. Google’s genius was inventing a new approach to relevance (link citation) that yielded a huge information gain for relatively little computational effort. I have yet to see a natural language search engine manage the same feat.
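
To make the “proximity plus stemming” approximation above concrete, here is a rough sketch in Python. It is only an illustration: the suffix-stripping `stem` function is a crude stand-in for a real stemmer, and the names `min_window` and `proximity_score` are made up for this example, not taken from any search engine.

```python
import re

def stem(word):
    """Crude suffix stripper (illustrative stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokens(text):
    """Lowercase, split on non-alphanumerics, then stem each word."""
    return [stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())]

def min_window(doc_tokens, query_terms):
    """Length of the smallest token window containing every query term, or None."""
    needed = set(query_terms)
    if not needed:
        return None
    counts, have, left, best = {}, 0, 0, None
    for right, tok in enumerate(doc_tokens):
        if tok in needed:
            counts[tok] = counts.get(tok, 0) + 1
            if counts[tok] == 1:
                have += 1
        # Shrink the window from the left while it still covers all terms.
        while have == len(needed):
            width = right - left + 1
            if best is None or width < best:
                best = width
            ltok = doc_tokens[left]
            if ltok in needed:
                counts[ltok] -= 1
                if counts[ltok] == 0:
                    have -= 1
            left += 1
    return best

def proximity_score(document, query):
    """1.0 when the stemmed query terms appear side by side, 0.0 when one is missing."""
    query_terms = set(tokens(query))
    window = min_window(tokens(document), query_terms)
    return 0.0 if window is None else len(query_terms) / window
```

A page where the stemmed terms sit next to each other scores close to 1.0 and a page where they are scattered scores lower, which is the cheap approximation of “this page answers the question” that the argument relies on.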

 

You see mike, both @69 and @70 said Powerset will be in the deadpool soon. It is not necessarily true but it is what is inside their minds. So, same goes with my comment.

 

Bye Bye Mahalo.

 

God, if I had a fraction of those millions of dollars for my research!

 

PowerSet,

We will always have Paris.

 

@Michael. I don’t believe that everything that isn’t critical on this blog is a plug. The coverage on this company is just too shady… the timing (looking for funding at the moment), lack of a decent product, over coverage, and under coverage of the product’s critical faults. It is just plain fishy.

Anyways, don’t be offended. People (like me) who believe there are plugs in TC, are obviously still reading your blog. You are doing something right.

 

Wow, I tried it, and it’s pretty weak. From their supposed strengths, I would think that searching “Which superbowls did the 49ers win?” would return the exact answer. Try it, not even close! Now, when I tried “Who had affairs with Bill Clinton?”, the second hit had the answer. However, that’s too easy :-)

I have to admit, I am hoping PowerSet fails. I met one of the founders at the iphone devcamp and found him to be an egotistical idiot.

 

@Michael,

This blog needs a “PageRank for People” or, in other words, a “TrustRank.”

Let the initial set of “Trust Makers” be yourself (since it’s your blog) and a few people whom you respect but don’t always agree with. A “trust metric” for blog commentators that allows the blog owner to define the initial set of trusted people would give you a way to rank replies, thus giving you the ability to demote ones from so-called “trolls” and promote ones from trusted people.

It is effectively a PageRank but for People rather than web pages.
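
A rough sketch of how such a trust metric could work, in Python. The big assumption here is an “endorsement” graph (who vouches for whom, say via upvotes), which is not something this blog actually has. The function name `trust_rank` and the damping value are likewise just illustrative choices. Trust starts at the seed set and flows along endorsements, PageRank-style:

```python
def trust_rank(endorsements, seeds, damping=0.85, iterations=50):
    """Seeded PageRank over commenters.

    endorsements: dict mapping a commenter to the list of commenters they
                  vouch for (hypothetical data, assumed for this sketch).
    seeds:        list of distinct, initially trusted commenters.
    """
    people = set(endorsements) | set(seeds)
    for targets in endorsements.values():
        people.update(targets)

    # All initial trust sits with the seed set chosen by the blog owner.
    seed_weight = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in people}
    trust = dict(seed_weight)

    for _ in range(iterations):
        # A fixed share of trust is always reset to the seeds.
        nxt = {p: (1 - damping) * seed_weight[p] for p in people}
        for person in people:
            targets = endorsements.get(person, [])
            if targets:
                share = damping * trust[person] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Nobody endorsed: hand the trust back to the seeds.
                for s in seeds:
                    nxt[s] += damping * trust[person] / len(seeds)
        trust = nxt
    return trust
```

Commenters who are only endorsed by each other, with no chain of endorsements leading back to a seed, end up with zero trust, which is what lets the owner demote replies from a ring of trolls.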

If you have a couple million users, you need a way to filter shit (no pun intended)

I don’t think you can pick and choose who is welcome on your blog as that would be biting the hand that feeds it.

You can definitely implement a PageRank for People, a TrustRank of sorts.

But then you’ll have a Michaelopoly.

Maybe the viral nature of this blog is mainly due to the fact that you’re wide open for anyone to take a shot at you and you even purposely leave such posts for everyone to see, thus creating more of a mosh pit effect.

So maybe what you need is not order but controlled chaos, which you HAVE already.

This blog would not be as popular without these mosh pits.

There’s a great method to the madness.

 

@53:subset
You have a good point. But Google site restricted to wikipedia still has the benefit of anchor text from outside the wikipedia corpus pointing at it.
Which is all part of the challenges that face Powerset or any startup attempting to challenge a major search engine (even on a restricted data set).

A more complete web graph is a very useful resource for calculating anchor text scores, better pagerank estimates etc.

 
Michael Jordan Mother - May 12th, 2008 at 12:59 am PDT

Powerset is by far the WORST search engine. EVER.

Search for ‘michael jordan baseball’

Lame results compared to Google. What a waste of my 30 secs.

 

I kinda liked powerset’s approach and the results I got for my few searches. Was expecting enthusiastic comments and was very surprised at the near-unanimous opinion that it sucked.

Some queries I tried:
powerset: when was gandhi born?
google: when was gandhi born? site:wikipedia.org

Notice the highlighted terms and the amount of highlight.

powerset: hitler and gandhi
google: hitler and gandhi site:wikipedia.org

I was looking for the letter and got it in the first result.

powerset: when did the beatles break up
google: when did the beatles break up site:wikipedia.org

Look at first and fourth result on powerset and compare this with google’s results. powerset’s is way better.

Someone suggested above “who is the prime minister of canada”.
Try this: powerset: “who is the prime minister of canada”
google: who is the prime minister of canada site:wikipedia.org

Powerset has the answer in sixth result. Google has it vaguely in eighth.
Try “who is the current prime minister of canada” and both do fine.

Of course it sucks at some queries, as mentioned above, but it’s not total crap as some make it look.

One thing I have noticed powerset not handling well is the “-” character. It counts it like a whitespace char. It interprets “non-violence” as violence, suggesting in facts, “Gandhi practiced violence” “Gandhi advocated violence” :P

And “when did the beatles breakup” doesn’t yield any good results, while “when did the beatles break up” gives what one was looking for.
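
The hyphen behavior described above is what you would expect from a tokenizer that treats “-” as whitespace. That is a guess at the mechanism, not a description of Powerset’s actual pipeline. A small Python sketch of the difference, with made-up function names, plus one simple way to make “break up” and “breakup” match:

```python
import re

def naive_tokens(text):
    """Treats '-' like whitespace, so 'non-violence' becomes 'non' + 'violence'."""
    return re.findall(r"[a-z0-9]+", text.lower())

def hyphen_aware_tokens(text):
    """Keeps hyphenated compounds such as 'non-violence' as a single token."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())

def with_joined_pairs(tokens):
    """Adds the concatenation of each adjacent pair, so 'break up' also yields 'breakup'."""
    joined = {a + b for a, b in zip(tokens, tokens[1:])}
    return set(tokens) | joined

print(naive_tokens("Gandhi practiced non-violence"))
print(hyphen_aware_tokens("Gandhi practiced non-violence"))
print("breakup" in with_joined_pairs(naive_tokens("when did the beatles break up")))
```

With the naive version, anything that later ignores the stray “non” is left with plain “violence”, which would explain the “Gandhi practiced violence” suggestion.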

 

interesting grouping of IP addresses associated with a number of comments.

 

Maybe there’s so much discussion here because people are passionate about search. I read Michael’s blog because I love new technology and anything happening in the tech world. Thanks TechCrunch!

 

There’s a disproportionate amount of manipulative, passive-aggressive and tit-for-tat behavior going on that keeps this blog growing.

Maybe what it needs is not a trust metric but a therapist…

 

@Arrington: Can you elaborate?

 

Hmm, wikipedia can do this very same thing.

It will feel good with wikipedia’s original website design and with less ajax use.

 

That was my first comment on your blog, Mr.Michael Arrington. Great work, will remain a fan.

(funny this built-in spellchecker from firefox did not pick up the word arrington by default lol)

 

It seems like people in this post keep on comparing powerset to Google. These people do realise it’s only searching wikipedia articles, don’t they? @50….

 

really bad search.. period

 

Maybe we should compare Powerset with another startup…

Let’s take for example True Knowledge (www.trueknowledge.com).

Question: How old is Barack Obama?
Answer: 46 years, 9 months and 8 days old

Not a bad answer, is it? :) Now try that with Powerset or Google………..
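
For what it is worth, the arithmetic behind an answer like that is easy to reproduce once you have the structured fact. A Python sketch, assuming a birth date of August 4, 1961 and a query date of May 12, 2008 (neither date is stated in this thread, and the function names are made up for illustration):

```python
import calendar
from datetime import date

def add_months(d, n):
    """Shift a date by n months, clamping the day to the target month's length."""
    total = d.year * 12 + (d.month - 1) + n
    year, month0 = divmod(total, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def age_ymd(born, today):
    """Age as (years, months, days); assumes today is not before born."""
    months = (today.year - born.year) * 12 + (today.month - born.month)
    # Step back one month if the last monthly anniversary has not arrived yet.
    if add_months(born, months) > today:
        months -= 1
    anchor = add_months(born, months)
    return months // 12, months % 12, (today - anchor).days

# Assumed dates, see the note above.
print(age_ymd(date(1961, 8, 4), date(2008, 5, 12)))  # (46, 9, 8)
```

The hard part for an engine like this is obviously not the subtraction but knowing that “how old is” maps to a date-of-birth fact in the first place.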

 

Q:who directed titanic
A:James Cameron

Q:who directed titanic and terminator
A:nothing

I can’t see where the NLP is.

 

Jack - A different impression - May 12th, 2008 at 4:31 am PDT

I’d like to point out some interesting PowerSet features I’ve noticed. Hopefully this will be more useful than some of the negative comments here.

1) Fact extraction from Freebase
I’ve noticed PowerSet tries to answer your question directly from Freebase if possible. Some examples illustrate this point:

- ‘when was claude monet born’ http://tinyurl.com/4f3p5k
- ‘when was impression, sunrise painted’ http://tinyurl.com/42mem8
- ‘who are rajiv gandhi’s parents’