Powerset
by Erick Schonfeld on April 22, 2009

Regular search engines such as Google and Yahoo use statistics to make sense of the Web. They count links, keywords, and other items on a page to determine its rank in search results. Semantic search engines try to actually understand the meaning of the words found on the Web and other documents to bring back the most relevant results to a query. Microsoft bought Powerset for $100 million to gain semantic search expertise, but so far all it can search is Wikipedia. Hakia, Textwise, and other startups are also working on semantic search. Now comes NetBase, which brings a slightly different approach that it says can scale to the entire Web.

NetBase has been around for a while. Originally called Accelovation, it has raised $9 million in two rounds of venture funding over the past four years, has 30 employees, and counts among its current customers P&G, Caterpillar, 3M, BP, Kraft, BASF, and Goodyear. It is now changing its name and offering its core semantic indexing technology as a platform for other companies to build their own products. Already, scientific publisher Elsevier uses NetBase to power its Illumin8 research tool for searching scientific articles, patents, and Websites.

NetBase takes a sophisticated linguistic approach, actually diagramming sentences to determine the relationship between words and phrases. It does particularly well with causal relationships, allowing it to tease out cause and effect from raw text.

by Michael Arrington on September 17, 2008

Microsoft promises that this is just the beginning of the integration with the recently acquired Powerset, but incorporating better Wikipedia clips into Live Search is a far cry from the original promise of the next generation search startup: true natural language search.

Instead we have more Live Search results with dedicated answers (not sure why Powerset was needed for this), and better Wikipedia results (“Since Wikipedia articles show up in a large percentage of Live Search queries, it’s important that the captions are top notch.”).

Interview With Barney Pell and Ramez Naam About Microsoft’s Powerset Acquisition: Integration By End Of Year
by Michael Arrington on July 2, 2008

I spoke with Powerset cofounder/CEO Barney Pell and Microsoft’s Live Search General Program Manager Ramez Naam shortly after Microsoft’s announcement of their acquisition of Powerset earlier today.

Microsoft intends to use Powerset’s natural language search technology as a major differentiating factor v. no. 1 search player Google (see our recent coverage of Live Search Cashback, another Microsoft search effort aimed at getting more market share).

TechCrunchIT goes into detail on how effective Powerset may be as a weapon. But a few things are clear – the resource limitations (cash and computing resources) that slowed Powerset’s development are now history. The relevance problem is less important since Microsoft’s core search relevance is quite good. And users really seem to like the beta launch of Powerset even with the limited dataset.

Naam says 5% of searches contain elements of natural language that keyword based search algorithms don’t handle well, and there’s an assumption that as better results are returned, more people may start to simply type a normal sentence instead of a couple of keywords. Microsoft will integrate at least parts of Powerset technology into Microsoft Live Search by the end of the year, Naam says. I expect we’ll be hearing a lot more about natural language search coming out of Microsoft shortly.

The full interview transcript is below, and you can listen to the MP3 over at TalkCrunch.

Ok, Now It’s Done. Microsoft To Acquire Powerset
by Michael Arrington on July 1, 2008

Microsoft will announce today that they have acquired San Francisco based semantic search engine Powerset. The acquisition price is not being disclosed, but our understanding from sources close to the deal is that the previously rumored $100 million is “roughly accurate.”

In May we reported that Powerset was in acquisition discussions with Microsoft and was hoping to bring another bidder to the table. Google was the likely candidate, but they have publicly dismissed the notion of contextual search as a revolutionary step forward. Microsoft, which is clearly interested in improving its search market share, turned out to be the best fit.

Rumors resurfaced last week about the imminent deal.

Powerset recently launched a showcase for its semantic search product, although they lacked the funds to do a full web index to prove out the product. As part of Microsoft, they won’t have that problem any longer. Now they just have to fight the bureaucracy to make sure the project continues to move forward.

The company had raised $12.5 million in venture financing, plus another $8 million or so in convertible debt as bridge financing. That means investors will get a decent return (but not a home run), and the founders and employees will also take some real money off the table.

We first covered Powerset in October 2006, and they were a TechCrunch40 company.

Update: Microsoft announcement is here, Powerset is here.

Microsoft To Buy Powerset? Not Just Yet.
by Michael Arrington on June 26, 2008

VentureBeat is reporting that Microsoft has agreed to buy semantic search engine Powerset for somewhere around $100 million, which is the price we previously reported was being offered to the company.

Our sources have been saying this deal is highly likely since May, but it hasn’t actually been signed yet and could still be disrupted by the ongoing Microsoft-Yahoo negotiations. Dave Wehner, a Managing Director at investment bank Allen & Co. (he’s the guy who sold Bebo for $850 million to AOL), is representing Powerset in the deal.

Powerset debuted at TechCrunch40 last fall and opened a showcase of its technology to the public just last month.

Powerset has raised around $12.5 million in venture capital, and is rumored to have taken another $8 million or so in convertible debt as bridge financing.

Powerset Unveils iPhone-Optimized Wikipedia Search
by Jason Kincaid on June 18, 2008

Powerset, the natural language search engine that partially launched in May, has released a mobile version of their site that allows users to quickly search Wikipedia from their iPhone.

Since the release of the iPhone a number of sites including iPodia and Wapedia have released optimized versions of Wikipedia (though none actually made by the online encyclopedia). These sites reformat Wikipedia articles to better fit the iPhone’s screen while shrinking (or removing) images to conserve bandwidth.

What sets Powerset apart, and may make it the premier way to look up information from the iPhone, is the search engine’s ability to find both the relevant article and the exact passage that pertains to the search query. Even though the iPhone sports a relatively large screen, browsing through large amounts of text can still be a pain, which makes this feature even more valuable.

Powerset has lofty goals, aiming to use their natural search technology to overtake traditional search giants like Google. So far the company is only using Wikipedia for search results, so it’s hard to tell how well the technology will work once Powerset finally indexes the web, but for the time being it may well be the best reference tool on the iPhone.

You can watch a brief demo in the video below:


Powerset iPhone Web App Demo from officialpowerset on Vimeo.

Stealth Search Engine Blekko Gets Money From Marc Andreessen, SoftTech
by Michael Arrington on May 14, 2008

2008 is the year of the search engine startup. Hot on the heels of Powerset’s partial launch earlier this week, stealth search engine Blekko (no logo, no website, just this and, apparently, some technology) raised a second round of financing.

The company raised $3 million in equity at a $23 million post-money valuation. All previous investors participated, and new investors Marc Andreessen, SoftTech VC and Western Technology Investment also invested. They simultaneously closed a $1 million lease line with Western Technology Investment for server leases.

We don’t know much yet about Blekko, which was founded by former Topix founder/CEO Rich Skrenta. The company says they won’t be launching anything to the public until 2009. See our original post on Blekko for more background information.

See our coverage of Cuill as well, another hot stealth search startup we’re tracking.

Powerset Launches Showcase For User Search Experience
by Michael Arrington on May 11, 2008

Today marks another milestone for San Francisco based contextual search engine Powerset. They’ve launched a showcase for their user search experience – effectively the search engine minus the web crawl. For now, Powerset queries only Wikipedia and augments results with data from Freebase. The product launch comes just a day after reports that the company is being shopped to potential buyers by investment bank Allen & Co.

I have been able to test Powerset via their labs site for the last few weeks. I wrote about it last month, and the version that just launched is very similar.

There is no way to look at Powerset today and determine if it can be as disruptive to search as Google was when it launched almost a decade ago. That’s because it only queries Wikipedia, and so there is little need for proper ranking algorithms to sort the good from the bad results.

But what users can see is how effective a way it is to gather information quickly. For someone doing research, Powerset effectively removes a number of steps towards getting to the final information. It is particularly effective when the information needed is on many different web pages.

For example, a query on Powerset of “when did earthquakes hit tokyo” yields stunning results. Try this query at Google or even Wikipedia to compare – instead of just picking out keywords that are in your query and on a web page, Powerset is actually making some sense of the content included in the Wikipedia pages:

The way that Powerset returns queries means that answers are often found in the result snips, as above. They are also structuring a lot of the Wikipedia (and already structured Freebase) data and inserting it into results. So a search for “Bill Clinton” shows results, but also shows Freebase structured data along with additional query refinements to get to more information. The important thing below isn’t the structured data in the results, it’s the fact that you can click on the action words and drill down into very specific queries (to find, for example, what bills he signed, or which Supreme Court justices he nominated, or who he slept with).

Powerset is indexing web pages much differently than normal search engines, which generally just record content to match against keyword queries. Instead, Powerset is trying to understand the content on the page so that it can be matched meaningfully to queries later. Even queries that don’t use matching words.

Indexing the web is expensive, though, and Powerset’s way of doing it requires even more time and computing power dedicated to a web page. That’s why they say they aren’t indexing the entire web yet – the company has raised just $12.5 million (plus another $8 million or so in bridge loans from investors). To index the web will require a new round of financing (see the first paragraph above about their sale/financing efforts).

Powerset has taken a lot of criticism for their goal of trying to redefine how people search the web (including from us). But their lofty goals are what make Silicon Valley so great – succeed or fail, Powerset is trying to do something pretty spectacular.

The company has also created a demo overview video – see below.

Powerset’s Dilemma: Go For It, Or Sell
by Michael Arrington on May 10, 2008

San Francisco based search startup Powerset will be launching shortly. For now, Powerset will query only Wikipedia and Freebase. But as I said when the product was demo’d to me a few weeks ago, it is compelling nonetheless: “When I tested the service I had something very similar to the “Aha!” feeling that ran through me the first time I ever used Google. In short, it is an evolutionary, and possibly revolutionary, step forward in search.”

But now the company may have to make a hard decision: sell now to one of the big Internet players looking for a point of differentiation in search, or take the risk of going it alone and possibly getting a huge, multi-billion dollar payoff down the road.

According to our sources, Powerset is exploring both options. They hired Dave Wehner, a Managing Director at investment bank Allen & Co. (he’s the guy who sold Bebo for $850 million to AOL, and is working on LinkedIn’s huge financing), to represent them in a possible sale or financing.

CNET is reporting today that Microsoft may be bidding for the company. According to our sources, those discussions have been going on for well over a month, and their most recent bid is “around $100 million.”

That probably won’t be enough to convince Powerset and their investors to sell. The big question is whether Google will step in to try and keep Powerset out of Microsoft’s hands, and start a real bidding war. That could drive the price significantly higher. Google, however, has publicly dismissed the notion of contextual search as a revolutionary step forward.

Whether that’s true or not is yet to be seen. But Powerset may find itself as a valuable chess piece in the emerging search war between Google and Microsoft. And if Google bets wrong, they could find their commanding lead in search eroded over time. A relatively small acquisition to keep Powerset out of Microsoft’s hands, even if just a hedging move, may suddenly be attractive to them.

Blodget Says Facebook Is Only Worth $9 Billion, Hypothetically Speaking
by Erick Schonfeld on April 28, 2008

Putting a value on private companies is hard enough for insiders and venture capitalists who have full access to the company’s financial statements. When outsiders try to do it, even well-informed ones, it is nothing more than a guessing game. But it is nonetheless perhaps one of Silicon Valley’s favorite parlor activities.

Today, Henry Blodget & Co. at Silicon Alley Insider try to peg valuations on 25 private Web companies. Facebook is at the top of the list, but it is valued at $9 billion instead of the $15 billion that Microsoft’s investment put on the company. Why? Because everyone knows that the $15 billion is too high, so SAI decided to apply a 25X multiple on Facebook’s 2008 revenue forecast of $350 million. Does that make its valuation correct? Probably not. But in the absence of any true market pricing, anyone can go ahead and make a guess.
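
The arithmetic behind that figure is easy to check. The following is a minimal sketch, not SAI's actual model: the function name `implied_valuation` is made up for illustration, and the only inputs are the 25X multiple and the $350 million forecast quoted above.

```python
def implied_valuation(revenue_forecast: float, multiple: float) -> float:
    """Return a simple revenue-multiple valuation, in the forecast's units."""
    return revenue_forecast * multiple


# Inputs quoted in the post; everything else here is illustrative.
facebook_2008_revenue = 350e6  # SAI's 2008 revenue forecast for Facebook
sai_multiple = 25              # the 25X multiple SAI applied

valuation = implied_valuation(facebook_2008_revenue, sai_multiple)
print(f"${valuation / 1e9:.2f} billion")
```

The product comes to $8.75 billion, so the $9 billion headline number is presumably a rounded version of it.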

The same goes for any of the valuations on the SAI 25 list, which puts Wikipedia’s worth at $7 billion, Craigslist’s at $5 billion, Mozilla’s at $4 billion, LinkedIn’s at $1.3 billion, Ning’s at $560 million, RockYou’s at $325 million, and Spot Runner’s at $250 million. Note that three of the top five (Wikipedia, Craigslist, Mozilla) are essentially not-for-profits sitting on very valuable assets. The valuations for those three are based on what they would be worth if they were run differently with an eye towards maximizing revenues, which, of course, could impact how consumers interact with them, which in turn would impact their valuations.

Another 25 startups make up the contenders list, which includes Federated Media ($245 million), Yelp ($225 million), Meebo ($220 million), Mahalo ($150 million), Digg ($125 million), Etsy ($115 million), Powerset ($80 million), and Twitter ($75 million). A full list that changes dynamically every 20 minutes, based on changes in the Nasdaq, can be found here (although exactly how the valuations are linked to the Nasdaq is never clearly explained).

Some of these valuations have more merit than others. Some have none whatsoever. For instance, SAI arrives at its $125 million valuation for Digg by “splitting the difference” between a $200 million buyout rumor we reported and the $60-to-$80 million that Kara Swisher came up with. Splitting the difference between two rumors is not exactly the height of financial analysis.

But what are you gonna do? At least SAI acknowledges that the list is an imperfect work in progress. Don’t get too caught up in the actual numbers. It is more useful really as a starting point to think about relative valuation between different startups. Is Meebo really worth three times as much as Twitter? Is Ning worth as much as Slide? Let the parlor game begin.

Powerset Will Launch In Coming Weeks
by Michael Arrington on April 5, 2008

San Francisco based Powerset will be publicly launching a long-awaited beta version of the service in the coming weeks, the company told me yesterday. They are working on a new kind of search engine that will understand natural language searches and compete with keyword matching engines that dominate search today.

An early version of the search engine, which was demo’d to me yesterday at their offices, has been available to some users of their Powerlabs site. But for the most part, it’s been kept very quiet.

The early version of the service will serve as a showcase for the user interface and engine itself, but it will not have a full web index behind it. For now, Powerset will query only Wikipedia and Freebase. But when I tested the service I had something very similar to the “Aha!” feeling that ran through me the first time I ever used Google. In short, it is an evolutionary, and possibly revolutionary, step forward in search.

I’ll temper that statement since the company is not putting anything more than a tiny index of two sites behind the service for now. In particular, the fact that Powerset doesn’t have to bother with spam control and other relevance issues (which is what made Google so great when it launched), means it can’t yet be considered any kind of challenger in the search space. But anyone who uses it will be able to see the potential value of the engine when it is placed in front of a full web index.

For now the company is keeping specific features of the engine confidential, but I can say it has evolved significantly since a screen shot was released in mid-2007.

In preparation for the launch, some of the Powerset team have vowed not to shave until the product is released. They are chronicling their facial hair adventure on a site called Powerstache, which has been covered by Jessica Guynn at the LA Times.

Rumors have also been swirling around the company in general. A number of sources have said that Powerset is pitching for additional capital. And the company also appears to have put plans to hire a new CEO on hold – founder Barney Pell is still firmly in charge at the company.

Powerset is one of three new search engines that we’re keeping a close eye on. The other two, Cuill (pronounced “cool”) and Blekko, are still deep in stealth mode.

Microsoft Blews Brings Back Memories Of Rocket Pops At The Beach
by Michael Arrington on March 6, 2008

Ok, so that isn’t an actual picture of the new Microsoft Blews news aggregator that was announced by Microsoft Research today, but tell me that the screen shot (see below) doesn’t bring back memories of eating Rocket Pops on the beach as a child (or wherever you ate them).

But back to Blews. It’s a news aggregator (see Techmeme and about 45 others, including this gem), but it goes beyond mere clustering of stories to show what’s important right now based on who’s linking to what in near real time. Blews, which is only looking at political news, also tells you the bias of the links in to a story:

BLEWS uses political blogs to categorize news stories according to their reception in the conservative and liberal blogospheres. It visualizes information about which stories are linked to from conservative and liberal blogs, and it indicates the level of emotional charge in the discussion of the news story or topic at hand in both political camps. BLEWS also offers a “see the view from the other side” functionality, enabling a reader to compare different views on the same story from different sides of the political spectrum. BLEWS achieves this goal by digesting and analyzing a real-time feed of political-blog posts provided by the Live Labs Social Media platform, adding both link analysis and text analysis of the blog posts.

Here’s what all that looks like:

Liberal links are blue (raspberry) and on the left, conservative links are red (cherry) and on the right. The middle is the story itself in white (lemon). The dots around the edges suggest the emotional charge of the commentary, which can drip off of the Rocket Pop in very hot weather.

I note that no one on the team (Michael Gamon, Sumit Basu, Dmitriy Blenko, Danyel Fisher, Matthew Hurst and Christian Konig) is a user interface specialist or web designer.

Putting aside the UI, which is hard to do, the artificial intelligence behind Blews could be interesting. It is very hard to get a machine to decipher emotion and meaning from raw text unless they are doing mere keyword searches (see, for example, Powerset). Microsoft is calling this hard bit “detecting emotional charge.” If they’ve got it right, or are close, there are an unlimited number of potential applications for the technology.

As an aside, this somewhat reminds me of Scout Labs, a startup we wrote about last December. Scout Labs helps brand marketers track commentary on their brands, and tries to decipher emotion towards that brand as well.

Find Something That Is “X” And Has “Y” With Circos
by Nick Gonzalez on January 28, 2008

Keyword search gets you pretty far when looking for pure information, but doesn’t help much on more qualitative searches like trying to find the hippest restaurant in SOHO. Searches like the latter rely on the opinions of people, not webmasters, which is one of the reasons Circos has launched their new qualitative search engine. The engine currently lets users search for hotels and restaurants by qualities like size, ambiance, or other qualities pulled from reviews from around the web. They have plans to expand to other categories in the future.

Circos is categorized under the ever expanding umbrella of semantic search engines, which currently includes the likes of Hakia, PowerSet, Kosmix, SemantiNet, Quintura, and TrueKnowledge. However, the engine is most like Kango, which has also taken on the task of categorizing hotels based on user reviews. VibeAgent also has a search engine for its own site that will search hotels based on qualities.

While Kango auto-generates tags after poring through user reviews, Circos lets users search for any qualities they’re interested in. The engine then grades and ranks the results by each quality on an “A” through “F” scale based on how well the description fits for reviewers. For example, a hotel reviewers feel is spacious would rate highly if searching for openness, but poorly if you’re looking for a tiny room.

As with most search engines, Circos’ real test will be whether its application draws users away from other hotel and restaurant sites with less sophisticated search engines. Currently there are a bunch competing in the space. However, Circos says their technology can easily be extended to other categories since their algorithm does all the tough work of pulling the most relevant qualities from reviews. If hotels and restaurants don’t appeal, another category may hold their home run.

Circos is angel funded, based in San Mateo, and has eight employees (4 in Singapore).

The Next Google Search Challenger: Blekko
by Michael Arrington on January 2, 2008

Rich Skrenta, who created the first computer virus (Elk Cloner), co-founded the Open Directory Project, and co-founded online news site Topix, may have bitten off the biggest challenge of his career – taking on Google. In search.

Skrenta left Topix last June. He started his new company, Blekko, almost immediately, along with five others from the Topix core team. They raised $2 million in seed funding in September from Baseline Ventures, two early Googlers (David DesJardins and Jeremy Wenokur), and the founding team.

The company is still deep in stealth and, apparently, working out of a garage in true startup style (see image below). The Blekko website, which today has nothing on it except a picture of a puppet created by Skrenta’s daughter, isn’t even close to having a landing page up, let alone the final product. But eventually Skrenta says they’ll launch a full scale search engine to compete with the big guys.

Skrenta, who’s very media savvy, won’t say much about how he’s going to tackle search (he’s not a fan of PageRank though: “PageRank wrecked the web. Google is the cause of all of this. and Google is going down with it.”). He says they are looking at improvements on the back end (indexing and query serving) as well as the user search experience itself. Beyond that, he says we have to wait. And it might be a long wait at that. The company, Skrenta says, may not have a public prototype available until 2009.

Normally an entrepreneur announcing they’re taking on Google with a six person team and just $2 million in funding would either be laughed at or ignored. In Skrenta’s case, he has proven himself more than once as capable of taking on big challenges and winning. This will be a company to watch, and speculate on, in 2008.

There are other promising search startups out there. Powerset, Cuill (we’ll be hearing more about them soon) and the upcoming Wikia Search Engine are all yet to launch. Mahalo is growing fast (but still tiny). Can anyone unseat Google? Perhaps not any time soon. But you don’t have to get much market share to be a huge winner in this space – every 1%, they say, is worth a cool billion dollars.

Google’s Norvig Is Down On Natural Language Search
by Erick Schonfeld on December 18, 2007

Don’t expect to see natural-language search at Google anytime soon. Despite the buzz of startups like Powerset and, to a lesser degree, true knowledge, Google’s head of research Peter Norvig pooh-poohs the notion that people are clamoring to write full sentences in search boxes. In a Q&A with Technology Review, he says:

We don’t think it’s a big advance to be able to type something as a question as opposed to keywords. Typing “What is the capital of France?” won’t get you better results than typing “capital of France.”

True, true. But he does acknowledge that there is some value in the technology:

We think what’s important about natural language is the mapping of words onto the concepts that users are looking for. . . . To give some examples, “New York” is different from “York,” but “Vegas” is the same as “Las Vegas,” and “Jersey” may or may not be the same as “New Jersey.” That’s a natural-language aspect that we’re focusing on. Most of what we do is at the word and phrase level; we’re not concentrating on the sentence. We think it’s important to get the right results rather than change the interface.

In other words, a natural-language approach is useful on the back-end to create better results, but it does not present a better user experience. Most people are too lazy to type in more than one or two words into a search box anyway. The folks at both Google and Yahoo know that is true for the majority of searchers. The natural-language search startups are going to find out about that the hard way. If Google doesn’t trounce them first.
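
The word-and-phrase-level mapping Norvig describes can be pictured with a toy lookup. This is only a sketch under loud assumptions: the `ALIASES` table and the `map_concepts` function are hypothetical, the entries simply mirror the examples in his quote, and nothing here reflects how Google actually does it.

```python
# Hypothetical alias table: surface phrase -> canonical concept.
# Entries are illustrative only and mirror the examples in the quote.
ALIASES = {
    "vegas": "Las Vegas",
    "las vegas": "Las Vegas",
    "new york": "New York",
    "york": "York",
    "new jersey": "New Jersey",
}


def map_concepts(query: str, max_len: int = 2) -> list[str]:
    """Greedy longest-match lookup of query phrases in the alias table."""
    words = query.lower().split()
    concepts = []
    i = 0
    while i < len(words):
        # Try the longest phrase starting at position i first.
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in ALIASES:
                concepts.append(ALIASES[phrase])
                i += n
                break
        else:
            i += 1  # no known phrase starts here; skip this word
    return concepts
```

Trying the longer phrase first is what keeps “new york” from collapsing into “York.” A bare “jersey” is deliberately left out of the table, since, as Norvig says, it may or may not mean New Jersey.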

Founders Fund Closes $220 Million Second Fund
by Michael Arrington on December 17, 2007

San Francisco based Founders Fund launched in 2005 with a $50 million venture fund. They’ve had two liquidity events since then, and a handful of other very high profile investments (Facebook, Powerset, Ooma, Quantcast, Slide, Geni, Causes, etc.).

Today they will announce a second fund, Founders Fund II. It’s much larger – $220 million. And unlike the first fund, the money comes mostly from outside investors. The new fund will allow Founders Fund to make 15-20 new investments, including pro-rata investments in follow on rounds.

A couple of investments have been made out of the new fund, they say, but have not yet been disclosed.

Founders Fund partners have deep connections in Silicon Valley, which help with deal flow (Peter Thiel, founder and former CEO of Paypal, Ken Howery, founder and former CFO of PayPal, Luke Nosek, founder and former Vice President of PayPal and Sean Parker, founder and former CEO or President of Napster, Plaxo and Facebook). But they also approach deals differently than most other funds.

Sean Parker said today in a phone interview that a glut in venture capital, combined with reduced capital needs of most startups, has led to a shift in balance of power between entrepreneurs and VCs. Founders Fund recognizes that shift and has evolved to do deals a little differently because of it. For example, they invented and promote the issuance of a special class of stock, called Series FF, which allows entrepreneurs to take money off the table much earlier in their company’s lifecycle. They also allow significantly more liberal voting rights to founder board members than many other funds. See this article in the SF Chronicle earlier this year for more on how they do business.

Powerset Looking for a New CEO
by Erick Schonfeld on November 2, 2007

Natural-language search startup Powerset is going through some growing pains. Barney Pell is stepping down from the CEO spot. He will now become the CTO, and he and Powerset’s board will conduct a search for a new CEO. Powerset’s other founder and COO, Steve Newcomb, is not in the running for the top job. He has left the company.

At the Web 2.0 conference, Pell gave an impressive demonstration of Powerset’s search technology, although it was restricted to a limited data set. How the search engine will do against the entire Web, which is a much bigger technical challenge, has yet to be seen.

But this shakeup does raise a big question. Why step down as CEO and leave a huge leadership gap (with no COO either) before you find a new CEO to take things over? Perhaps this was done more for internal reasons. Announcing everything all at once sends a signal to employees about the direction of the company, and minimizes future surprises. The CEO search also indicates that Powerset may finally be ready to open up its search engine to the general public sometime next year. Or perhaps Powerset’s board has become impatient with the company’s progress and wants new leadership. You can read Pell’s explanation about the transition here. (You can read our previous coverage here).

Powerset Testing Search Results At Mechanical Turk
by Michael Arrington on October 21, 2007

A reader noticed that stealth search engine Powerset is using Amazon’s Mechanical Turk service to gauge user reactions to search results.

See the screen shot (click for larger view) – users are shown a query and a number of results and are asked to evaluate the relevancy of each result from five choices. In this case, the query is “revealing bikinis.” Users are asked to evaluate four sets of results within ten minutes, and are paid $0.02 for the effort.

The current batch of Powerset projects have run their course, and there are currently no other projects available on Mechanical Turk.

I spoke with Powerset CEO Barney Pell this evening who confirmed that they are using Mechanical Turk to get human feedback on search results. He says the results are not all Powerset generated – rather, they show results from Powerset, Google and others to see which users prefer for a given query. He also says this is an ongoing project, and new ones will be added soon.

Pell also said that Powerset plans to use Mechanical Turk over the long haul, even after launch. They’ll put actual user queries into Mechanical Turk in real time, add Powerset and competitor results and see which results people find more relevant. If results suggest Powerset isn’t more relevant, they’ll adjust their engine.
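To make the comparison Pell describes concrete, here is a minimal sketch of how side-by-side ratings might be tallied. It is not Powerset's actual pipeline: the engine names, the 1-to-5 rating scale, and the sample judgments are all hypothetical, and the functions `mean_relevance` and `preferred_engine` are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical judgments: (query, engine, rating on an assumed 1-5 scale).
SAMPLE_JUDGMENTS = [
    ("q1", "engine_a", 4),
    ("q1", "engine_a", 5),
    ("q1", "engine_b", 3),
    ("q1", "engine_b", 3),
    ("q2", "engine_a", 2),
    ("q2", "engine_b", 2),
]


def mean_relevance(judgments):
    """Return {query: {engine: mean rating}} from raw rater judgments."""
    buckets = defaultdict(lambda: defaultdict(list))
    for query, engine, rating in judgments:
        buckets[query][engine].append(rating)
    return {
        query: {engine: mean(ratings) for engine, ratings in engines.items()}
        for query, engines in buckets.items()
    }


def preferred_engine(scores):
    """Return {query: engine with the highest mean rating, or None on a tie}."""
    winners = {}
    for query, by_engine in scores.items():
        best = max(by_engine.values())
        leaders = sorted(e for e, s in by_engine.items() if s == best)
        winners[query] = leaders[0] if len(leaders) == 1 else None
    return winners


if __name__ == "__main__":
    print(preferred_engine(mean_relevance(SAMPLE_JUDGMENTS)))
```

Queries where a competitor wins (or where the result is a tie) are the ones an engine team would presumably look at first when tuning.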

Powerset also uses the EC2 computing service, another web service offered by Amazon. They recently released some of their internal growth models that allow people to compare the relative costs of EC2 to building out a real data center.

TechCrunch 40 Session 1: Search & Discovery
by Duncan Riley on September 17, 2007

Session one as follows, including our live notes.

Powerset

Powerset is a natural language search engine that can use everyday phrases and grammar to conduct more accurate web searches by understanding the search query and the pages it indexes. Parsing phrases and grammar theoretically produces better results because the engine has a better understanding of the search’s intended goal than with keywords alone. For instance, a Powerset search for “politicians who died in office” returns information on the subset of politicians who died in office, rather than a group of pages that ranked highly for the phrase.
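The difference between the two approaches can be shown with a toy sketch. This is not how Powerset works internally (the company has not published that here); it simply contrasts a bag-of-words score with a lookup over structured facts. The `Fact` records, the placeholder people, the page texts, and the hand-written `CONSTRAINTS` (standing in for what a real parser would derive from the query) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    relation: str
    value: str


# Hypothetical toy facts; the names are placeholders, not real people.
FACTS = [
    Fact("Person A", "occupation", "politician"),
    Fact("Person A", "died_while", "in office"),
    Fact("Person B", "occupation", "politician"),
    Fact("Person B", "died_while", "retired"),
    Fact("Person C", "occupation", "actor"),
    Fact("Person C", "died_while", "in office"),
]

# Hypothetical page texts for the keyword baseline.
PAGES = {
    "trivia-page": "politicians who died in office quiz office politicians",
    "person-a-bio": "Person A served two terms and passed away before the second ended",
}


def keyword_score(query, text):
    """Count how many query words appear in the text (bag-of-words match)."""
    words = set(text.lower().split())
    return sum(1 for token in query.lower().split() if token in words)


def match_constraints(facts, constraints):
    """Return subjects that satisfy every (relation, value) constraint."""
    matches = None
    for relation, value in constraints:
        subjects = {
            f.subject for f in facts
            if f.relation == relation and f.value == value
        }
        matches = subjects if matches is None else matches & subjects
    return sorted(matches or [])


QUERY = "politicians who died in office"
# Hand-written stand-in for what a real parser would derive from QUERY.
CONSTRAINTS = [("occupation", "politician"), ("died_while", "in office")]

if __name__ == "__main__":
    for name, text in PAGES.items():
        print(name, keyword_score(QUERY, text))
    print(match_constraints(FACTS, CONSTRAINTS))
```

The keyword score favors the page that merely repeats the phrase, while the structured lookup returns only the subject that is both a politician and died in office. The hard part, which this sketch skips entirely, is turning free text into those facts and constraints.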

[Screenshot: Powermouse]

Powerset presentation begins: talk about semantics and search, “we parse the web”. Natural language search.

Announcement: Powerset Labs, where users can explore tech demos, share ideas, feed the learning engine and “improve your search karma”.

Demonstration of natural language queries with a social voting style feature. Touches of other sites

Demonstration of Powermouse (see screen shot), information is pulled from Wikipedia into a semantic index.

TC40 attendees will be among the first in the private beta.

Overall: tough sell in the search vertical, but interesting take. Great start to TC40.

Cognitive Code

Cognitive Code makes artificially intelligent user interfaces. Their main product is the SILVIA (Symbolically Isolated, Linguistically Variable, Intelligence Algorithms) platform, which can add a human-like artificially intelligent interface to nearly any digital device. The SILVIA platform can learn and converse in natural language to carry out tasks for the user. Potential applications include children’s digital toys and personal assistants.

Flagship product: “SILVIA platform”, Symbolically Isolated, Linguistically Variable, Intelligence Algorithms. Layman’s terms: AI.

Demonstration with AI on the screen; the AI system is having a conversation with one of the Cognitive Code team. A couple of bugs in the live demo, but pretty cool.

Uses include embedding in toys, phones, websites “unlimited uses.” First major target market is “smart toys.”

Clever idea, if they can pull it off we’re seeing the future of toys.

CastTV

CastTV is trying to build one of the web’s best video search engines by creating a rich index of contextual data about videos and an easy-to-use interface for searching them. The engine pieces together context for a video based on its metadata, the content surrounding it, and the content of pages linking to the video. Notably, CastTV also searches paid video services such as Apple iTunes. Their user interface allows users to sort results by show (to weed out non-relevant stuff), by host (such as iTunes, CBS Innertube, etc., to focus on a favorite service provider), by date, relevance, price, etc.

Presentation begins: CastTV doesn’t host videos, they index them.

Britney Spears video search compared across Google, Yahoo and CastTV: CastTV results are pitched as being better, more accessible, etc.

Colts Titans next example. CastTV is using smart clustering for results, pulling video from MSM and user generated content. Nice results, even if I have no interest in American Football :-)

FAROO

FAROO is a peer-to-peer web search engine that has no centralized index or crawler. Each web page visited by users is automatically included in the distributed index. Ranking of search results is based on distributed usage statistics of the web pages visited by FAROO users, which leads to a more democratic, user-centric ranking. FAROO also shares up to fifty percent of its advertising revenue with its users. The search engine uses privacy-protected behavioral targeting to increase conversion rates.
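As a rough illustration of usage-based ranking, the sketch below merges per-peer visit counts and orders candidate pages by the total. FAROO's real algorithm is not described here, so everything in it is an assumption: the peer data, the `.example` URLs, and the functions `merge_peer_stats` and `rank_by_usage` are hypothetical, and a real peer-to-peer system would not gather all counts in one place like this.

```python
from collections import Counter

# Hypothetical visit counts reported by three peers.
PEER_STATS = [
    {"a.example": 3, "b.example": 1},
    {"b.example": 4},
    {"c.example": 2},
]


def merge_peer_stats(peer_stats):
    """Sum per-peer visit counts into one Counter keyed by URL."""
    total = Counter()
    for stats in peer_stats:
        total.update(stats)  # Counter.update adds counts rather than replacing
    return total


def rank_by_usage(candidates, total):
    """Order candidate URLs by total visits, breaking ties alphabetically."""
    return sorted(candidates, key=lambda url: (-total[url], url))


if __name__ == "__main__":
    totals = merge_peer_stats(PEER_STATS)
    print(rank_by_usage(["a.example", "b.example", "c.example", "d.example"], totals))
```

A page nobody has visited (here `d.example`) sinks to the bottom, which is also why raw visit counts look like an easy target for manipulation unless something else guards against it.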

Interesting concept, P2P in a strict sense. Results are only pages that have been visited by users…I can’t help but think the SEO crowd is going to love this :-)

The presenter claims that the algorithms actually prevent manipulation: he doesn’t know the people I know. Nice results though.

Indexing via a desktop P2P client, demonstrated version on Windows. Faroo beta opens today.

Viewdle

Viewdle is a white-label platform for indexing, searching and monetizing video. The technology they are developing lets video producers algorithmically extract metadata from news, shows, movies, and Internet video. This is much more effective than the old method of text-based metadata indexing. Viewdle’s most notable feature is their facial-recognition technology that can create a “real-time index of true on-screen appearances”. They plan on building one of the largest databases of people-in-video references. Reuters is currently testing out Viewdle’s technology with their video news inventory by letting people search their catalog for specific people.

Demo starts with 2 minute demo video. Slick, we’ll see if we can get a copy.

The facial recognition is always an interesting concept, but I’m reminded of Riya. More Britney Spears examples, although they are pulling data from others in the video as well. It looks a step up from previous tech, particularly given it’s video they are scanning, not just pics.

Product: Top Chance, scans on criteria, including date. Popularity search includes total video time and when. Platform (presuming API) will also be released shortly to plugin widgets etc.

Expert Panel: Ryan Block, Chris Anderson, Marc Andreessen, Om Malik, and Marissa Mayer

First question, Marc Andreessen to Powerset: great question, how do you break out, APIs etc. Good response.

Chris Anderson: what are the advantages of the various products to the user

Faroo responds first: we are by the user, for the user, it’s good because “they are doing the search together”

Om Malik to Faroo: most P2P systems people turn off, how do you overcome that, also how do you seed the network?

Faroo: it’s not a problem…not a particularly good response.

Marissa Mayer wants to know about the video search startups, scaling etc…classic :-)

CastTV: we’re scaling, focus. Viewdle “we reference a point” hence can scale to billions, using “fusion engine”

Discussion continues around AI and natural language tools.

Jason asks Om: which one is the most viable. Om: CastTV. One to last: Cognitive Code. Middle of the road pick: Powerset. Faroo is “interesting,” Viewdle will be “acquired soon”

Jason Calacanis to Marissa Mayer: will people switch away from Google. Reply: most people use more than one search engine according to stats. Google’s advantage is being a one stop shop. JC: what did you think of CastTV, MM: nice interface, clustering for duplicate issues is good tech.

Marc Andreessen: I don’t want to be obsessed with distribution…but I am, how do companies deal with it
Powerset: we’re very aware of this…uploading to users (???), embedding on external sites (Google custom search style I’d think).

Conclusion: speaking to Nick and we agree that CastTV was the winner in a very competitive group, good tech which just works with a practical use. Cognitive Code had the coolest product, but the demo wasn’t great which lost it for them.

Powerset Parses Miss South Carolina
by Michael Arrington on September 3, 2007

In a less than shining moment, Caitlin Upton, the 18 year old Miss South Carolina Teen, answered a fairly simple pageant question with a nonsensical answer:

Q: Recent polls have shown a fifth of Americans can’t locate the United States on a world map. Why do you think this is?

A: I personally believe that U.S. Americans are unable to do so because, uh, some people out there in our nation don’t have maps and, uh, I believe that our, uh, education like such as in, uh, South Africa and, uh, the Iraq and everywhere like such as, and I believe that they should, uh, our education over here in the U.S. should help the U.S., uh, should help South Africa and should help Iraq and the Asian countries, so we will be able to build up our future for our children.

Not one to miss a PR opportunity, yet-to-launch natural language search engine Powerset took a shot at parsing her answer so that queries could be run against it. Based on the query “Who does education help?” the index returned the result “Americans.” That’s an impressive result, given the nature of the data being queried.

The test shows the potential usefulness of Powerset as a search engine. The query does not match the content based on a keyword match, and the answer can only be determined via a contextual analysis of the data.

Powerset tends to look very good in demos against a limited index, as the above example shows, but it still has to prove that it can index and analyze large chunks of the web to become a viable competitor to Google and other search engines. That’s going to be their biggest challenge (and cost). Powerset still has much to prove as they prepare to launch.
