Evri Launches Semantic Content Discovery Engine In Private Beta
by Jason Kincaid on June 24, 2008

Evri, the site that uses semantic connections between terms to help users discover related information, has launched in private beta. You can register for an invite here.

Evri founder Neil Roseman (former VP of Technology at Amazon) is quick to explain that it is not a search engine. Rather, it helps users find related information by analyzing text to determine relationships between related terms. For example, a search for Barack Obama would likely yield a visual graph linking him to the Democratic Party, his wife, and other senators, along with a succinct summary of his background. Unlike the human-powered search engine Mahalo, Evri is powered by an algorithm.

The site made its debut appearance at last month’s D6 conference, which you can watch below:

Comments

Been there, done that, moved on. Trust me, nobody cares, or only a negligible number of people care.

 

Neil Roseman (former VP of Technology at Amazon) is quick to explain that it is not a search engine. Rather, it helps users find related information by analyzing text to determine relationships between related terms.

Whatever one wants to call it, they’re pretty much the same thing. Online relationship matching, online product recommendation (e.g., Amazon), online item popularity ranking and online site search all use the same kind of algorithm; they differ only in the domain of application. A good example is the use of LSI (latent semantic indexing) in online product recommendation engines. LSI was first applied primarily in text search engines, and Neil Roseman’s description of his system is exactly what LSI does, i.e., it matches the relationships among terms in a corpus of documents to determine their similarities.

LSI has since found applications in online sentiment analysis, image retrieval systems (matching similar images), online product recommendation (similar to Amazon), and online person attribute matching (i.e., measuring how close the interests of different people in a social network are).

So Neil Roseman’s product is really search, aimed at a slightly different target domain from pure search. The algorithm is really a search algorithm, and I wouldn’t be surprised if he is using LSI (or one of its variants, since there are a few).
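For readers who have not met LSI, here is a minimal sketch of the idea in plain Python. It is an illustration only, not Evri’s method (which has not been disclosed). The toy corpus, the function names and the choice of k=2 are all assumptions made up for this example. It also avoids a library SVD: it finds the top singular directions by power iteration on the document-by-document matrix AᵀA, which is fine for a handful of documents but not how a production system would do it.

```python
import math

def term_document_matrix(docs):
    """Count matrix A with one row per term and one column per document."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    row = {word: i for i, word in enumerate(vocab)}
    A = [[0.0] * len(docs) for _ in vocab]
    for d, doc in enumerate(docs):
        for word in doc.split():
            A[row[word]][d] += 1.0
    return A

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_eigenpairs(G, k, iters=300):
    """Largest k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(G)
    G = [r[:] for r in G]  # work on a copy, because deflation modifies it
    pairs = []
    for _ in range(k):
        v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary non-zero start vector
        start_norm = math.sqrt(sum(x * x for x in v))
        v = [x / start_norm for x in v]
        for _ in range(iters):
            w = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        Gv = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(a * b for a, b in zip(v, Gv))  # Rayleigh quotient
        pairs.append((lam, v))
        for i in range(n):  # deflate: remove the component just found
            for j in range(n):
                G[i][j] -= lam * v[i] * v[j]
    return pairs

def lsi_document_vectors(docs, k):
    """Rank-k document coordinates: sigma_i * v_i[d], from eigenpairs of A^T A."""
    A = term_document_matrix(docs)
    n = len(docs)
    G = [[sum(r[i] * r[j] for r in A) for j in range(n)] for i in range(n)]
    pairs = top_eigenpairs(G, k)
    return [[math.sqrt(max(lam, 0.0)) * v[d] for lam, v in pairs] for d in range(n)]

# Hypothetical toy corpus: three political documents, two about Amazon.
docs = [
    "obama senator democratic party",
    "senator clinton democratic party",
    "obama president election",
    "amazon online retail books",
    "amazon books recommendation engine",
]
A = term_document_matrix(docs)
raw = [[r[d] for r in A] for d in range(len(docs))]
latent = lsi_document_vectors(docs, k=2)

print("raw cosine, doc 1 vs doc 2:", cosine(raw[1], raw[2]))
print("LSI cosine, doc 1 vs doc 2:", round(cosine(latent[1], latent[2]), 3))
print("LSI cosine, doc 0 vs doc 3:", round(cosine(latent[0], latent[3]), 3))
```

Documents 1 and 2 share no words, so their raw cosine similarity is zero. In the two-dimensional latent space they land close together, because both overlap with document 0. That indirect association between terms is the “relationship matching” the comment above describes.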

 

If nobody cares about Evri, where does that leave Mahalo?

I see Mahalo as web 0.5 and Evri as the 2.0 improvement. Of course now Web 2.0 is probably more of a slur than an accolade.

 

Neil Roseman was probably at Amazon at the same time as Dr. Ronny Kohavi, the former head of data mining there, who was spearheading the development of their online recommendation engine. Kohavi now works for Microsoft.

 

like powerset, this one will be good for a single or few well-groomed demos but little else. graph representations are also good for demos and little else. google already does related terms. so does yahoo. it isn’t a game changer…see “google suggest”

that said, evri probably had no problem shaking a few million out of the mouth-breather VCs for this talks-good premise…and it was likely fun to build it

 

anyone remember Google Sets? put in “barack obama” and “hillary clinton” and click submit. there’s your evri.

http://labs.google.com/sets

 

I’m a little angry…at Google. Where is the future of search? Sure, PageRank has probably improved considerably, but where is the JND, the Just Noticeable Difference? I have dreams and visions of such a company executing and leapfrogging search technology, browsing, interface and all of the above. Google is the DOS of search (ouch). Correct me if I’m wrong, but didn’t “I find what I’m looking for” drop down to 50% or so? I’m pretty sure this will drop further as I find more relevant info on Wikipedia (or Powerset), Mahalo (go Jason), and possibly Evri. Dammit, show me the interface!!!!!!!!

 

i’ve been listening to this guy talk for two minutes, now, and i still have zero idea what this non-search search engine thing does.

brilliant.

 

He was not clear on how you reach the website without a search box. Will this be an API embedded in some portal?

Like LinkedIn + Alpha (Yahoo)?

 

Small world!

We launched a prototype “semantic search engine” on Saturday on ESer.org.

We’re using a variation of probabilistic latent semantic indexing called latent Dirichlet allocation to search through a part-of-speech tagged corpus of the English Wikipedia. Our search engine is using the June ’08 version of Wikipedia, which can be downloaded as a ~40GB XML file.

The entire search engine is running on a two-year-old dual-core AMD Opteron 940 with 8GB of RAM. It’s using an Erlang Mnesia database (the entire site is written in Erlang) running on a RAID-0 array of Western Digital 320GB SATA disks (those are at least three years old).

The hosting bill is about $15/month. The hardware was free :)

The total time investment to create this prototype was about three weekends worth of coding.

The only other expense was some pizza :)

I sure hope Evri didn’t spend too much time or money doing this…

We’ve had about 1100 users, each running about 3.2 searches, since we launched on Saturday.

However, I do not think that the best application for LDA, PLSI, or SVD is web search.

This class of algorithms is probably much more useful for mining private datasets. Therefore, we’re currently working on open-sourcing our code (after we’ve cleaned it up a bit) under GPLv2 on SourceForge. We think a debugged implementation of semantic search and algorithmic recommendations might be much more useful to people who want to mine their own private MySQL/PostgreSQL databases and/or private intranets (specifically, Windows file shares), that is, datasets that lack the link graph metadata of the web.

We should have the source code for ESer.org available for download within the next few weeks.

Please let us know if there are any specific data sources (other than MySQL and intranet shares) that you would like us to include support for out-of-the-box.

Cheers,

ESer.org

 

who cares!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

 

Send this one to the NEXT bus…

 

It seems to be rather organized and will give more of an in-depth search so the users can discover more information related to what they are searching. I think people would be surprised. I like the idea.

 

latent dirichlet allocation

Code for LDA (latent Dirichlet allocation) has been freely available to the general public for the last 2 years or so; see the bottom of this page for download info.

 
 

ESer.org said…
However, I do not think that the best application for LDA, PLSI, or SVD is web search.

LDA, PLSI, SVD, ICA, SDD, NNMF (non-negative matrix factorisation), LLE (locally linear embedding) and so forth are content-based, i.e., they use the word-by-document frequency matrix to find similarity. That makes them all different from Google PageRank, which is link-based (hubs & authorities of pages). These 2 types of algorithms are based on different theoretical foundations. However, the dimensionality reduction algorithms (LDA, PLSI, SVD, NNMF, SDD, LLE) can be used in conjunction with a link-based algorithm such as PageRank to enhance web search results (i.e., make them more relevant). Google is reported to have adopted LSI into its search engine to work hand in hand with PageRank.
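To make the “in conjunction” point concrete, here is a rough sketch in plain Python of blending a content-based score with a link-based one. How Google actually combines such signals is not public, so everything here is an assumption for illustration: the four-page link graph, the content scores (imagine them coming from an LSI-style similarity to the query), the linear blend and its `weight` parameter are all made up.

```python
def pagerank(links, damping=0.85, iters=100):
    """Basic PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:
                # Dangling page with no outgoing links: spread its rank evenly.
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

def blended_ranking(content_score, link_score, weight=0.5):
    """Order pages by weight * content + (1 - weight) * normalised link score.

    The linear blend is a hypothetical choice made for this sketch.
    """
    top = max(link_score.values())
    combined = {
        p: weight * content_score.get(p, 0.0) + (1.0 - weight) * link_score[p] / top
        for p in link_score
    }
    return sorted(combined, key=combined.get, reverse=True)

# Hypothetical link graph: page "c" is linked to by every other page.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
# Hypothetical content-based relevance of each page to some query.
content = {"a": 0.2, "b": 0.9, "c": 0.3, "d": 0.8}

ranks = pagerank(links)
print("link-only order:   ", blended_ranking(content, ranks, weight=0.0))
print("content-only order:", blended_ranking(content, ranks, weight=1.0))
print("blended order:     ", blended_ranking(content, ranks, weight=0.5))
```

With `weight=0.0` the order is driven purely by links, so the heavily linked page “c” comes first. With `weight=1.0` it is driven purely by content, so “b” comes first. Anything in between trades the two signals off against each other, which is the sense in which the two families of algorithm can complement one another.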

 
 

Are they using Calais for this? Not seeing much of a point here.

 

Parsec: Who doesn’t?

 

I think it’s pretty cool, and I think the technology has to be pretty deep to achieve these types of results. I went to their Google page and clicked on the acquiring link. It shows me a list of companies that Google has acquired or is talking about buying, like YouTube, Friendster, Pyra Labs. It’s actually pretty cool, because I can see the latest info on these acquisitions. They have to be doing something pretty good here, since they have to know those things are companies and then that they are being bought. I think it’s totally worth playing with.

 

As a non-techie, non-programmer/developer, but as a passionate web surfer and reader I am interested in content, connections and information and I have found EVRI quite interesting and fun. Just launched and admittedly limited so far, but I still had a fine time clicking on names, finding connections, reading content and just wandering around. I like the “most popular people/places/things” and the “Rising/Falling” elements. There is some real value, I think, in an approach that facilitates reading and “searching” and following connections that might not otherwise be discovered. I like it and hope to see and try more. Nice job.

 
