Build Your Own Semantic Search Engine With Hakia’s APIs
by Erick Schonfeld on June 19, 2008

Want to create your own semantic search engine, but just don’t have the PhDs? Semantic search engine Hakia is opening up APIs to let anyone build their own semantic search application on top of its technology. Hakia looks at the meaning of Web pages and matches search queries using its own quality index of trusted, relevant sites.

Hakia, which competes with PowerSet and TextWise in semantic search, hopes to spread its technology by enticing developers to adopt it. The first partner is mobile app provider Berggi, which created a semantic mobile search app for AT&T and Sprint phones. I downloaded it to my AT&T Blackberry, but the application froze up when I tried to enter a search term. Maybe it’s my phone.

Under the terms of the API, developers and startups will get 30,000 searches per day for free, and anything above that will require a license. The different types of applications the API can be used for include:

Web Search
News Search
Vertical Search (such as health or energy)
Text Summarizer
Text Categorizer
Text Characterizer (“Identifies and expands descriptive phrases or tags. Ideal for SEM”)
Text Meaning Representation
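
For developers planning around the free tier, here is a minimal sketch of a client-side counter that tracks the 30,000-searches-per-day allowance mentioned above. It does not call Hakia at all; the class name `DailyQuota` and its methods are hypothetical, and how Hakia actually meters or resets usage is not documented in this post, so the calendar-day reset is an assumption.

```python
from datetime import date

FREE_DAILY_SEARCHES = 30_000  # free tier stated in the article


class DailyQuota:
    """Hypothetical local counter; not part of any Hakia library."""

    def __init__(self, limit=FREE_DAILY_SEARCHES):
        self.limit = limit
        self.day = None
        self.used = 0

    def try_consume(self, today=None):
        """Record one search; return False once today's allowance is spent."""
        today = today or date.today()
        if today != self.day:
            # Assumption: the allowance resets once per calendar day.
            self.day = today
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True

    def remaining(self):
        """Searches left for the most recently seen day."""
        return self.limit - self.used
```

An app would call `try_consume()` before each search and fall back to something else (or a licensed plan) once it returns `False`.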

The only problem is that the link the company provides to get more information about its Web syndication services takes you to a sign-up page for Club Hakia. And once you register, the entry for Syndication Services says “Coming Soon.” Lame. Update: That link is now live and takes you here, but you still need to sign up. And there is more info now on the open Web at the Hakia blog.

If anyone manages to actually try this out, please let us know what you think in comments.

meaning representation of a given text block, suitable for core technology development

Comments

Typo in title: “Semenatic”

 

“Want to create your own semantic search engine”

And give the click through/landing page stats to another company, and free link backs for signups to their service then pay them to do so???

um… no.

I would try it out though, if I was paid by the hour to do so.

 

Sorry Erick but I have to pick you up on the misspelling in the headline. ‘Semenatic’. That’ll be some sort of search engine for impotent couples looking for suitable ‘candidates’ for their upcoming IVF treatment.

 

@3, ah hahahaha, semen, like sperm right?

ahahahah. I GET it.

 

Wow neat I think I will adopt this technology for my website. Cool I love new ideas.

 

With the emphasis becoming the “vertical” over the more general portal experience, this company sounds like it has some big opportunity for success. Can this engine also search and identify rich media - or is it only text? Entrepreneurs check out - I read it a couple weeks ago. Great stuff! http://www.readtheanswer.com/index.php?rta=blog

 

Hi Erick,

OpenCalais by Reuters is also a great service which I’ve been trying out lately. The API takes any text body up to 100,000 chars and returns contextual tags like places, people, companies, technologies and even medical conditions! You can post 4 requests per second or 40,000 per day. I haven’t had a chance to use Hakia yet but it would be interesting to see how they compare.
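
As a rough sketch of what those limits mean in practice, the snippet below splits long text into pieces that fit the 100,000-character limit and estimates the minimum time a batch needs at 4 requests per second, refusing batches that would pass 40,000 requests in a day. The three numbers come from the comment above; the function names are hypothetical and no OpenCalais call is made.

```python
MAX_CHARS = 100_000   # per-request text limit quoted above
MAX_PER_SECOND = 4    # request rate quoted above
MAX_PER_DAY = 40_000  # daily cap quoted above


def split_text(text, max_chars=MAX_CHARS):
    """Split text into consecutive pieces no longer than max_chars."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def plan_requests(texts, already_sent_today=0):
    """Return (chunks, min_seconds) for a batch of documents.

    Raises ValueError if the batch would exceed the daily cap.
    """
    chunks = [piece for t in texts for piece in split_text(t)]
    if already_sent_today + len(chunks) > MAX_PER_DAY:
        raise ValueError("batch would exceed the daily request cap")
    # At 4 requests per second, n requests need at least n / 4 seconds.
    min_seconds = len(chunks) / MAX_PER_SECOND
    return chunks, min_seconds
```

The split is naive and can cut a sentence in half, which would likely hurt tagging quality; splitting on paragraph boundaries would be a sensible refinement.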

 
 

@6,

http://www.flickr.com/photos/8.....3/sizes/o/

Here is a screenshot to answer all your questions.

There is no automated signup for the API service; they have to “manually” approve you. This is a very bad sign in my opinion, to the point that I wonder if it really exists. Just a hunch from years of dealing with startups.

 

Thanks for catching the typo. Wordpress really needs to have its spell check work in titles, or I need to learn how to spell :)

 


 

Is semantic search actually going to take off any time soon? I’m not so sure it really is.

I also posted a recent article on Crenk about Quintura and their service. http://crenk.com/quintura-visu.....tag-cloud/

 

TECHCRUNCH NEEDS A SEARCH OF OLD ARTICLES. NOTHING AS FANCY AS THIS, BUT WHY NOT.

 

This kind of stuff is really neat - it’s good science. But when it comes down to users, the stuff just isn’t useful. Nobody cares that you extracted the named entity “San Francisco” from a piece of text. It simply doesn’t get the user anything.

 

“Not only is talking hands-free on your wireless phone a good idea for safe driving, starting July 1st, 2008 – if you’re 18 or older – California law requires it.* With this new law fast approaching, now’s the perfect time to pick up a hands-free accessory.

Follow the law all the way to your local AT&T store today. ”

I hope you guys cover this here next week.

 

he never has anything useful to say about the topic at hand and he’s usually pimping some new brilliant scheme of his.

That being said, I’m not sold on semantic search yet, but I definitely think tools like this offer a lot of value to apps other than search, because so much of what we do on the web could be enhanced through semantic processing.

 
 

I’m not pimping anything. There are 2 new laws coming to California that affect IT. One is a ban on cell phone use in cars, and the other is a tax on a huge industry in California, pornography.

I am just reminding them that these stories are important in case they didn’t know about them.

 

So, based on an attempt to read their documentation, I can (if approved) drive traffic to the Hakia search engine up to 30,000 times per day. Sometime in the future they may let me do some other stuff like term extraction. I’m allowed to do unlimited lookups of cartoons.

Sign me up.

I’ve been experimenting with Open Calais from Reuters. It seems to actually work. I can use it 40,000 times per day and more if I need it. It does named term extraction (though they call it something else) really well. Then I can build stuff with the terms myself rather than driving traffic to Hakia.

 

Clarification from hakia:
I want to thank Erick and those who have come to hakia to request API keys. We’ve had a very positive response, and are pleased to see there’s so much interest. I would like to clarify a few points though. First, anyone can have access to the API. We are asking prospective users to request a key in order to protect the service – not to filter anyone out. Also, the “coming soon” link has been replaced and is now fully operational at http://club.hakia.com/synd.aspx. Visitors can now directly generate hakia Search Box code to place on their site, or do live tests of the Web Services API. The test page includes the ability to generate general and vertical search results and perform page summarization by entering large text blocks or URLs. – Tim McGuinness, Search Services, hakia

 

Yeah, it is pretty wide-ranging when you’re pulling search terms from text. You can get into so many articles that really have nothing to do with what you were really searching for.

 

I’ve tried out the summarizer portion of the API (since that’s the most useful to me) and it appears to work fairly well. From the few websites I’ve tried, it gets tripped up slightly by menu bar junk and non-pertinent text, but not nearly as much as I expected. The documentation is still a bit lightweight, so you have to guess a few things to get the full range of functionality out of it.

 

If you want to search through any type of articles/documents, then clustering of search results by searchblox or vivisimo is better than viewing through a semantic search engine.

 
 
