CrunchBase Now Has An API, So Grab Our Data
by Henry Work on July 15, 2008

Today we’re excited to announce a free, open, and easily accessible API for all data included in CrunchBase, our tech company database. It is available immediately to all developers.

Since we relaunched the property five months ago, we’ve focused on accumulating and structuring the world’s most useful data about technology. And we’ve worked to make this data available in a variety of ways. For example, we’ve aggregated funding rounds and acquisitions, and we’ve built out maps and advanced search. This next step - opening up our data completely so that anyone can use it however they want - is not only logical but central to our mission as well.

The CrunchBase API is read-only and uses JSON for output. There are no developer accounts to sign up for and no throttling of requests. Just point your browser to a special URL like this one (or curl it from the command line), and you’ll receive pretty-printed JSON of all the data found on a normal page. The data includes company descriptions, geocoded office locations, acquisitions, executive boards, competitors, and much more.

API requests also include the permalinks (URLs) of other entities on CrunchBase that can be used to navigate across the database. While the API is released in beta, we’re versioning it so that integrations won’t break when we make changes.
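
For readers who want to try this from a script rather than a browser, here is a minimal Python sketch. The URL pattern comes from the example quoted in the comments (http://api.crunchbase.com/v/1/company/yahoo.js). Everything else is an assumption made for illustration: the helper names, the idea that permalinks sit under a key literally called "permalink", and the sample payload, which is invented and not a real API response.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Versioned API root, taken from the example URL quoted in the comments.
API_ROOT = "http://api.crunchbase.com/v/1"


def entity_url(namespace: str, permalink: str) -> str:
    """Build the URL for one entity, e.g. ("company", "yahoo")."""
    return f"{API_ROOT}/{quote(namespace, safe='')}/{quote(permalink, safe='')}.js"


def fetch_entity(namespace: str, permalink: str) -> dict:
    """Fetch and decode one entity. No account or API key is involved."""
    with urlopen(entity_url(namespace, permalink)) as response:
        return json.load(response)


def collect_permalinks(node) -> list:
    """Walk a decoded response and gather every "permalink" string in it.

    Assumption: permalinks appear under a key named "permalink".
    """
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "permalink" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_permalinks(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_permalinks(item))
    return found


# Hypothetical payload, shaped only to exercise the helper above.
SAMPLE = json.loads(
    '{"name": "Yahoo!", "permalink": "yahoo",'
    ' "competitors": [{"permalink": "google"}, {"permalink": "microsoft"}]}'
)
```

Feeding each collected permalink back into entity_url is one way to hop from one entity to the next across the database.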

We are still finalizing our data policies and terms of use, but we’ll be publishing all content under the Creative Commons Attribution License or something very similar, which means third parties are free to use it with attribution and a link back to CrunchBase. We’ve also made available some CrunchBase images (logos, iPhone icons, etc.) for those who want to acknowledge us more visibly.

We’re really excited to see how the API gets used, so please let us know what you come up with.

You can follow CrunchBase on Twitter for updates about the API and other features (or just to give us a shout-out). We’ve also started a CrunchBase Blog where we’ll post regularly. And if you hate unstructured data as much as we do, come join us: we’re looking for a talented Ruby hacker.

Update: We’ve implemented several new features for the API: see the CrunchBase blog post here.

Comments

Techclusive.com is now going to JS company info widgets!!!

Gimme about 24 hours on this one. I have some serious code to do first.

I’m afraid since there is no retention policy, I’m going to cache the data in our database for quick re-entry.

Thanks for not making any rules on the use of this API. You guys are swell.

 

Does the API provide data on the latest entrepreneurial uses of ProVigil?

 

“which means third parties are free to use it with attribution and a link back to CrunchBase.”

Oh, no way. I’m going to just manually enter the data in the database by hand from several sources as originally planned. Oh, that just kills it right there.

 

How do you plan on enforcing that when you are letting people empty your database via JSON?

Just wondering.

 
 

EditGrid will be launching a data functions directory tomorrow. Crunchbase data included.

A preview here: http://www-staging.editgrid.com/data#11

 

Any chance of sharing the code for the TC widget? We’d be happy to include it, using the API, on our startup-related blogs :)

 
 

Jeremy - you can grab code for the widget here:

http://www.crunchbase.com/widget

Just enter in the companies, people, etc you want to display and it’ll spit it out. That tool is found at the bottom of every page (for future reference).

 

Techlusive, you are everything that is wrong with this industry right now. You blatantly don’t care about the community, you’re an active spammer, and you only care about rules when there is some way to sue you to comply. Get a life. Hats off to you, TechCrunch, this is putting your money where your mouth is.

 

techlusive - the idea here isn’t to police things, it’s to build awareness of startups. If people are wholesale downloading data and publishing it without attribution we’ll likely send them nasty emails and point out how lame they are.

 

I love you henry work! <3

 

Great job nice work folks.

Is there a plan in the pipeline to create an API for adding/modifying companies/people? Is it read-only?

I am hoping this will be free down the lane too.

Cheers, Nag

 

Mike - thanks, sorry for not checking crunchbase myself :)

 

Nag - the write API is coming shortly. I’ve asked the team for an AIR app using the API to write to crunchbase since I add so much data, for example.

 

This is very cool, and has great mashup potential

Looking at the API, I don’t see a way to search or get a listing of items in the database - is that coming as well?

 

What is the big difference between this API and interfacing with an RSS formatted page?

 

Is the CrunchBase vision to catalog information on technology companies only? I love the idea of CrunchBase, and would love to use for reference links, but I feel limited if it’s just technology companies. Any plans for expanding?

 

Greatest TechCrunch post in a while. This is the way to make a developer’s day, this being a blog that always talks about unlocking user data. Now we need more Provigil :)

 

Great stuff for that CrunchBase API!

You mentioned the syndication of acquisition and funding rounds.. it’d be nice to have an RSS feed for those 2 pages…

Cheers :)

 

Michael - yeah those may be coming in the future, we’ll see how this first feature does first.

 

Rami - We’ll be using RSS for more change- and time-oriented feeds in the future; JSON works well for complex structured data.

Amit - We’re starting with the tech industry because that’s what we know best. There’s actually a lot of room for us to expand into within the tech sphere, in addition to the space outside of it.

Matt - “Data is meant to be free”

Martin - Good idea, and thanks!

 

Nice work. It is awesome TechCrunch is sharing its data. :)

 

nice… very nice..

we will have a use for startups data.. for people interested in startups on our end.

 

Michael / Henry / Mark,

Is there any chance that we can get a URL or domain search? So instead of

http://api.crunchbase.com/v/1/company/yahoo.js

I could do:

http://api.crunchbase.com/v/1/domain/yahoo.com.js

I would love to be able to get some of this data into Twitturly. I could see it not only providing some useful info to Twitturly users, but also allowing us to use it for some behind the scenes stuff.

 

This sounds great! I do have a use in my mind! :)

 

It would be nice to have an RDF serialization as well as JSON. :)

 

Good deal. Looks promising, guys.

One feature request I think would be easy to implement and would add some great value is search by proximity. Input a set of coordinates (lat/lng) and search for the nearest 5,10,15,20 companies. Could easily throw this into a map mashup locator (i.e., search for nearest companies to XYZ address).

I am happy to help out if you need any help. The distance calculation is pretty easy and it can be done in MySQL or Postgres (assume you’re using one of those), allowing you to easily sort ‘by distance’.

Cheers.
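
For readers wondering what that distance calculation looks like: a rough Python sketch of the idea, using the haversine great-circle formula. The commenter suggests doing this in SQL, so this is the same math in a different place, not anything CrunchBase is known to run. The "lat" and "lng" field names and the office records are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometers between two lat/lng points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 so rounding error can never push asin out of its domain.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def nearest(offices, lat, lng, limit=5):
    """Return the `limit` offices closest to (lat, lng), nearest first."""
    return sorted(
        offices,
        key=lambda office: haversine_km(lat, lng, office["lat"], office["lng"]),
    )[:limit]


# Hypothetical office records, for illustration only.
OFFICES = [
    {"name": "far", "lat": 0.0, "lng": 90.0},
    {"name": "near", "lat": 0.0, "lng": 1.0},
    {"name": "mid", "lat": 0.0, "lng": 10.0},
]
```

Sorting every record in application code is fine for a few thousand offices; a larger dataset would want the bounding-box or database-side approach the commenter has in mind.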

 

one thing we’re trying to do as well is integrate directly with freebase and hand data back and forth.

 

Crunchbase has the potential of being a multi million dollar company on its own. Mike, I see the vision behind it :)

 

Great work guys! One question/request - is there a way to get a callback? Cross-domain JSON requests from the browser require them.

Flash also will need a crossdomain.xml file.

 

Nice! I second Joel’s request on being able to query by URL
ex. http://api.crunchbase.com/v/1/domain/yahoo.com.js

 

Mark, I assume Crunchbase will still publish trackback links from apps linking back, correct?

 

This will literally take Tradevibes out of business. Good job Techcrunch. There is a lot of potential in this area.

 

This is what an API is supposed to be guys.

 

Exciting news! Great job guys :-)

 

Re domains: yeah we’re considering search in general, so /v/1/search.js?keyword=techcrunch and perhaps /v/1/search.js?domain=techcrunch.com or something like that.

Re callbacks: yeah we definitely want to get callbacks in there with a callback query param. It’s another feature we’ll probably phase in as the API demonstrates stability. I’ll look into the Flash situation, hadn’t thought about that.
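
For anyone unfamiliar with the callback trick being discussed: the server wraps its JSON in a call to a function named by the client, so the response can be loaded through a script tag across domains. A minimal sketch of that wrapping step follows, in Python. The function name, the validation rule, and the payload are all assumptions for illustration; the thread does not say how CrunchBase would implement it.

```python
import json
import re

# Accept only plain, optionally dotted JavaScript identifiers, so the callback
# name cannot be used to smuggle arbitrary script into the response.
_CALLBACK_NAME = re.compile(
    r"^[A-Za-z_$][A-Za-z0-9_$]*(\.[A-Za-z_$][A-Za-z0-9_$]*)*$"
)


def wrap_jsonp(payload, callback):
    """Wrap a JSON-serializable payload in a call to `callback`."""
    if not _CALLBACK_NAME.match(callback):
        raise ValueError(f"unsafe callback name: {callback!r}")
    return f"{callback}({json.dumps(payload)});"
```

A browser page would then define a function with the requested name and receive the decoded data as its argument.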

 

Great work on the API, guys. Simple, clean — it works!

Mark McGranaghan: I’d like to put in another vote for the domain based search.

This would allow me to add CrunchBase into http://www.WebsiteGrader.com (profiled on TechCrunch). It would be super cool to have this feature for our users.

 

This is a weapons-grade API and it makes me tingle inside!

How will you guys deal with misbehaving applications? I understand the zero-signup deal, but how will you contact people running buggy applications that accidentally hammer the server?

 

@AW: block them? just like we do with users on the website

why do things change so much as soon as you move from XHTML output to JSON?

 

@40:

Because developers expect error messages + etc when something goes wrong, not just a simple CONNECTION REJECTED message that leaves them pulling their hair out. :)

 

How up to date is the data? If this is a read only api, nobody outside of techcrunch can make changes to the information. Without the update api, the information is static. e.g. Number of employees in a startup can go up and down very fast.

 

StartupWarrior ( http://startupwarrior.com ) is a CrunchBase mashup showing all of the startups on a single map video.

 

@41: it’s all in HTTP response codes

 

@techmine: you can use the form submit until it’s formalized as an API. there is a moderation process at the moment regardless

 

uh, what would be nicer is if you simply release the code used to create crunchbase for the dev community (or others) to build out new specialty sites and you can aggregate it ALL - like an “elgg.org explode module” for crunchbase…

 

Henry, Congratulations on the CrunchBase API. We will look into pulling the data into our upcoming RivalMap release so that our users can merge it with their own private company research and other data services.

 

you may want to post a crossdomain.xml file, which would give both flash & silverlight the go-ahead to make client-side http calls

(see the end of http://msdn.microsoft.com/en-us/library/cc197955(VS.95).aspx for an example crossdomain file)

 

Great to see CrunchBase opening up!

+1 for an RDF serialisation.

Really, really nice would be associations to the linked data cloud (e.g. links to companies in DBpedia).

 
 
