Swivel co-founders Dmitry Dimov and Brian Mulloy start off by describing their company as “YouTube for Data.” That’s a good start for someone trying to understand it, because the site allows users to upload data - any data - and display it to other users visually. The number of page views your website generates. Or a stock price over time. Weather data. Commodity prices. The number of Bald Eagles in Washington state. Whatever. Uploaded data can be rated, commented on and bookmarked by other users, helping to sort the interesting (and accurate) wheat from the chaff. And graphs of data can be embedded into websites. So it is in fact a bit like a YouTube for Data.
But then the real fun begins. You and other users can then compare that data to other data sets to find possible correlations (or lack thereof). Compare gas prices to presidential approval ratings or UFO sightings to iPod sales. Track your page views against weather reports in Silicon Valley. See if something interesting occurs.
And better yet, Swivel will be automatically comparing your data to other data sets in the background, suggesting possible correlations to you that you may never have noticed.
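Swivel has not said how this background comparison works, so the following is only a minimal sketch of the general idea, not the company's method. It assumes every data set is a list of numbers sampled at the same points in time, computes a Pearson correlation for each pair, and keeps the pairs above a cutoff. The function names, the 0.8 threshold and the sample numbers are all made up for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def suggest_correlations(datasets, threshold=0.8):
    """Compare every pair of data sets and return the strongly correlated ones.

    `datasets` maps a name to a list of numbers. The threshold is an
    arbitrary, hypothetical cutoff chosen for this sketch.
    """
    suggestions = []
    for (name_a, a), (name_b, b) in combinations(datasets.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            suggestions.append((name_a, name_b, r))
    # Strongest relationships first, whether positive or negative.
    suggestions.sort(key=lambda item: abs(item[2]), reverse=True)
    return suggestions


# Invented sample data, six observations per series.
datasets = {
    "gas_price": [2.1, 2.3, 2.6, 2.9, 3.0, 2.8],
    "approval_rating": [52, 50, 46, 42, 41, 44],
    "ufo_sightings": [12, 7, 15, 9, 11, 8],
}

suggestions = suggest_correlations(datasets)
for name_a, name_b, r in suggestions:
    print(f"{name_a} vs {name_b}: r = {r:.2f}")
```

Comparing every pair grows quadratically with the number of data sets, which is presumably one reason a real service would need serious computing power behind it.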
Academic types are going to go nuts over this. I spent a summer in college running regression analysis models on economic data. Being able to simply upload data to Swivel and then begin to slice and dice the data would have saved a lot of time. And being able to compare our data to what others were doing in related fields could have yielded results that we would never have aimed for. Big companies, small companies, think tanks and non-classified government organizations are going to be similarly dazzled.
Swivel is putting significant computing power behind the scenes to run the data analysis. “We use farms of powerful computers and algorithms at the Swivel data centers to transform a lonely grid of numbers and letters into hundreds - sometimes thousands - of graphs that can be explored and compared with any other public data in Swivel.”
Not all data will be public. The company’s business model is to provide the service for free for public data, and charge a fee for data that is kept private. Private data can still be compared by the owner to public data sets.
Look for Swivel to launch later this week after a year of quiet development. The company is based in San Francisco and is part of Minor Ventures.
Exclusive screen shots below:

This will also be the Internet Archive for Data (rather than just YouTube)
it will be a valuable resource for future generations seeking snapshots of our cultural trends
you’re right. I’m changing the post and title.
I think the idea is brilliant. If the execution is as good as I’m hoping, this could really lead to some phenomenal analysis. Stock forecasting, climate change modeling, even diet analysis…
Brilliant.
What a great idea. Love it.
Mike, thanks for the great article. Our investors say, ‘YouTube for data.’
we’re just obsessed with making data exploration fun and insightful. But mostly fun.
Execution, as Andrew Ritchie said, is what keeps us up late at night.
We’re hoping folks who are data freaks help us steer Swivel into something valuable.
-b
How about Google doing the same business?
http://www.ezecho.com
this is great, I think the way youtube implemented its embedding has opened a lot of people’s eyes.
Fantastic.
For everyone who ever read the wonderful “Augustine’s Laws”, Swivel is the digital incarnation.
ref for book:
http://www.amazon.com/Augustin.....1563472406
The moment I read this post the idea looked brilliant to me too, as it has to most of you. But as somebody also rightly pointed out above, execution and its application are an important thing.
Would like to hear some thoughts on the following possible ‘breakers’ that I see,
1. Most data is typically tied to the ‘environment’ in which it is collected but uploads on this archive will be devoid of this environment. Without this critical piece of the puzzle how valid would the patterns that emerge be - how useful would they be?
2. Data, whatever it might be, will have a zillion ways of representing it - both conceptually (graphs, tables, figures, plain text) and in computer formats. In my belief, trying to extract any form of patterns must of course nullify the difference in formats but not spread across conceptual data forms.
3. Assuming that this idea really takes off in a big way, people might start basing their ‘decisions and choices’ on this huge internet archive and pattern generator. Is this necessarily a good thing to happen? I am doubtful. Yes, the collective intelligence of lots of people is probably behind my decision, but my creativity is definitely lost!
okay, i’ll start the over/under bidding on this one… i say Google buys it in less than 6 months. i can visualize Larry drooling over this one already
Vote for Google acquisition on PollDaddy
hey Mike: do you allow polls in your comments? let’s see…
Vote the Over/Under on Swivel acquisition by Google
“Academic types are going to go nuts over this” and “Big companies, small companies, thinktanks and non-classified government organizations are going to be similarly dazzled”, are very big statements to make.
As a college student I know that none of my teachers will accept Wikipedia as a valid academic source and I don’t think they would allow this either.
It would seem that this data could easily be skewed by someone or some group with enough time and resources and some end in mind that promotes their own message.
Very interesting Read.
http://www.tekno-world.blogspot.com
This has potential for abuse by ideologues on the Left and the Right. From the Climate Change alarmists to the theory of Evolution deniers, there will be a very strong incentive to load this thing up with bogus data engineered to “prove” whatever pet ideologically-driven point the person wants to ram down the public throat.
Or maybe it will naturally balance out for the most part, in the way Wikipedia seems to be doing for the most part (politically sensitive topics aside).
It sounds like a brilliant idea and a huge challenge.
This is one of these times when I find myself thinking “gosh, I hope they make it!”. Not to get over excited, but the need is clear, the potential is huge, and the screenshots look slick.
Can’t wait
Mike, thank you for the excellent article. We’re getting real close now to taking the wraps off, and seeing all the great comments gets our cylinders firing. We do have a bunch of things to worry about, as Steve points out, but we also know that the folks like you who are passionate about data will help us get it right and bubble the good stuff to the top and weed out junk. We’ll see soon enough. Now back to working the final kinks out!
I don’t understand what you guys are going ga-ga about..
1) If I wanted to upload my company specific private data, I would be very cautious and hesitant about it — just like I am with google.
2) Those academics who would be interested in slicing and dicing the data — they already use SAS software to get what they want to….
The 3rd comment is absolutely hilarious — you might as well use them for the genome project..
Looks like one day we are in for a comprehensive database for Earth and its various data and correlations. A complete fit for being called a collaborative web 2.0 Application.
Bravo
http://www.tekno-world.blogspot.com
Wow. That is very cool. At first when I was reading along it sounded a lot like archive.org, but then once I read “the fun part” I started to think of the possibilities. The funny thing about this is that now all those conspiracy theorists have a great place to go to come up with new conspiracies.
Imagine if you can make a correlation between UFO sightings and the number of bloggers in the area. I can see it now…
“Bloggers are really aliens!”
“Bloggers and aliens plan to take over the world!”
This is great. Sure there will be issues in execution, ensuring data quality, data privacy issues, and more. But this will certainly help in making everyone more data aware (read Data IQ!).
Yes, if it comes out as advertised, it’s going to be bigger than YouTube and certainly more valuable in improving productivity…
Two thoughts on this:
1) I spent a great deal of time in college (and after college) studying and implementing data mining systems. There is a lot more to effective data mining than finding simple correlations between data sets. Yes, your website’s traffic could have spiked significantly every time the Denver Broncos covered the point spread on the road…but that doesn’t mean that the two data sets are meaningfully related. This WILL be an amazing tool for amateur data analysis and should provide a lot of interesting results to small and medium sized businesses.
2) I would be stunned if this has nearly the popular appeal of youtube. Uploading videos is something simple and fun…any one with basic video editing skills and a camera can upload videos, and anyone can watch. Data analysis doesn’t seem to have quite the same adoption. I think a lot of people will be able to come up with funny/interesting data sets and correlations, but I don’t know if it will have quite the viral appeal.
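The caution in point 1) can be shown with a small sketch. Two series that both simply grow over time will look strongly correlated even when nothing connects them. One common sanity check, used here as an illustration and not as anything Swivel has announced, is to correlate the period-to-period changes instead of the raw levels. All numbers and names below are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def first_differences(series):
    """Change from each observation to the next, which removes a steady trend."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical weekly observations: both series happen to trend upward.
page_views = [100, 112, 119, 133, 138, 152, 159, 171]
eagle_count = [20, 23, 24, 27, 30, 31, 34, 35]

# Correlation of the raw levels is dominated by the shared upward trend.
level_r = pearson(page_views, eagle_count)

# Correlation of the week-to-week changes asks whether the series
# actually move together once the trend is taken out.
diff_r = pearson(first_differences(page_views), first_differences(eagle_count))

print(f"levels:  r = {level_r:.2f}")
print(f"changes: r = {diff_r:.2f}")
```

With this sample data the levels correlate very strongly while the changes barely correlate at all, which is the gap between "these graphs look alike" and "these things are related".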
Dr. Phil,
I think you’re being too restrictive on who this would appeal to. The emphasis is on the sharing and community aspect of it - not just academics and SMBs. Just imagine what happens when people on MySpace or their personal blogs want to post embedded data on their page. I could see politically active people from both ideologies doing this a lot (as stated above), or sports people wanting to keep track of their own stats (like in high schools), hobbyists tracking things like bird sightings - more people use graphed data than just companies and academics.
As far as abuse goes, that comes with the territory on public and community-driven sites. The comparison to youtube is very accurate - you can opinionate movies just like you opinionate data.
I’m very interested in the slicing and dicing of data - the cross sections could range from interesting to hilarious.
Swivel should be a feature of a web based spreadsheet app, not its own product, or its own company.
For me (a PhD student working in bioinformatics) this sounds potentially very interesting. It would be useful to share data for collaborative research, especially if they make it easy to access the data via APIs. I could even release alongside a publication the means to fully reproduce the calculations via this site. Others could quickly build on the publication with access to the data and analysis. I know this can all be done already locally but this would make it much easier.
Are you sure this kind of app won’t become sentient and take over the world?
Hi Hashim,
We kind of agree with you. Spreadsheets are great for editing data. In fact, we mention on our Web site how inspired we were listening to a NerdTV podcast with Dan Bricklin. He boiled down his idea for the spreadsheet as a ‘_word processor_ for data.’ The problem we felt was that after we created a spreadsheet and wanted to switch to sharing and reading mode it wasn’t much fun. Yet you look at the rows and columns and realize there are only so many ways one can pivot and fold this data into graphs. Why isn’t there a Web site that jams out all the combos and then lets me cruise the data? In one sense that’s what Swivel is, the read mode of a spreadsheet. Swivel is a _Web site_ for data.
WOW! the implications for research are astounding…if the founders are reading this, i’d love to add one critical request (though it might already be in there) - user driven reviews of data integrity and reliability (e.g. ‘this data set has been marked ‘riddled with errors’ by 42 users)…likewise, will there be indications that the data set has come from an authority or expert resource (a la google coop)?
Like Michael Crichton says: “Science by consensus is not science”. Same goes for data analysis. While this looks like it might be fun and interesting, I would be wary of anyone drawing any concrete conclusions from any data analysis done by a community of virtual amateurs. Hell we can’t even come up with definitive conclusions on climate even though, “lots of scientists agree”. Same will go for any conclusions attempted to be drawn from this…uh….tool? What sort of things do we expect to get? “We compared the home run ratio of Barry Bonds to the voting record of Barack Osama, and as you can see, clearly there were no WMD in Iraq.”
I think it sounds wonderful! If it does half of what I hope it will Swivel will be hugely useful for amateur sportspeople and hobbyists quite apart from people who are into politics, environmentalism etc who will be the main market.
No, it’s not another YouTube but there definitely is interest in this sort of thing as the success of Freakonomics and The Tipping Point shows.
Hi Dave,
We’re reading it. Half asleep on our keyboards at this point, but still chasing bugs and checking out TechCrunch. It would be great to get your feedback on this. Here is our current thinking.
We want to make it easy to organize user-driven reviews into two buckets. One is subjective. On the site we represent this with thumbs up or down. The other is objective (or tries to be) and we _will_ represent this with meters.
Subjective stuff is applied to, say, a graph of baseball stats showing the Yankees with more World Series wins than any other team, authored by a guy named Patrick from South Boston. The Title: Yankees Suck! Now, millions of pinstripe fans need to give this graph a big thumbs down. While all the folks from Boston…and Detroit, Cleveland, Oakland…give it a thumbs up. That’s spin and needs to be expressed because…it’s fun.
‘Objective’ stuff will be a bit more involved and will include measures of data accuracy, thoroughness, and attribution. That same Yankees graph with all the subjective thumbs down is also very accurate and would have a high objective measure.
When we open the doors later this week the subjective stuff will be there, but for the objective parts we’re going to huddle with our founding community members in a few weeks (please join us, Dave) to figure out a way to do the accuracy stuff that engenders trust among data browsers and grows credibility for data uploaders.
Swivel could be the missing link for my latest project. Me and the Junior Burrito Analysts have been tallying price and weight data for our meals, and have been storing it in a shared Google Spreadsheet. As anyone who is intimately involved with Excel will tell you, its graphing engine is pretty ugly and is very poor at sharing charts online. Google Spreadsheets is better at sharing, but those screen grabs of Swivel look great!
Personally I don’t see Swivel having major implications in research or academics, and it certainly won’t touch serious analytics software. But for fans of Tufte, the ‘YouTube for data’ could be hours of fun.
To Hashim:
You are quite possibly correct.
In which case, it would make perfect sense to have a company that rapidly develops the graphing and explores the best functionality interface and then gets bought by Google, Salesforce or anybody else with massive data sets.
And in the meanwhile, it will allow for quirky consumer uses that being part of (say) Salesforce from the beginning would not allow for.
As an associate at a macroeconomic research firm in New York- I think this site could be awesome and very useful in the finance world. I don’t know if you know the subscription prices for data sources, but they are pretty high.
One question though- will all the data be historical, or will it be constantly updated manually? Will there be features for large, popular data sets that are updated, such as index levels and government releases? I assume stock prices and indices will be filtered in automatically- true?
The concept is really good. I only wish I could see it in action.
Well done. Another step in the right direction. Visualization and analysis are complex problems to crack and let’s hope they offer a service that does it well!
I like their model of “free if the data is public”. We’re planning something similar with Matson Systems, but not focused on visualization or analysis, but rather distributed organization and dataset building and maintenance for groups, communities and businesses.
Thank you, Joel:
“I would be stunned if this has nearly the popular appeal of youtube. Uploading videos is something simple and fun…any one with basic video editing skills and a camera can upload videos, and anyone can watch. Data analysis doesn’t seem to have quite the same adoption. I think a lot of people will be able to come up with funny/interesting data sets and correlations, but I don’t know if it will have quite the viral appeal.”
I agree. Are you available to help me set expectations with our investors?
I wish these guys the best of luck, and as a user of statistical/econometric software, I say the road is clear ahead: not only can you be the youtube of datasets: you can sell analytic capabilities such as high level regression or analysis capabilities (dynamic panel data with instrumental variables a la Arellano-Bover, anyone?).
These kinds of capabilities exist in very expensive software that you might use once or twice, even in a specialized environment. The situation is begging for a pay-as-you-use approach. I’m surprised STATA or MATLAB are not usable on the web on a pay-as-you-use basis.
Jonah, bring on the Quesadilla. Long live Tufte.
Excellent work - I can’t wait to see this launch. I have talked (much) more about this here:
http://saraewood.com/2006/12/05/swivel/
JJ Saenz, thanks for the well wishes.
I’ve talked about this on my blog as well at http://www.nonprofittechblog.org.
Swivel could probably help to crack the tough issues of transparency and accountability in the nonprofit sector.
To Swivel Team,
This is David from the EditGrid team. I believe there’s something we can explore together. We’ve an online spreadsheet which allows users to collaboratively edit their data.
Great idea, and I think labeling yourself as the YouTube of data is smart (you have to make data seem fun somehow). As everyone keeps saying, I do not believe this will become the mecca of serious data, but it has the potential for becoming the best for your everyday Joe (which is what really matters). What is really great about the model is the separation of private and public data. The public will be challenging, but will certainly build brand awareness. The private will be a very useful tool and I believe if priced right could support the project.
As for subjective vs. objective data, I imagined a system such as that as I first read the article. Without that in place before the word spreads about the site you could end up with egg on your face. Having a potential user’s first impression of Swivel being a graph that clearly shows without a doubt that “Defense spending affects the quality of American Idol contestants” could hurt more than help.
I would CLEARLY MARK each graph with not only if data is objective or subjective, but also with the level of validity in the statistics. Maybe have a “Validate” button by the data on the site and have the graphs say “This data has been Validated by XX users.” This might not help initial traffic because website owners would not go create a “More pageviews on this site leads to higher user income” graph, but would help the long term validity.
Matthew Leitz,
Great thoughts. Your warning is a good one. It’d be great to get your take after you take a look around.
Even Stephen Dubner of Freakonomics seems excited by this new service [http://www.freakonomics.com/blog/2006/12/05/a-youtube-for-data/].
Good luck gentlemen. Nice idea.
Brilliant! Death to SPSS!
A preemptive caution that I would like everyone to repeat after me:
Correlation is not causation.
most of today’s useful datasets are useful because they are huge. even if swivel would assent to paying the storage costs for multiterabyte datasets, how do you think i should import it?
what data formats do they accept? the data out there is dark matter partly because of the uncountable formats it is stored in.
The name is terrible. It’s part of an insult in English, isn’t it?
Actually, this seems to be the Myspace for Data. It goes further beyond Myspace and provides good tools and content for the user. The ideal Web 2.0 site.
http://mediavidea.blogspot.com.....thing.html
also, how does the inference work? you say you will draw correlations between disparate datasets…how? let’s say i upload data in foxbase format for the number of bald eagles i see outside my window.
first of all, can you read foxbase? now i labelled my columns “no. of bald sightings”. what do you plan to infer from this automatically? i predict that like most column labels, this will be more or less opaque to you. you could try a NLP approach but this likewise seems intractable for a start up.
hmmm, sorry, but even a cursory analysis of the problem you want to solve doesn’t seem to inspire hope, although some answers to these trivial issues would shed some light