Mozilla CEO John Lilly revealed more details of their stealth Data project today, which we first reported here.
In a blog post, he says “data is one of the most important pieces to facilitate understanding (and innovation), and is also one of the most under-explored areas of the modern web.” He also says that Mozilla has two early projects that touch on the idea - Spectator and Test Pilot.
The Data idea is much broader, however. “There are worlds of information about how people use the web that are locked up and not currently shared,” he says. By simply adding optional tracking software to Firefox code, much of that data could be unleashed. Mozilla’s goals with the Data project include:
- Collects & shares data in a way that embodies the user control & privacy options which are at Mozilla’s core.
- Enables everyone — from individual researchers and entrepreneurs (both the social and capitalist types) to the largest organizations in the world — to take usage data, mix it up, mash it up, derive insight, and hopefully share some of that insight with others.
- Helps move the conversation around data collection and web usage forward, to help consumers make more informed decisions.
As we said before, the project is still very early, has no name and Mozilla hasn’t “staffed it very much.” But the potential is huge. Tell them in the comments below and on Lilly’s blog how much you want this to happen.





now that’s a slippery slope…
any thoughts on what the reaction would have been if this was done by the #1 browser provider?
valleyguy - it’s opt-in. chill. don’t ruin this for everyone.
This could go a long way towards a truly semantic web, where people’s behavior validates the content and data. The biggest problem with semantics is that it’s easy to manipulate from the publisher’s side, but this could become a darn good ‘filter’ if the users’ behavior doesn’t match - talking about really improving search…
Peter
follow me @ http://twitter.com/peterurban
#1 - ValleyGuy - I trust Mozilla to respect my privacy.. the same doesn’t apply to the producer of the most popular browser..
OH YES.. (rubbing fingers) give me that DATA… screw Adwords… I need to market my new-crappy-product to this select-group-of-idiots who browse these stupid websites.
COP - this isn’t a list of people who go to sites. It’s aggregated anonymized traffic data.
Even if they pull it off the data will be extremely biased.
First of all, Mozilla users do not necessarily represent most web users. Lots of younger people and tech people use mozilla but most older people just use IE. Imagine how biased a measurement service of young people would be.
Secondly, opt-in will further bias it.
This will be another Alexa-like throw your finger up in the air and guesstimate traffic service. The only company that addresses bias issues well and scientifically is ComScore and it took years of significant R&D for them to get it right. Yes they are expensive but they are also the best source for unbiased, accurate, in-depth web usage data.
OK - if this is one-way-only communication (from browser=>web stat database.ORG) then it’s all good in the name of science and transparency (hell i equally hate Alexa and Compete numbers)
If in any scenario, it would require any form of identification from my side.. (username, emails or IP) i would be scared as hell.
I trust Mozilla and would participate in it, but I would also like to use the data to analyze it and figure out how people use their browsers most.
This would be amazing. Great, reasonably accurate, global, FREE data. Good grief. yes, please “free the data”. It’s been too long I’ve had to caveat and consider the flaws of current services.
On privacy - yes, a concern - but I trust Mozilla and think as long as data is anonymous, etc. - the benefits of having this type of access outweigh any concerns.
To the degree mozilla can incent/encourage a broad user base to participate would be great…don’t just want early adopter folks. If any demos/psychos are possible to layer on to the traffic info (optional data from users and anonymized) would be a nice bonus.
Collecting all the data you might want is fair, and fine. My guess is you’d do better by finding meaningful ways to scrape results from Digg-type sites, and public info from LinkedIn-type sites. This gives you semantics plus some inkling of values.
To borrow from another discipline: Neurologists and researchers need to know what goes on between neurons, but the doctors most people go to see look at much higher levels of physiology to determine health and provide valuable consulting to their patients.
In the case of the mondo-magoogly trackers out there, getting meaningful results from billions of searches and doc requests is more like focusing on interneuronal activity. Better for diagnosing certain problems, like routing, availability, and raw semantics, but not as useful in deriving social meaning, like ideas, preferences, and utility.
IOW, *Where* eyeballs point is much less informative than *why*.
We need both types of info, surely, but the higher-quality stuff is already being derived by clever sites organized around social networking and social bookmarking. I’d bet some yet-to-be-built site will do a more useful job of collecting meaningful information than could be gotten by stealth data tracking.
If I’m missing the major point of stealth data tracking, please weigh in.
If this can make better stats than Alexa can produce, GO FOR IT!!!
All this sounds nice.
I have questions: Will the data be free? Will they really be available at all? How will the metrics collection/methodology address widgets and other embeddable elements? How will the metrics collection/methodology address ad tags? What does ComScore think of all this?
To me, a ‘news story’ like this needs some reaction from the ComScore/nielsen folks.
CG
How about concentrating on making a less shitty browser. :/
Spooky! A non public corporation that has no repercussions collecting data on mortals.
Sounds like information mainly for valuing advertising and valuing internet companies.
Could be a good marketing angle for Microsoft in the future - “we respect your privacy, we don’t try to persuade you not to opt out of tracking (even if it is anonymized)”.
I agree reader… pretty darn bold for them to do that. Gone are my Firefox days. Back to good ol’ IE. Although IE8 seems really cool.
As much as Arrington thinks this is a godsend, it’s not.
As anyone with any sort of background in analytics or just general statistics can see, it’s almost identical to Alexa except it’s run by Mozilla.
Ok. If you want to know how the specific subset of Mozilla users who choose to opt into this program browse the web, great. This will be the free service for you.
If you want more, not so much.
Mozilla can be trusted, and hence so can its proposed data.
(As most of the traffic data analysis websites give a vague (not accurate) data analysis)
This is a (potentially) big deal. The naysayers above remind me of the Brendan Francis Behan quote: “Critics are like eunuchs in a harem…”
And props to MA for the profile in FastCompany. I had no idea you were such a big muckety muck!
@7 - firefox is not a fringe product. they are at a minimum of 20% penetration. higher in some foreign markets. it’s likely that globally they have 100 million+ regular users…and these are spread geographically, so you’ll be seeing a lot of sites you have never heard of in the stats (which is good).
The big question is how many users will opt in, and if the profile of the users who opt in is broad enough.
My initial reaction is that people who will opt in will be from the same group of users who sit in the valley, use twitter all the time (yes. me included). This group is definitely not representative of all the people.
That said, if there is one company that can actually get the “normal” users to opt in too, it’s probably Mozilla, so we can all cross fingers and wish them success.
The challenge of course will be to give users enough incentives to opt in…
I know it’s opt-in… but usually Firefox users don’t like to be spied on and use FF because it’s open source and therefore there is no danger for their privacy. By doing so Mozilla would do the same thing that we hate in the Google Toolbar when we ask for the “advanced” version.
Also remember the listing of queries done by AOL users that AOL was keeping in a really safe place but one day was on the web. And again, this kind of logging is very dangerous for everyone’s privacy: imagine a (very nasty) login page whose URL contained “login=bla&password=bla”.
Oh by the way, maybe they will do the same thing as Apple did with the Safari installation included with the iTunes upgrade: someday, the checkbox will be already checked when you upgrade FF and you will send all your browsing history to Mozilla. But hey, it was a young developer who did this, he did not know what he was doing, right?
However all this information would be very useful, but again, this would limit the stats to FF users, which is not (yet) everyone.
It could be useful as a representative sample of the type of user that installs a third party browser, which might or might not be just as good as being a representative sample of the internet universe. I’ll have to chew on that…
I don’t want to “ruin this for everyone”, so how about:
No, thank you.
?
I think this is pretty cool. People browse internet and contribute to shaping it, it makes sense that people should be able to get this information back for themselves, in aggregate. It will help everyone.
Except of course the moment one person makes a decision based on it will be the moment another person will start stuffing the ballots.
great idea for mozilla to continue differentiating itself as well - can’t imagine msft/ie leading the charge on this one
Here’s an interesting question — what does Google think about this? Fair question to ask, since Google is Mozilla’s ad-revenue benefactor.
http://slashdot.org/article.pl.....11/2036218
Through all the cookies laid down by AdSense and DoubleClick, Google has a uniquely complete view of Internet usage. Is democratization of that information at scale in Google’s interest?
Your collected data will represent a very very small group of ff users. So, in other words… FF will be classified as a false positive spyware application? jeje j/k
Jack #7 - I wouldn’t quite compare this to an Alexa which is an opt-in for a toolbar - and not a community user of a browser. As for managing a bias, I would argue that dealing with a browser bias is much more manageable than the bias of someone joining a tracked panel. Further, opt-in participation is largely driven by intent of data use.
I am assuming FF will be taking an open approach to aggregating data. One thing is for sure, data availability supports and fosters innovation - data coming from a black box does not.
MagNet is coming!
But Mozilla is on the right track -
Goodbye then Firefox, it was nice knowing you.
Thing is, this data is only really of any use to those who wish to mine it for monetary gain. Why would any normal web user wish to see what was the top website 5 minutes after a highly watched TV show? Or the web of browsing that explodes around a single tweet.
Any altruistic nature of “setting the data free” is therefore blown out of the water.
As others have pointed out, the data is very specific compared to the “stream of consciousness” that is gathered through the proliferation of google services on people’s websites rather than the browser that peers into them.
Also, gradual change can alter this to the point where it is similar to the nefariously omnipresent google toolbar - as a network admin, this application concerns me since many non-related applications attempt to stuff this down your throat (why is it packaged with cd burning software and iso mounting apps?).
What was once opt in gradually morphs into opt out and then into simply integrated.
We need to look for ways in which to fund development of the net and the many outbreaks of innovation without suckling at the teat of the advertiser and its trampling of the right to privacy.
Ask yourself, why should they know?
@ Paul Neto
Since when was Alexa opt-in? I thought the parasite got installed as default with IE.
If I recall correctly it is reported by Adaware and offered as something undesirable.
Maybe I’m wrong.
The more I hear of this, the more I think of Phorm et al.
With this capability, all it would take would be for an untrustworthy little git from the bottom of the marketing foodchain to buy into Mozilla and it would really be stuffed.
Would you really trust your sensitive data to be available to someone with a demonstrable track record in highly underhand activity, even if they’d promised most sincerely that they wouldn’t look at it?
About as much as you’d trust a paedophile to run a creche no matter how much he claimed to be reformed.
Yes Michael
I REALLY DO PROMISE I won’t look at any data I’m not supposed to.
Yes, I know I have access to it.
Yes, I know that I COULD and that you’d never know.
I really do promise MOST SINCERELY I won’t
And you can’t see if my fingers are crossed.