MySpace was the first of the Big Three to announce tools for third party sites to integrate MySpace user data into their services (called, collectively, Data Availability). A day later Facebook announced Facebook Connect, then came Google Friend Connect three days after that.
Today MySpace is fully launching Data Availability (look for it this afternoon at developer.myspace.com), and any third party developer can now build applications using their APIs. Google’s product remains in a test phase with a handful of sites (example), and we won’t likely hear more from Facebook until their F8 conference in late July.
MySpace is taking a much more interesting approach than Google, which controls data sent to third party sites via an iframe. MySpace is actually streaming data to these sites, which allows for true integration between the services, not just a bolted-on social tool.
Developers can access any publicly available profile data from a MySpace user and integrate it into their site. This includes a user’s name, picture, bio, social graph (list of friends), and other information. Users authorize the data transfer via a one-time secure OAuth login to MySpace from the third party service. The service is then allowed to access the data.
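The post does not say which OAuth version or signing method MySpace uses. As a rough sketch only, assuming OAuth 1.0-style HMAC-SHA1 request signing, a third party site might sign its profile request along these lines. The endpoint, keys, and tokens below are hypothetical placeholders, not MySpace's actual API.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value: str) -> str:
    """Percent-encode a value, leaving only RFC 3986 unreserved characters."""
    return quote(str(value), safe="-._~")


def sign_request(method, url, params, consumer_secret, token_secret=""):
    """Return an HMAC-SHA1 signature over the method, URL, and parameters."""
    # Sort the encoded parameters and join them into one normalized string.
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    # Signature base string: METHOD & encoded URL & encoded parameters.
    base_string = "&".join([method.upper(), pct(url), pct(normalized)])
    # The signing key combines the site's secret with the user's token secret.
    key = f"{pct(consumer_secret)}&{pct(token_secret)}"
    digest = hmac.new(key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


# Hypothetical values throughout; the endpoint is a placeholder.
url = "https://api.example.com/v1/users/12345/profile"
params = {
    "oauth_consumer_key": "example-site-key",
    "oauth_token": "user-granted-token",
    "oauth_nonce": "abc123",
    "oauth_timestamp": "1213000000",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_version": "1.0",
}
signature = sign_request("GET", url, params,
                         consumer_secret="site-secret",
                         token_secret="token-secret")
```

The token secret is what the user's one-time authorization produces, so a site that never received it cannot produce a valid signature for that user's data.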
Since actual data is being streamed out of MySpace, they have a strict terms of use policy that forbids third party sites from storing or caching the data, other than the unique MySpace user id of the user. Each time a page is rendered the third party must re-request the data from MySpace via a set of APIs. That means any changes by the user to their MySpace profile data or friends list will be instantly applied across third parties who access the data.
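To make the "store only the id, re-request everything else" rule concrete, here is a minimal sketch of how a third party site might be structured. The class, the fetch function, and the profile fields are all hypothetical, and a local dictionary stands in for the remote service so the example runs offline.

```python
from typing import Callable, Dict

Profile = Dict[str, str]


class ThirdPartySite:
    """Keeps only the MySpace user id; profile data is fetched per render."""

    def __init__(self, fetch_profile: Callable[[str], Profile]):
        self._fetch_profile = fetch_profile    # hypothetical API client
        self._linked_ids: Dict[str, str] = {}  # local account -> MySpace user id

    def link_account(self, local_user: str, myspace_id: str) -> None:
        self._linked_ids[local_user] = myspace_id

    def render_page(self, local_user: str) -> str:
        # Re-request on every render; nothing but the id is kept locally.
        profile = self._fetch_profile(self._linked_ids[local_user])
        return f"{profile['name']}: {profile['bio']}"


# Stand-in for the remote service (made-up data).
remote_profiles = {"42": {"name": "Tom", "bio": "Hello"}}


def fake_fetch(myspace_id: str) -> Profile:
    return dict(remote_profiles[myspace_id])


site = ThirdPartySite(fake_fetch)
site.link_account("alice", "42")
first = site.render_page("alice")
remote_profiles["42"]["bio"] = "Updated bio"  # the user edits their profile
second = site.render_page("alice")            # the change shows up immediately
```

Because nothing is cached, the second render reflects the edit with no sync step. The cost is that every page view depends on a round trip to the remote service.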
As with Google and Facebook, users will be able to revoke any third party’s access via a privacy control panel on their MySpace account.

Actual Data Portability, But No Syncing
This is a real move towards data portability, since MySpace is actually allowing data out of its server vault. The fact that third parties can’t store that data isn’t a perfect solution, since MySpace retains ultimate control of it (I discuss this problem in my Centralized Me post). True data portability requires constant syncing of data so that the users remain in control. But until real standards emerge on just how to do that (and there are some big hurdles), MySpace’s approach seems more than reasonable. This is a real step forward in terms of user data rights, and I expect we’ll see a ton of very creative implementations of Data Availability.
We are building a test application now and should have it live within a few hours. Look for lots of implementations over the next few days.





The Supreme Court today addresses social networks in DC v Heller.
JUSTICE STEVENS comments that the right to assemble cannot be exercised alone. JUSTICE SCALIA clarifies:
“JUSTICE STEVENS is of course correct, post, at 10, that the right to
assemble cannot be exercised alone, but it is still an individual right,
and not one conditioned upon membership in some defined “assembly,”
And cue the crunchnow.com spammer!
Good stuff, keep it up!!
Let’s get beyond the HYPE Mike.
Look at this screenshot and tell me they are going to let this happen?
http://www.flickr.com/photos/8.....4/sizes/o/
DeWolfe will nuke the app faster than you can say cheese.
He will claim that a widget presenting MySpace users on SiteSpaces confuses people or something.
That is the reality behind some of these claims on Techcrunch.
I won’t even waste my time trying to implement what is supposed to be allowed by Mike’s report. It’s a farce.
I like all the new things MySpace is doing. If only they took the time to make the site faster, I would be happy.
“The fact that third parties can’t store that data isn’t a perfect solution, since MySpace retains ultimate control of it ”
Ultimately what would be nice is to be able to mix and match friends from different social networks. This would be especially nice for me since my software is going to let people convert vBulletin and phpBB installations into Social networks with 1 easy conversion in the admin panel. Where their avatars become their default photos.
Imagine if all the vBulletins and phpBBs in the world became social networks then they could integrate with the bigger social networks via a TRUE friend connect.
What is happening here is just fluff. It’s not in the spirit of open social. Even friend connect is not in the spirit of open social because it’s not really open. It’s only open to Hi5 and other large players. Open implies GNU, FSF, and RFC, and other standards where anybody can build logic.
My 2 cents. BTW, My Social networking will prevail because it empowers webmasters and there are so many it will overwhelm the big players.
@ Chris Sack.
I’m not surprised it went over your head. You don’t even know what web site you’re on.
Chris Sack
Who are you to call someone a spammer? You need to get Jesus in your life and become a better you, brother.
Why don’t crunchnow and chris go blow each other in the corner and leave this site alone.
FYI, chris sack isn’t pimping some crap website, and so, is not a spammer.
This is a good move. I wonder though what other services are going to do with it if they’re not able to store any information other than the myspace id. It makes your apps reliability and performance totally dependent on myspace - not good.
@The Decider
Oh my, and I’m the bad guy? With that language, how do you live with yourself?
Rather than the walled garden castle lowering the drawbridge, this is more like opening the curtains.
What users should demand is convenient, secure, and unrestricted access to their data. That means the ability to have full interoperability between any of the tools and services they use, including operations like import, sync, and delete.
Service operators, if you love your users’ data, set it free. If they love you, they will not leave. Instead, they will appreciate the convenience of interoperability that you are enabling.
Data portability is a huge issue for the different social networks. After they created the user base, they don’t want to offer all that information away, without getting their share of the profits.
But whose data is it anyway?
I wish that you didn’t need the user’s permission!
I mean, why should I have to scrape a user’s public page? It’s PUBLIC! Just make it easier, and if a user’s profile is public, don’t require a password.
@9, are my posts wrong?
The paradigm here is to run their “people SDK” on their “operating platform, or OS” which is their website.
It’s proprietary, not open.
It’s the same paradigm as desktop OS’s like Windows 1.0 and it’s like floppy disk sdk.
I’m proposing a standards based solution that script writers can integrate in any type of script for p2p friend interchange between the servers of those running the software. All this is just a guise. The same guise Microsoft used. I used to write email software, and I had the benefit of having set RFC standards to go and implement my own email software. I made money on that. That’s how it should be.
this isn’t really a step forward Mike. Email addresses aren’t accessible by third-party developers, and even if they were, they can’t be stored. Shouldn’t the user be able to say “I want to take my social graph elsewhere”? with this they can’t. the user should be able to determine (a) what data can flow out to a third-party, including email addresses (the real gold of the social graph), (b) how long the third-party can access it, and (c) if the third-party can store it for future use. otherwise, it’s fluff.
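The three controls listed in this comment, (a) which data, (b) for how long, and (c) whether it may be stored, can be pictured as a per-site grant record. This is a minimal sketch with every name and value hypothetical; it does not describe anything MySpace offers.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet


@dataclass(frozen=True)
class Grant:
    """A user's permission for one third party (all names hypothetical)."""
    third_party: str
    fields: FrozenSet[str]  # (a) which data may flow out
    expires_at: datetime    # (b) how long access lasts
    may_store: bool         # (c) whether the third party may keep a copy

    def allows(self, field: str, now: datetime) -> bool:
        """True if the field was granted and the grant has not expired."""
        return field in self.fields and now < self.expires_at


issued = datetime(2008, 1, 1, 12, 0)  # arbitrary example date
grant = Grant(
    third_party="example-site",
    fields=frozenset({"name", "friends", "email"}),
    expires_at=issued + timedelta(days=30),
    may_store=False,
)
```

With a record like this, the provider would check `allows` on each request, and `may_store` would be a term the third party agrees to rather than something the provider can technically enforce.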
Super stoked about Facebook Connect.
That is all.
@crunchnow
The tech/biz crowd is not so willy-nilly or concerned about “language”
>> lose the jesus talk
>> get out more
>> quit being a douche-bag
Now it seems like all the big boys are developing their own separate “Data Portability” initiatives.
Why not do it or collaborate together?
I suppose it all comes down to competitiveness, trust, profit and etc.
Now who is going to have a bigger pie on this?
Best regards,
Darren Lee
@16,
That seems complicated for typical social network users. I say if the user enters their login credentials on the 3rd party site, that means the site now has total data access, including persistence.
If you start doing fine grain details of info sharing it will confuse most of these people as they are not only not internet savvy but downright internet incompetent for the most part.
Again this would require a new system.
Like say for example a user enters their MySpace credentials on Hi5 or another social networking site, perhaps mine or something.
They enter their login credentials, and there should be a way for them to simply press a button and import their entire profile into the new website without them having to recreate it from scratch???
That’s what portability is about.
Right, perhaps this is just a step towards allowing third party vendors to suck the data out of MySpace for data-mining purposes and knowledge discovery processes. These 3rd party vendors might be advertisers and online marketers. The thing is that opening up for 3rd party data mining might be illegal under privacy laws.
Humm
Not being funny but you were right on cue…
@22, why can Google take your AIM information?
How can sites import your friends to invite them and get their email addresses based on your agreeing to do so?
There is clearly a user agreement people can agree to that can allow a user to import data and use it elsewhere with their permission.
Waaaaah!
I want the ability to mine the data from the popular sites so I can claim larger member numbers and they won’t let me.
Waaaaaah!
This is cool, but it will not stop the great exodus from myspace.
@26, are you from MySpace?
You can already mine the data from any site with CURL and regex.
while(1) (
n=2
wget_implementation("http://profile.myspace.com/index.cfm?fuseaction=user.viewprofile&friendid=n")
regex(all_friend_data_into_sql)
n++
)
Forgive the pseudocode, Akismet filters out brackets and semicolons.
That’s how hard it pretty much is. It’s not hard at all. A 10 year old could do it.
The trick is to do it legitimately and get people to know that they can simply create a single profile and have that travel with them on multiple websites without having to start over.
To let them know they can mix friend lists on their profile from multiple websites by agreeing to share data.
jeez I should read my own posts before I hit the button. The pseudocode should have been
n=2
while(1) (
wget_implementation("http://profile.myspace.com/index.cfm?fuseaction=user.viewprofile&friendid=n")
regex(all_friend_data_into_sql)
n++
)
But that of course is not a good thing to do. So we have to come up with a standards based solution where users control their data and agree to share it.
This is a sweet offering… if all of these major social networks are doing open data offerings, at what point will they open up their networks and integrate with each other? Is this only possible if somebody buys somebody out?… look… http://www.readtheanswer.com/index.php?RTA=web2
This is good news for Myspace trackers,
http://www.ferodynamics.com/my.....-trackers/
Arrington, surely you have a switch to ban an IP address - talking “chris” here.
#28, you’re going to do too many requests and get your IP blocked. Duh.
People, the real issue is this: Myspace can’t enforce the “don’t store, don’t cache” policy and it’s a given all the data will be sold off and circulated for various purposes, which I blog about.
Wow! Actual seamless data synchronization with other sites? That’s amazing! I never thought I’d say this, but don’t count MySpace out yet. I wonder what else they are going to pull out of the bag.
data portability is just a social network fad
they are only ones doing it
lame lame lame
“#28, you’re going to do too many requests and get your IP blocked. Duh.”
MySpace profiles are bot friendly. Robots allows indexing them. So no. Not unless you announced that you were spidering them for a phishing purpose I don’t think.
“People, the real issue is this: Myspace can’t enforce the “don’t store, don’t cache” policy and it’s a given all the data will be sold off and circulated for various purposes, which I blog about.”
It doesn’t matter what you can do technically, it matters what you can do legally and ethically.
True friend interchange and profile data portability amongst peer sites, not just popular ones that formed some type of RIAA of social networking, can be done. It just can’t be done by Google or some other large entity with lots of self interests.
If I get lots of installs on my networking script. I’ll shoot to write a standard and implement it. The point is, the only people that could do a meaningful implementation of a standard to promote it are the same people who have an interest in not doing so.
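On the robots point raised in this comment: whether a crawler is permitted to fetch a page is normally decided by the site's robots.txt rules, which Python's standard library can evaluate. The sketch below uses made-up rules and example.com URLs, not any real site's file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; not copied from any real site.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Public profile pages are allowed under these rules; /private/ is not.
profile_allowed = parser.can_fetch(
    "example-bot", "http://www.example.com/profile/42")
private_allowed = parser.can_fetch(
    "example-bot", "http://www.example.com/private/settings")
```

Passing this check only says that indexing is permitted. It says nothing about request rate limits or about what may be done with the data afterwards.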
@27, Ryan Merket seems like a shill for Facebook. Practically every TC post he injects some random FB comment.
Chris said…
You can already mine the data from any site with CURL and regex.
Actually no. What you’re talking about is not data-mining. You’re talking about data-sucking. Do you understand the difference?
Data-mining is to discover predictive hidden knowledge that is buried in the data, which is called knowledge discovery / predictive modeling.
#38
“Actually no. What you’re talking about is not data-mining. You’re talking about data-sucking. Do you understand the difference?”
I hear wikipedia is free to use now.
http://en.wikipedia.org/wiki/Data_mining
“Data mining is the process of sorting through large amounts of data and picking out relevant information.”
You may want to try it some time.
“Data-mining is to discover predictive hidden knowledge that is buried in the data, which is called knowledge discovery / predictive modeling.”
This is what happens once or shortly after the data is mined. This is called data processing. It is a different part of the indexing process. I guess the 2 overlap.
The point is the knowledge is not known to the min-er, so he is mining for it.
This is getting ridiculous actually. When most of the user data is exposed via HTTP and permitted to be indexed by robots, you can’t pretend that it is hard to obtain. Though normally it’s obtained for indexing web pages on the internet.
The point of this thread was data portability and freedom of users to use their one data set however they wish. We do not yet have that. I think GNU should come to the rescue with a standard here. I would implement it.
I looked at open social and it seems really client focused instead of p2p with different networks. I don’t think I am going to implement it myself.
Yeah, wikipedia is now an authority.
Falafulu, for all his errors, has now established his superior knowledge in his space. Go back to community college Chris, the associates degree won’t cut it.
@41, I stand by what I said. Data mining’s definition is broad enough to include scraping for whatever purpose, whether to create better search page indexes or simply to phish.
While data portability is of course a good thing, people may be overestimating the degree to which your average social media users care about being able to maintain one information profile across many sites. Just like in the offline world, people want to be able to shape their image depending on the social context. Their myspace profile is going to be different than their linkedin profile, which will be different than their digg profile, etc… So from a developer standpoint, I’m skeptical that integrating profile syncing w/ myspace, facebook, google will add much value to my site.
This is a big deal!
I suspect that the other major players will be forced to follow suit. Competing on “openness,” what a concept! Sooner or later companies figure out that there is little, if any, value in attempting to prevent the inevitable. Anything that smacks of “lock in” will be resisted.
Why not attempt to define the machine instead of fighting it?
“It doesn’t matter what you can do technically, it matters what you can do legally and ethically.”
I’m not saying it’s right or wrong. I just know what Myspace users want, they ask me over and over, literally thousands of variations of these two questions that haunt them day and night, driving them absolutely insane:
1. Who is looking at me?
2. Who is doing something behind my back?
And if you think scraping Myspace is so easy, where are the URLs? I haven’t heard of anything. Myspace isn’t like Wikipedia, you can’t just download the whole thing and drop in your own banners.
Chris said…
You may want to try it some time
Chris, you still have no clue to what data-mining is? Scientific computing (numerical, mathematical, statistics computation) and data-mining & machine learning are my specialist domains and I am involved with the expert group that currently drafts the official Java Data-mining API version 2 (JDM 2) for the Java technology. In fact this is a topic that had been brought up in the expert group discussion regarding the issues of privacy.
Here is an example that is well known to you but eludes you that it is data-mining. Amazon recommendation is data-mining, did you know that or not? I am 100% sure that you didn’t know that?
Chris said…
This is what happens once or shortly after the data is mined. This is called data processing.
Yes, data processing is what I meant, and data-sucking is part of it. You suck the data (retrieve) out of MySpace, but you haven’t mined the knowledge out of it. Are you still confused? There is no knowledge (patterns) that the raw sucked data reveals. You have to use data-mining to reveal hidden patterns to you.
Chris said…
It is a different part of the indexing process. I guess the 2 overlap. The point is the knowledge is not known to the min-er, so he is mining for it.
Yes, Chris, we call it pre-processing or, in the formal language, ETL (extract, transform, load). ETL is not data-mining; you have to differential the very concept of discovering hidden knowledge buried in the data (mining of knowledge from the data), and pre-processing of the data. They are 2 different things. You can pre-process data but not mine it at all, i.e., the data has just been cleansed (or had some other preprocessing task done to it) for storage reasons, a data warehouse being a classic example, but the cleansed data is not mined to extract knowledge. Are you still confused?
If you want to learn more about the subject, then you might as well download the most popular open source data-mining/machine learning project available today in Java, called WEKA, developed here at one of our local universities in New Zealand. By the way, also buy the accompanying book written by the lead developers of WEKA, which is highlighted on their site (get it from Amazon).
Once you start to learn how to use WEKA or its concepts (data-mining/machine learning), then you would realize and ponder, Umm, Falafulu was right, that data-mining is extraction of predictive hidden knowledge that is buried in the data.
Go on Chris, learn data mining; it will widen your domain of knowledge and perhaps lead you to many cutting edge job opportunities out there. See, the business intelligence & data-analytic startups in Silicon Valley are booming, and they need people with skills in the data-mining/machine learning domain.
Finally, don’t rely too much on wikipedia since it is not meant to be an absolute authority. It is a rough guide for the uninformed.
Correction to my previous post…
…you have to differential the very concept of discovering…
should read:
…you have to differentiate the very concept of discovering…
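The distinction argued over in this exchange, between pre-processing (ETL) and mining, can be illustrated with a toy example. The order data is made up, and counting co-occurring pairs is only a simple stand-in for pattern discovery; it is not how Amazon or WEKA works.

```python
from collections import Counter
from itertools import combinations

# Raw records as they might look after extraction (made-up sample data).
raw_orders = [" Book , Lamp ", "book,pen", "LAMP, book", "pen , lamp, book"]


# Pre-processing (ETL): clean and normalize, but learn nothing new.
def transform(record: str) -> frozenset:
    return frozenset(item.strip().lower() for item in record.split(","))


orders = [transform(r) for r in raw_orders]

# Mining: look for a pattern that no single record states,
# here which pair of items is most often bought together.
pair_counts = Counter()
for order in orders:
    for pair in combinations(sorted(order), 2):
        pair_counts[pair] += 1

most_common_pair, count = pair_counts.most_common(1)[0]
```

After the transform step the data is merely clean. The pair ranking is the part that surfaces something none of the individual records contains.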
Interesting post and even more interesting comments.
“Amazon recommendation is data-mining”
Here’s something you do not know. I interviewed with Amazon.com’s page landing optimization team in Seattle and I have knowledge of their structure that you may not.
Amazon uses SQL to optimize page landing. The data is already collected from purchases, so it IS NOT data mining.
“You suck the data (retrieve) out of MySpace, but you haven’t mined the knowledge out of it.”
My pseudo code above included a regular expression line that simulated the extraction of profile data from each fetched web page iteration and storing it in a database.
That implicitly counts as mining. Sorry. Mining is not the process of creating graphs and other metrics out of data. Mining is the process of extraction.
Additionally, what you are describing implies that doing a SELECT statement with LEFT JOINs against an existing SQL database is in fact data mining. I retort that it is not, because the data is already there. I would say that it is data processing, and only slightly overlaps with data mining.
The word mining implies that you are digging and extracting data from a new source.