March 13, 2008

Yahoo Embraces The Semantic Web - Expect The Internet To Organize Itself In A Hurry

Michael Arrington


Yahoo’s embrace of all things open continues today - expect an announcement in an hour or so that they are expanding their Open Search Platform that we wrote about last month.

In that previous announcement, Yahoo talked about their plans to allow third parties to alter and enhance search results with structured data that may be useful to users. Today, they’ll give more details on the developer platform and will announce support for a number of semantic web standards.

What does all this mean? It means we can expect the web to get itself organized, in a hurry. At stake is a significant amount of traffic from Yahoo search, and anyone else that may choose to build applications on top of this data.

Yahoo’s support for semantic web standards like RDF and microformats is exactly the incentive websites need to adopt them. Instead of semantic silos scattered across the Web (think Twine), Yahoo will be pulling all the semantic information together when available, as a search engine should. Until now, there were few applications that demanded properly structured data from third parties. That changes today.

One example Yahoo director of product management Amit Kumar and others gave me during a briefing yesterday is LinkedIn - were they to mark up user profile pages with microformats, Yahoo search could understand the content and relationships between pieces of content. Yahoo can then present that data in an intelligent way in Yahoo search. “With a richer understanding of LinkedIn’s structured data included in our index, we will be able to present users with more compelling and useful search results for their site,” Yahoo says. Here’s how it will look in search results (see our previous post on Yahoo Open Search for how this is implemented):

Any third party can create mods for Yahoo search that leverage their semantic data (Yahoo will be launching a beta program in a few weeks, along with a developer launch party). Some lucky ones will be added by default to all searches.

A few details are being disclosed now, and Yahoo promises more in a few weeks. They are saying that they will support a number of microformats at the start: hCard, hCalendar, hReview, hAtom and XFN. They will support vocabulary components from Dublin Core, Creative Commons, FOAF, GeoRSS, MediaRSS, and others. They will support RDFa and eRDF markup to embed these into existing HTML pages. Finally, Yahoo will support the Amazon A9 OpenSearch specification with extensions for structured queries to deep web data.
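To make the hCard item in that list concrete, here is a minimal sketch of how a crawler might pull structured fields out of a page that uses hCard class names. This is not Yahoo's implementation, which has not been described. The sample markup, the contact details, and the names `HCardParser` and `extract_hcards` are all hypothetical, and only three hCard properties (`fn`, `title`, `org`) are handled.

```python
from html.parser import HTMLParser

# Hypothetical page fragment marked up with hCard class names.
SAMPLE_HTML = """
<div class="vcard">
  <span class="fn">Jane Example</span>,
  <span class="title">Product Manager</span> at
  <span class="org">Example Corp</span>
</div>
"""

# Subset of hCard properties this sketch understands (an assumption).
HCARD_PROPERTIES = {"fn", "title", "org"}

# Elements that never have a closing tag, so they must not be tracked.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class HCardParser(HTMLParser):
    """Collects one dict per element carrying the 'vcard' class."""

    def __init__(self):
        super().__init__()
        self.cards = []       # finished cards
        self._current = None  # card being built, or None outside a vcard
        self._open = []       # one marker per open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = set((dict(attrs).get("class") or "").split())
        marker = None
        if "vcard" in classes and self._current is None:
            self._current = {}
            marker = "vcard"
        elif self._current is not None:
            props = sorted(classes & HCARD_PROPERTIES)
            if props:
                marker = props[0]
        self._open.append(marker)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._open:
            return
        marker = self._open.pop()
        if marker == "vcard":
            # Normalize whitespace once the card is complete.
            card = {k: " ".join(v.split()) for k, v in self._current.items()}
            self.cards.append(card)
            self._current = None

    def handle_data(self, data):
        if self._current is None:
            return
        # Attach text to the innermost open property element, if any.
        for marker in reversed(self._open):
            if marker not in (None, "vcard"):
                self._current[marker] = self._current.get(marker, "") + data
                break


def extract_hcards(html):
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.cards


if __name__ == "__main__":
    print(extract_hcards(SAMPLE_HTML))
```

The point of the sketch is that the page author does not change what visitors see. The class names alone tell a machine which text is a name, a job title, or an employer.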

Erick Schonfeld wrote a post in February urging Yahoo to open up search completely to compete with Google. Yahoo isn’t heading in that direction, yet. But they sure look like they might get there eventually.

Comments

  1. Matt

    This is bigger than AOL/Bebo as far as I’m concerned….

  2. Matt

    Now people can finally get over Twine as a “semantic web” application… yahoo will show you all how it’s done.

  3. Jan

    simply. wow!

  4. YDrive

    Yahoo! will just continue to impress, and surprise, the world… good job.. great CEO too! 8-)

  5. Jay

    This is definitely far bigger than AOL/Bebo. go yahoo!

  6. SearcH◆ EngineS WEB

    This is probably not as good as it seems.

    Wouldn’t Google have gone in this direction FIRST - if it could provide better results??

  7. Todd

    Awesome. Also big time validation for the individual crusaders for the cited web standards ( Chris Messina ) but…

    Won’t this all just be instantly abandoned if Microsoft buys them? Like one millisecond after the deal is cleared, all support for open web standards is retracted? Open web standards do NOT trap people to the Windows operating system, in fact, they help free people from it.

  8. OpenDataWeb

    Welcome to the Open Data Web!

    Now we just have to enable the users to give the data meanings [that are machine readable] — viz., semantically.

  9. momoy

    nice. :D i like this move by yahoo. maybe they are doing this so microsoft will increase their bid. i mean it’s one of the logical reasons behind this sudden launch of new products and innovation by the Y! camp. i really like this.

    hope it gains enough steam to trample the evil google monopoly! go Y! :D

  10. Jan

    well, technically it’s no big deal. but the fact that finally one of the big guys is pushing it, can change a lot. let’s see how long it takes until web developers annotate their sites by default…

  11. victor

    I thought LinkedIn already published microformats!

  12. Dyde

    This is nothing since it seems too complex atm. The same reason why XSLT and XML failed to achieve the hype. If someone comes with a simple solution that immediately shows its benefits versus the costs, then this would be game changing. However, I suspect that someone will be Google, and not Yahoo.

  13. Shweta Gupta

    Yahoo is all set to be back with a bang !! Good job Yahoo

  14. Ysync

    Now Yahoo! syncs itself up with the rest of the players… nice, at least technically.

  15. Michael M.

    This could be huge depending on what they mean by “support.” One of the earlier commenters is right: Complexity is the big barrier right now. But if they provide tools to help automate adding in semantic data, this could very quickly become a de facto standard which could make the web MUCH more useful, esp. for those data farming and mashing.

  16. Corey

    Come on Yahoo! You can do it. We all want you to survive and thrive. Keep innovating. Keep doing something spectacular. This is the way to go.

    The users will come. We promise.

    We all want to!

  17. drew olanoff

    Go go microformats!

  18. Omer

    Cool if they can get the seo woes right!

  19. Jibone

    So this is why Yahoo fired the whole designer team?

  20. Ben Metcalfe

    LinkedIn already marks up its pages in hResume, so they should be able to do this (at least with LinkedIn).

    It’s a great piece of news though, we’ll have to see what it looks like in the implementation.

  21. Ty Graham

    Is everyone drinking the koolaid?! Didn’t “web services” .Net etc… promise similar capabilities to ping sites for structured data? Seems like search results would be rather limited in the amount of data they returned. Still doesn’t make sense when “people search” companies do this kind of searching of social networks right now.

    Round and round we go, where the tech companies go no one knows! Except me, because soon, everyone is going to get blip’d!

  22. Michael Kimsal

    I’m a bit more reluctant to believe the hype or promise of this. There are technical and human hurdles to deal with - semantically marking up data is hard, and humans can still get things wrong. Yahoo will still need to put in ‘best guess’ algorithms and such to compensate.

    But the bigger issue is why would someone like linkedin semantically mark up all their profile pages, at least for public consumption? It makes it that much easier for competitors to come and take away the one set of data that makes linkedin unique - the relationship data they have about their users. For me, what makes linkedin linkedin is the set of relationships (and to a lesser extent, what tools linkedin provides to exploit those relationships).

    Adding semantic markup to linkedin profile pages will make it easier for Yahoo to show more information. Great. But it also makes it easier for everyone, including Linkedin and Yahoo’s competitors, to scrape intelligently, and offer bigger/better/faster/cheaper.

    Now, there are certainly other benefits regarding cross-domain info linking - being able to better know the relationships between data across multiple data sets, for example. Again, good, but not great, imo.

    It’s certainly a chicken/egg situation, but I’m also not sure that we’ll have the same incentives that we did 10 years ago before the massive commercialization. For every argument for semantic markup, there’s gotta be at least one competing commercial interest against it.

    That’s my 2 cents as to why this will be an uphill battle.

  23. AllPortability

    @22 - it’s all about portability… notably, DataPortability — as you know… 8-)

  24. Jeremy Palmer

    Despite the challenges and criticisms Yahoo faces I’m amazed that they continue to innovate. Microhoo would kill all innovation and start cloning Google’s product (with a 2-3 year lag of course). Way to go Yahoo!

  25. Kapil Tundwal

    This news piece got me thinking - what if semantic print were to exist?

    What I mean by semantic print is application of RDF & microformats like standards applied to components that make up a print publication (flyers, brochures, newspapers, magazines, etc…) Just like semantic web promises to make search engines and mash ups better among others, semantic print would allow content repurposing, new packaging/print products and cost efficiencies in print production.

    Google’s interactive ads are also a step in this direction, whereby content also carries intent/action. On the production side there are a lot of print content management systems (Alfresco, Quark Publishing System, Documentum) out there that promise this, but no clear winner exists yet in this space.

    I guess evolution happens out of chaos only and not every information in the world can be structured.

  26. Alex Rudloff

    All I can really say is…

    Sweet!

  27. Ryan Merket

    Wow. This is just awesome. Now the microformats I use will actually be indexed properly.

  28. Alex Hammer

    Can Twine compete with Yahoo? It may sound like a ridiculous question (and I am not clear what the overlaps will and won’t be), but Amazon smoking Barnes and Noble once seemed ridiculous, as did Microsoft over IBM, etc.

    Earliest adopters (large and small) generally have incredible odds against fending off later entrants. Think about Dell’s success for example and how it has been accomplished. It innovates very little.

    I haven’t seen Twine but I think Nova Spivack is a talented individual. And it will be interesting to see Yahoo’s effort.

  29. Brick Marketing

    This is great. This is definitely changing search as we know it and certainly giving Yahoo a much needed edge. Yahoo! talk these days is wild. If anything, it’s good PR.

  30. Steve Ganz

    Just for the record, LinkedIn has been a major publisher of microformats since 2006. We’ve been marking up our data with hCard, hCalendar, hReview, and perhaps most importantly, every LinkedIn public profile is marked up in hResume.

  31. Eric Blue

    This is great news!

  32. John McCrea

    This is great news. Together with Google’s support of microformats via the SocialGraph API there is now really good incentive for sites to mark up their public pages with microformats. Plaxo is another site supporting this with our recently released public profile pages: http://therealmccrea.wordpress.....c-profile/

  33. steve

    dumb question, but I hope someone can help. if Yahoo and Google do more to support microformats, does this mean we will all hear less about annoying “web 3.0” companies like twine and radar? please let the answer be yes.

  34. james

    my web development dreams are coming true!

  35. whoopie

    watch microformats quickly become a spam vector

    we had this once before, called the meta tag

    then everyone put adult terms in the meta tag to juice ranking

    so the search engines stopped indexing meta tags

    now the problem has just been moved to a new format, in a year people will be lamenting how microformat spam is spoiling results

    sorry, humans cannot be trusted to describe their own data, this is why we have tech like pagerank in the first place

  36. spandana

    the SEO industry has now a major lease of life. kinko’s will be busy printing new business cards with ‘SEO Semantic Web Expert’ job titles.

  37. Harry Wang

    @whoopie - interesting spin

  38. Henry

    Blodget is the best! Yahoo to be bought at $35 a share by next week.

  39. Carlton Northern

    It’s about time.

  40. Semantic Technology Guy

    Good for Yahoo moving to the semantic web! More info here: http://semantisize.com/company/yahoo

  41. Chris

    As a developer using YUI, I would like to know if this support will be built into YUI and what the dateline for this is???

  42. tyler

    In response to:
    “This is probably not as good as it seems.

    Wouldn’t Google have gone in this direction FIRST - if it could provide better results??”

    You could imagine why Google would NOT want a semantic technology to be successful. How would ad revenue be affected by a search engine that returned ‘meaningful’ results? Less clicks? Less revenue?

    I understand that semantic search technology (magic) has not had its breakthrough moment, but…imagine the implications. Either we applaud innovation, or we continue to settle for a search like, “buffalo city new york please work”

  43. Eric Atkins

    Substitute the word Yahoo! for Google and re-read the story. Now, how do you react?

  44. Chris

    I would also like to know if there are plans to continue YUI on sourceforge after Microsoft takes over and kills it?

    Can you sign up for this anywhere?

    I am already modifying the library.

  45. ...

    I’m not surprised that Yahoo is the one that does this. They need to do something new if they want to take market share from Google.

  46. Lynn

    I’m with Michael K. As a corporation which builds databases with retail and shopping center locations, why would I want to make it easy for competitors to access this data?? I think the only incentive for me to provide this mark up would be if my site was listed higher in organic search results.

  47. Julian Bond

    I’m confused. Does this mean Yahoo will be marking up their pages on all their properties with Microformats? Or are they hoping to read Microformats on other people’s sites, index them and provide search into them?

    Just as with the Google Social Graph API project, all I can say is it’s about time. There’s a huge amount of structured data out there in FOAF, RDF, Microformats and so on. Reading it and aggregating it is a search engine sized problem. But up until the last two months the big search engines have just ignored it.

  48. Anne H

    I saw the demo Amit did last month in Santa Clara and it looked very interesting. I do hope Yahoo! makes their documentation clear for all users and not just developers. I think there are lots of sites run by non-technical people that would like to use them, but get stuck when they can’t find good documentation or support structures.

  49. william

    Yes….Great move by Yahooo !!!
    ….but what will the developers and the content creators receive in return for helping Yahoo have great semantic search ?

    Is this more web 2.0 share cropping that gives content creators and developers nothing in return for their work or their content?

    How come the content creators and developers that will be helping Yahoo don’t get a piece of the action….some cash for their work…maybe even stock options….what a concept….contribution for raising the value of a company in exchange for pay or equity….the last time i checked this is how it is supposed to work…..Oh and Yahoo…if you want to be more “Open” how about switching to Lucene….It’s open, well documented and it works….You can’t out-develop thousands of developers from around the world….Well Yahoo will not do this….and they will not pay developers or content creators…..

    And on the Bebo purchase….any of that 850 million going to trickle down to members that have made them what they are ? Doubt it

  50. Daniel Lewis

    Hi all,

    I think this is good news. Yahoo have always been quite good at looking into categorisation (see the Yahoo Directory), and RDF is perfect for categorisation and representing true relationship links. They also have an excellent research team (see Yahoo Research), which includes a few people from the Semantic Web and W3C communities.

    I am quite interested to see how this RDF & Microformats stuff works out, and how it might hook in to the Linked Data Cloud. A lot of us in the Semantic Web are looking at finding ways to query the Web using a formal query language called SPARQL, and this is all part of the Linked Data project, which is all about cross-domain relationships. I hope that whatever Yahoo does, it does it with Linked Data in mind.

    As for supporting Amazon OpenSearch, I am quite happy to hear this… it’s a nice little system.

    As for Microsoft’s attempt to take over Yahoo, I am quite glad they failed, to be honest, because these kinds of projects would have probably been terminated (although saying that, Microsoft are getting their hands dirty with some RDF).

    As for the Google Social Graph API, it’s nice that they have noticed FOAF and XFN, but there is still a lot more that could be done and has been by other companies.

  51. Timo Paloheimo

    What the developers will get is more traffic from Yahoo.

    With semantic data Yahoo should be able to deliver better search results to the users and as the public notices it, their market share will grow thus delivering more traffic.

  52. Keasen Twijafur

    Microformats as the next SEO battle ground. Who’da thought…

  53. william

    Timo
    I think you are right….But this is not enough…..Yahoo pays its developers, and many of them have stock options. They all share in the risk and the reward. Why is it a different case for developers that will help with yahoo open search ? They will be adding value through their work….For me this means that they should receive payment or equity…I think that they should demand this if Yahoo uses their work…..The trap of all of this is that Yahoo is trying to treat the project as open source when in fact it is not….If a developer works on an open source project they are raising the value of software that is owned by no one and benefits everyone that uses it….In this case the search application is closed and proprietary; it is owned by a company “Yahoo”…and they will benefit from the work of developers monetarily in the form of a raised stock value and in high revenue from ad sales built on the backs of developers. Developers that work on this project should demand the same kinds of compensation for their work, nothing less.

  54. Nova Spivack

    I’ve posted the Twine perspective on this here:

    http://novaspivack.typepad.com.....pecti.html

  55. william

    Interesting….is Twine going to share any of its revenue with content creators or api developers ?

    When you are bought or float for millions will you give any of this back to the community that has helped you to reach that point ?

    Your application is worthless without the contribution of content creators and api developers…Pay back some of this to those that will help raise your boat

  56. YDrive

    Being an online-storage-centric Personal Digital Library for the users, YDrive’s perspective on this is simple: who’s going to drive the semantic web, other than Yahoo!? The answer is also simple: You.

    And the rest is obvious… :P

  57. Mark Birbeck

    Julian,

    I think there are two sides to this. The first is to index other people’s data, and perhaps the most exciting development in that area is RDFa. This allows you to add any RDF properties you can find, to an ordinary HTML page…whether that’s a blog post, via a CMS, or whatever.

    The scenarios are simply enormous. For example, someone doing a PhD in chemistry could add little bits of RDFa to their blog and as the search engines process the RDFa, people can look for articles and research about specific chemicals. Given that a bunch of enthusiasts have improved the kite, just think what would be possible with that kind of information sharing!

    The second aspect is displaying the information to a searcher, and that may well be the battle-ground that Keasen refers to. Once you have this interesting data in your search engine, you need to show it in search results in a useful way. Events need to be shown on a calendar or timeline, locations need to be placed on a map…and yes, the chemical symbols from my example above need to be shown in glorious 3D.

    (You also need ‘actions’ on this data, such as ‘copy to address book’, ‘dial this person’, and so on.)

    All of this will take time to produce, but it will come. In the meantime, important breakthrough number 1 is that in RDFa we have a consistent and scalable technique for embedding *any data we like*, and important breakthrough number 2 is that the search engines are starting to take notice of this.

    Regards,

    Mark

    PS I’ve blogged about the Google angle on this here:

    http://internet-apps.blogspot......a-and.html

  58. Karl Engblom

    @42 and others
    this is an interesting dilemma for search engines:
    What if search technology improved to the point that you always found what you were looking for?
    First, there would be no point in ranking the results, since only one result would be correct.
    Second, there would be no point in clicking on the ads.

    It’s a very hypothetical question, but the point is that if search engines become too good at finding what we want, their profitability may go down. Google has made its fortune by coming up with proprietary methods for dealing with an unstructured web. They have no interest in the web organizing itself, it would take away their competitive advantage. Yahoo on the other hand have (almost) nothing to lose. So it’s not the slightest bit surprising that Yahoo and not Google are working on this.

  59. Joining Dots

    I blogged about this when Yahoo first announced it - http://www.joiningdots.net/blo.....click.html

    Whilst the idea is potentially great for users (until the spammers game the system), it challenges the ad revenue being earned by legitimate sites if enough information is provided in the search result so that you don’t need to bother going to the actual site.

  60. James

    Here’s a post about this topic: http://venturebeat.com/2008/01.....he-future/

  61. Kevin Burton

    Yup. The more standard adoption we see by crawlers/aggregators the easier it is to see this stuff adopted by publishers.

    The hosted CMS platforms have a big effect too. The recent Blogger changes have been killer and we’re planning on implementing them in Spinn3r. Specifically, full aggregation of archive content via Atom pagination.

    Kevin

  62. matt

    Yay go us!

  63. Sally Wu

    I think AOL’s acquisition of bebo portends the end of bebo…
    http://webpoet.wordpress.com/2008/03/13/bebo-aol/

    TWL

  64. Matthew Theobald

    Go Yahoo! Go Semantic web meets the deep web. (and Go ISEN.org!)

  65. Miracle Blade

    About time one of the big boys did it.

  66. panefsky

    From where do I remember this title?
    “Gates: We need microformats”
    good news

  67. www.fantasysportsmatrix.com

    can this be Yahoo’s savior?? I hope they take advantage of their excellent sports division and revolutionize sports search

  68. Eliot Spitzer

    I’ve also blogged about it here
    http://www.myblog.ultimatespam.com/spam

  69. thibaud

    What would be required of a small online publisher or a non-technical blogger to accommodate this approach?

    Does he need to add tags to his HTML code, or can this be done automatically for him by some third party?

    If the former, then isn’t the risk of gaming and/or spam going to increase exponentially?

  70. Matt

    So this is why MS wanted to buy them out?

  71. Eric Pugh

    Microformats are one of the most interesting “stealth” technologies out there. Once you install a plugin like Tails that exposes micro formatted data, like events or people, you start seeing all the places where the data isn’t structured!

    I’ve been working on a site called http://www.hightechcville.com that attempts to aggregate data about people, organizations, and events related to “High Tech” in Charlottesville, Virginia. We are leveraging microformats to pull in data, and exposing it all through microformats.

    So with Yahoo supporting microformats, we may see more adoption, and if Yahoo indexes and adds the microformatting then mining the web becomes even easier!

  72. Eric Shannon

    has great potential for publishers AND spammers! will be interesting to see who it helps the most. big or small publishers, honest or crooked.

  73. Bart Gibby - SEO Manager, OrangeSoda, Inc.

    As cool as this sounds, as with anything, the implementation of this new technology will be a key factor here.

    After that search users still have to adopt and utilize the new service. Which means change, and that means making new habits for the users. Which again is why implementing and re-implementing the solution is very important.

    I hope Yahoo! can pull it off. The search industry needs competition.

    Cheers, -Bart

  74. Glenn Engstrand

    I’m glad to hear this. Another recent heartening adoption of RDF is by the Reuters News agency. I blogged about this over at http://ploneglenn.blogspot.com.....c-web.html

  75. Personanondata

    There is a lot to get excited about in the Yahoo announcement(s) - and in reading the comments associated with the Techcrunch post others are interested as well - but perhaps the best thing to consider is that the weight of Yahoo will press faster adoption of some of these standards. In particular, microformats if adopted by publishers could/would change the rules of content syndication and lead to far wider distribution of publisher content. This in turn would lead to higher pass through traffic generating product or advertising sales for publishers.
    More: http://personanondata.blogspot.....c-web.html

  76. Sasha T.

    I really want to see their search results with all “new format” results. I think that it won’t be as good as it looks at first glance. Maybe it is just me…

    Sasha

  77. Tom

    >Expect The Internet To Organize Itself In A Hurry

    BWAAAAHAHAHAHAAAAHAAAAAAAA! AHAHHHAHAHAHHHAHAAAHAHHHAHA!

    - wipes tears -

    AHAAAAHAAAAAHAHAHAHAHAHAHAHHAHAHAAAAAHAAAAA …

  78. chris

    Maybe this will break my co-dependency with google. Maybe not…

  79. Amy

    please excuse my ignorance - but as others have noted, how will this be different to the meta tag (and subsequent death of) ?
    http://searchenginewatch.com/s.....ge=2165061