All The Cool Kids Are Deep Tagging
by Michael Arrington on October 1, 2006

The popularity of rich media publishing (such as podcasting and videocasting, the YouTube phenomenon, etc.) is a problem for search engines and people trying to use search engines to find this content. The problem is that the traditional ways search engines index and rank content don’t apply to rich media because, well, it’s not easily indexable.

A few startups are focusing on creating transcriptions of podcasts and video content (see Pluggd and Podzinger, for example), which search engines can then index.

And many people are tagging audio, video and photo content. YouTube, Flickr and others allow this (and see Google’s efforts to tag photos using humans). Tags help describe the content and are usable by search engines as well as humans. But the highest-level tags, when they are present, don’t capture all of the content, so a lot is missed.

Figuring out how to search the metadata around rich content (tags and lots of other descriptive data) is big business. Truveo, a video search startup that launched in 2005 and was subsequently acquired by AOL for at least $50 million, helped solve this problem (but still falls woefully short of perfect). A new unlaunched startup, CastTV, takes rich media searching another few steps forward (much more on them in a later post). But even these new search companies can’t find all of the content in a video or audio file, and certainly can’t take you right to where that content is presented.

That’s why I like the idea of deep tagging. It requires human labor but for many publishers it’s worth it. Instead of simply being associated with a file, a deep tag is associated with a clip from the file. Click on the tag and jump right to that part of the clip.
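
To make the idea concrete, here is a minimal Python sketch of a deep tag as a data structure. The DeepTag class, its field names, the find_clips helper and the sample tags are all invented for illustration; nothing here describes how any of the companies mentioned in this post actually store their tags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeepTag:
    """A tag attached to a clip (a time range) rather than to the whole file."""
    label: str
    start_seconds: float
    end_seconds: float


def find_clips(tags, label):
    """Return (start, end) offsets of every clip carrying the given label."""
    wanted = label.lower()
    return [(t.start_seconds, t.end_seconds) for t in tags if t.label.lower() == wanted]


# Hypothetical deep tags for a single podcast episode.
episode_tags = [
    DeepTag("intro", 0.0, 45.0),
    DeepTag("video search", 45.0, 610.0),
    DeepTag("deep tagging", 610.0, 1200.0),
]

# "Clicking" a tag amounts to seeking the player to the clip's start offset.
print(find_clips(episode_tags, "Deep Tagging"))
```

A file-level tag would only need the label; the start and end offsets are what let a player jump straight to the right part of the file.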

We’ve covered a few companies that are facilitating deep tagging, such as MotionBox, JumpCut (acquired by Yahoo last week), Viddler and Click.tv. Also, Google recently added a captioning feature to video, as well as the ability to permanently link to any time spot in a clip.

Veotag is doing this as well (we haven’t covered them yet but a few commenters have pointed them out in the past). Today I received an email from Howard Seibel, Veotag’s VP Marketing. He pointed me to this page which is a better version of a TalkCrunch podcast I recorded last week with Om Malik and Robert Scoble. He’s added deep tagging, so listeners can jump right to certain parts of the show.

I like the fact that I can embed the Veotag player right into the TalkCrunch website, and people who listen to the podcast on the site can utilize the deep tags (right now we have a simple Flash player). I’m having our trusty analyst Nick Gonzalez look into integrating Veotag into TalkCrunch sometime soon. If you know of other startups addressing deep tagging, please let us know.

Comments

Take a look at the DiviCast platform, you can even add images, links and chapters to the podcast; and then embed the player right into your website.
http://www.divicast.com/channel/engadget (Demo)
http://www.divicast.com/help/ (Video Tutorial)

 

Very interesting and useful. Key question, though: can someone explain how the sections are tagged?
Thanks

 

I think the point of this post was that the sections are tagged with text, which in Arrington’s opinion is done well by giving humans tools to do so themselves. However, it can also be automated with varying levels of success.

 

Kris, yes, I agree. I think humans can do this with the right tools, and publishers often have the incentive to do so.

 

Mike,

This is a demo of our deep tagging player for “human” video segmentation:
http://tags.lulop.com/player.php/7005
You can create scenes on this video or any other video on http://tags.lulop.com by using the commands on the player itself.

The player sits on top of our video platform, http://lulop.com, which supports branded channels, multiformat video encoding and advanced asset management options. On top of that we also do professional distribution to TV broadcasters, which is why we are not open to the public.

An open source version of our platform is released at http://lulop2.sourceforge.net. There is no human segmentation player there, but there is an automatic segmentation module, made available here: http://makeclean.iobloggo.com/archive.php?eid=272

Nick can contact me anytime if he wants to try it out

cheers

Lorenzo

 

In the Netherlands, where I’m from, (most of) the public (non-commercial) TV broadcasters provide subtitles for people with hearing impairments. These subtitles are timecoded. If you combine the text files from these subtitles, you can search within video footage and find every moment in a show where people are talking about, say, “fish”.
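
The approach can be sketched in a few lines of Python. This is only an illustration: the cue format (a start offset in seconds paired with the spoken text), the sample lines and the search_subtitles function are assumptions, not the code running behind any real site.

```python
def search_subtitles(cues, query):
    """Return (start_seconds, text) for every cue whose text contains the query."""
    needle = query.lower()
    return [(start, text) for start, text in cues if needle in text.lower()]


# Hypothetical time-coded subtitle cues: (start offset in seconds, spoken text).
cues = [
    (12.0, "Welcome back to the show."),
    (95.5, "We caught some fish this morning."),
    (310.0, "Next week we travel to Stockholm."),
    (742.0, "The fish market was closed."),
]

# Print each hit as a mm:ss offset that a player could seek to.
for start, text in search_subtitles(cues, "fish"):
    minutes, seconds = divmod(int(start), 60)
    print(f"{minutes:02d}:{seconds:02d}  {text}")
```

Plain substring matching is used here for brevity; it will also match inside longer words, which is one reason a method like this is quick but not watertight.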

You can check it out (it works quite well) at this URL:
http://boerzoektvrouw.kro.nl/v.....006-5.aspx

(Use the link that says “Bekijk hier uitzending 5” (“Watch broadcast 5 here”).
A popup will open showing the programme, with a search box on the right. Press “zoek” (“search”) to show the results.)

Of course you have to know Dutch to get the most out of this feature, but try “Stockholm”, for instance.

Disclosure: I work for the company that coupled the subtitles and timecodes for the search, but hey… I thought it was cool enough to share with you. The method isn’t watertight, but it’s a smart, quick & dirty way to search within video footage.

Laurens

 

Digg has proven that giving users the power to drive a news site won’t be a failure. Kevin Rose started a revolution of user-driven sites when he launched Digg in 2004. Since then, many other sites have spun off, such as http://www.techtagg.com and http://www.netscape.com. Tags are growing and people trust what others recommend. It falls back to word-of-mouth marketing. It works.

 

Could this lead to a possible entry-level job at these content-aggregating companies: Tagger Level I? Or will the community continue to donate their time to tagging?

 

I think human content management jobs like “tagger level 1” will definitely be part of the new economy. The fact that there are “gold farms” in China that sell player character traits, and that Wired magazine reports them to be a huge industry employing thousands of workers all over the globe, is a good indication that human tasks will always be a major part of manipulating, aggregating, sorting and presenting content… no matter how many automated systems we develop.

 

There is a company out of North Carolina (more like a group of friends) that has developed a set of system-level extensions for your desktop apps (web and productivity) and some web apps that manage the social side of it. The system uses an iPod and your mobile. It’s rad (shameless plug). :p

 

Hello!! Has anyone heard of a company called “Autonomy”? They have been in the enterprise search market for many years now. They specialise in search (duh!) and also have a product called “Virage” that has the capabilities this article is all about: http://www.virage.com/home/index.html
They turn live video streams into text files and index those (including time frames, etc.) so that you can get that one piece of information from a broadcast.

 

We decided to leverage the social participation of the listening/viewing audience with Innertoob.com because it seemed much more interesting to allow USERS to create conversations around their favorite shows. Users create “Time-Posts” along the timeline of audio/video files on the web, and in good Web 2.0 fashion, share their comments via RSS, links and even embeddable players for blogs and MySpace.

As a podcaster, I’m already exhausted at 3am when I get my podcast uploaded and published. I don’t want to spend another three hours creating enhancements. I’d rather let my audience discuss the podcast, add links, create show notes, ask questions, etc. I like the idea of interactive podcasting.

 

Hi Mike,

We have been addressing exactly that problem with a product that is in fact called Deeptag. I have actually just come back from dinner with Sam Sethi where we’ve been talking about a possible collaboration.

Search as it’s currently done is a twofold problem: 1) determine which content relates to your search (keyword association), and 2) order that content (relevance ranking).

I think that Deep Tagging is going to become one of the most powerful cues into the first of those problems (and in essence it already has been, albeit through a-links rather than tags), but it won’t succeed until it’s brought inside the user’s workflow.
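
As a rough illustration of the two steps described above, here is a small Python sketch that uses deep tags as the keywords for the first step. The index contents, the clip identifiers and the scoring rule (count of matching terms) are assumptions chosen for brevity; they are not how Deeptag or any search engine actually works.

```python
def search(index, query):
    """Two-step search: (1) keep items matching any query term, (2) rank them."""
    terms = set(query.lower().split())
    # Step 1: keyword association - which items relate to the query at all?
    matches = []
    for item, keywords in index.items():
        overlap = terms & keywords
        if overlap:
            matches.append((item, len(overlap)))
    # Step 2: relevance ranking - most matching terms first, then by name.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in matches]


# Hypothetical index mapping clips to their (lowercase) deep tags.
index = {
    "episode1@00:45": {"video", "search"},
    "episode1@10:10": {"deep", "tagging", "video"},
    "episode2@03:20": {"podcast", "advertising"},
}

print(search(index, "deep video tagging"))
```

Real relevance ranking uses far richer signals than a term count; the point of the sketch is only that finer-grained tags give step 1 more to match against.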

As Joshua Schachter says, anything that relies on people doing things for the good of the community will never succeed. Software has to work for the individual, and only once it’s done that can it hope to generate a higher plane of value within the community itself.

 

ITP Research has had a video commenting WordPress plugin for a while. It works with QuickTime. The Video Comments plugin site is:

http://itp.nyu.edu/research/?page_id=34

 

Nice Article, Mike. Thanks for mentioning Pluggd.

Pluggd’s HearHere automagically does deep tagging on audio and video by combining speech recognition and semantic analysis. We plan to bubble up the most relevant keyword concepts as tags so users can quickly access segments of audio/video without having to search (although they can if they want to).

I do believe publishers have an incentive, but it is simply too much work for them to keep up with. Besides, the way they tag things might not be the way other people think about things. There will likely be some combination where some publishers do a little work, many don’t, and there is some automation that cuts across both scenarios.

HearHere is only a tech preview right now, but you can play with it yourself on the Pluggd web site. There is also a cool screencast.

http://www.pluggd.com/demo

We’d love some feedback. Thanks.

 

IMHO “Deep tagging” is simply the latest term for an effort that has been underway for at least 25 years by knowledge engineers, linguists and information scientists to find useful ways of indexing, and discovering the contents of “opaque” information… such as video. A plethora of technologies have been developed to partially address the “discoverability” problem: speech-to-text; natural language processing, facial and scene recognition, motion and fractal recognition, semantic entity extraction… Not surprising is the fact that in this day of heightened security concerns from terrorism, it turns out that many of the leading technologies to do this were initially funded by folks like DARPA and developed for the Intelligence Community. Nothing here is particularly “new.”

What is new is that these technologies are beginning to make it out of Langley (CIA) and Ft. Meade (NSA) and be combined in new and practical ways in web applications. Many of these technologies are beginning to make their way into commercial products, and whatever limitations they have are being mitigated by supplementing them with human intervention.

For example, at my company, Voxant, we are using state-of-the-art speech recognition to create transcripts. We supplement these transcripts with advanced entity recognition software to identify people, places, and important noun phrases. We add to this facial recognition to identify speakers, and we are working on fractal scene recognition to begin to be able to describe “what is happening” in a piece of video. These are reviewed by humans, corrected and fed back into the system to create a feedback loop that makes our engine smarter. The result is a searchable index that makes the otherwise “opaque” video increasingly discoverable and transparent.

Even more exciting is this: over the next few years these new technologies will go mainstream. When they are combined with “user tagging” by viewers, the “wisdom of crowds” adds an even more valuable layer of intelligence. Later this year we will start enlisting our customers to help us do this, making the media more discoverable, relevant and, in practical terms, “usable” through the collective wisdom of new technologies and perceptive viewers.

 

We just launched Viddler.com and believe it’s the right step towards social timed tagging.

Search to points in time instantly!
Hyperlink those moments in time.
Share those moments as well.

We believe Tagging and Commenting at moments is essential.

Check out some fun footage:

http://www.viddler.com/explore...../videos/2/

 
