Five Ways to Mix, Rip, and Mash Your Data
by Nick Gonzalez on March 2, 2007

Call them pipes, teqlos, dapps, modules, mashups, or whatever else; the fact is that we have recently seen a good number of new services that let developers and users build mini-apps and mashups that mix and remix data. Here we run through five applications that allow you to mix, rip, and mash your data, looking at data input, output, REST support, suggested use, and required skill level:

[Image: feature comparison table, mashfeatcomp.png]

Yahoo Pipes

Yahoo! Pipes is a GUI web app that lets you create new data feeds by remixing syndication feeds (RSS, Atom, RDF). Pipes takes in feeds from around the web, letting you sort, join, and analyze the feed items before outputting them in RSS or JSON. It also has a good query builder module that lets you grab feeds based on URL parameters. Yahoo! has also created a community around the service, letting users publish and remix other people’s pipes. The resulting data from the pipes can even be used for other mashups, as Teqlo has done.

Ideal for: Pipes is best suited for mashups between well-formed feed data and Yahoo! services such as Search, Local, Flickr, or even Google Base, since the modules are already included. The programming experience required is limited to an understanding of procedural control structures (loops, logical tests), and the visual interface helps with the rest.

Examples: Apartments near something (Craigslist and Yahoo! Local). eBay Price watch (eBay RSS API).
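To make the "sort, join, and analyze" idea concrete, here is a rough sketch of the same kind of remix written by hand. It is plain Python, not anything Pipes itself runs; the two inline feeds, the function names, and the keyword are invented for illustration.

```python
import json
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two tiny inline RSS documents stand in for real feeds (hypothetical data).
FEED_A = """<rss version="2.0"><channel>
<item><title>Studio apartment downtown</title><pubDate>Thu, 01 Mar 2007 10:00:00 GMT</pubDate></item>
<item><title>Used bicycle</title><pubDate>Wed, 28 Feb 2007 09:00:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel>
<item><title>Two-bedroom apartment near park</title><pubDate>Fri, 02 Mar 2007 08:30:00 GMT</pubDate></item>
</channel></rss>"""


def read_items(rss_text):
    """Return a list of {title, date} dicts, one per <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "date": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return items


def remix(feeds, keyword):
    """Join several feeds, keep items whose title contains keyword, newest first."""
    merged = [item for feed in feeds for item in read_items(feed)]
    kept = [i for i in merged if keyword.lower() in i["title"].lower()]
    kept.sort(key=lambda i: i["date"], reverse=True)
    return kept


def to_json(items):
    """Serialize items as JSON, with dates written as ISO 8601 strings."""
    return json.dumps(
        [{"title": i["title"], "date": i["date"].isoformat()} for i in items],
        indent=2,
    )


if __name__ == "__main__":
    print(to_json(remix([FEED_A, FEED_B], "apartment")))
```

Each function corresponds loosely to one box you would drag onto the Pipes canvas: a fetch step, a filter, a sort, and an output renderer.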

Teqlo

Teqlo is a new widget-based mashup application. You build mashups by dropping specialized widgets onto the canvas and specifying interactions between them. For instance, you can map the results of an eBay search by dropping an eBay search widget and Google map widget on the canvas. Then you connect the two widgets by specifying an interaction such as: when an item is selected in the eBay widget, add a marker on the Google map. The application is then accessed through a web page with the active AJAX widgets. Other widgets include Google Calendar, Gadgets, Spreadsheets, LinkedIn search, DabbleDB search, YouTube viewer, contact lists, and to-do lists.

The service is currently in beta, so they have a limited number of modules and have not turned on publishing to the web yet.

Ideal for: Teqlo is a high-level masher best suited for non-programmers. Users create interactions between widgets by specifying an action in one widget that causes a reaction in another. However, Teqlo’s high-level approach means most of its power lies in its developers’ ability to craft useful widgets and interactions.
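The action-in-one-widget, reaction-in-another model is essentially event wiring. The sketch below shows that pattern in plain Python; it is not Teqlo's implementation, and the `Widget` and `MapWidget` classes, the event name, and the sample item are all hypothetical.

```python
from collections import defaultdict


class Widget:
    """A minimal widget that can announce events and let others react to them."""

    def __init__(self, name):
        self.name = name
        self._handlers = defaultdict(list)

    def on(self, event, handler):
        """Register handler to run whenever this widget emits event."""
        self._handlers[event].append(handler)

    def emit(self, event, payload):
        """Announce event, passing payload to every registered handler."""
        for handler in self._handlers[event]:
            handler(payload)


class MapWidget(Widget):
    """Stand-in for a map widget; it only records the markers it is given."""

    def __init__(self, name):
        super().__init__(name)
        self.markers = []

    def add_marker(self, item):
        self.markers.append((item["title"], item["location"]))


# Wire the interaction: selecting an item in the search widget adds a map marker.
search = Widget("auction-search")
map_widget = MapWidget("map")
search.on("item_selected", map_widget.add_marker)

# Simulate the user clicking a search result (made-up data).
search.emit("item_selected", {"title": "Vintage camera", "location": "Palo Alto, CA"})
print(map_widget.markers)
```

The point of the pattern is that neither widget knows about the other; the mashup author supplies only the single `on(...)` line that connects them.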

Examples: Examples are not public yet, but an example Teqlo is covered in their blog.

Proto

Proto is a Windows-based mashup application meant to join your desktop apps with the web. You need the Windows application to both create and use the mashups. It’s component based, joining your desktop and web apps by pulling data from your desktop applications, such as Outlook, and feeding it into online web components, such as Yahoo! maps. Proto has the Visual Basic for Applications development environment (VBA IDE) and Adobe Flash baked in, so you can create your own modules to pull and display data from your applications. Proto also has a light database it uses to broker and manipulate data between the application and online component.

Ideal for: Proto takes some familiarity with database concepts and hopefully VBA experience so that you can program your own modules. Their 5 minute intro is indicative of the experience level you need to really use the program. Since Proto allows you to share your mashups, non-programmers can also use Proto for their library of pre-existing mashups.

Examples: The intro video provides a good example of the program, but downloading the viewer is needed to view modules like the restaurant viewer or more enterprise minded Salesforce reporter.

Dapper

Dapper is a web-based application for generating XML from website content. You create “Dapps” (web services) by using Dapper’s virtual browser to grab content from web pages. Dapper is trained by feeding it several example URLs that hold examples of the content you’re interested in. Dapper looks at the similarities between the pages to take a guess at the important content to pull from the page. After Dapper has analyzed the page, you can narrow down the fields on the page you want to track, for instance the titles of stories on Digg. Dapper can then output the content you select from the page in various formats (XML, JSON, HTML, and YAML) and incorporate that data to trigger alerts or even map locations found in the feed. Each Dapper application, or “Dapp”, is published to the community for anyone to use.
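For a feel of what the final extraction step amounts to, here is a small sketch that pulls named fields out of a page and emits them as JSON. It does not attempt Dapper's learning from example pages: the fields are simply assumed to be marked with known class names, and the page fragment, class names, and `FieldExtractor` helper are all made up for the example.

```python
import json
from html.parser import HTMLParser

# A hand-written page fragment standing in for a real site (hypothetical markup).
PAGE = """
<div class="story"><h3 class="title">Mashups go mainstream</h3><span class="votes">120</span></div>
<div class="story"><h3 class="title">Feeds for every page</h3><span class="votes">87</span></div>
"""


class FieldExtractor(HTMLParser):
    """Collect the text of every element whose class attribute names a wanted field."""

    def __init__(self, fields):
        super().__init__()
        self.fields = set(fields)
        self.current = None  # field whose text we are currently inside, if any
        self.values = {f: [] for f in fields}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in self.fields:
            self.current = css_class

    def handle_data(self, data):
        if self.current and data.strip():
            self.values[self.current].append(data.strip())

    def handle_endtag(self, tag):
        # Any closing tag ends the current field; enough for flat markup like PAGE.
        self.current = None


def extract(page, fields):
    """Return {field: [text, ...]} for the requested fields found in page."""
    parser = FieldExtractor(fields)
    parser.feed(page)
    return parser.values


if __name__ == "__main__":
    print(json.dumps(extract(PAGE, ["title", "votes"]), indent=2))
```

The hard part Dapper takes on, working out which parts of a page are the fields by comparing several example pages, is exactly what this sketch skips by being told the class names up front.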

Ideal for: Dapper takes minimal programming experience and is useful for making well structured feeds for pages that don’t have them already. Their demo movie is a good place to start.

Examples: Fidget is a tool that lets you find videos of your favorite bands based on searches carried out by Dapper.

OpenKapow

OpenKapow is the industrial-strength version of Dapper. It’s a desktop app that programs RSS feeds, REST apps, and web clips through a browser interface. You can use OpenKapow to make a web robot that pulls from a web page like Dapper does, but you can also direct that bot to navigate web pages (including form submission), carry out loops and branches, recover from errors, and accept user input at any point in the process. OpenKapow has a community where developers can share their robots to be used and remixed by other users.
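The difference from a one-page scraper is the control flow: a robot loops over pages, branches on what it finds, and retries when a load fails. The sketch below shows only that control flow in plain Python against a fake in-memory site; it does not use OpenKapow, and `SITE`, `fetch`, and `run_robot` are hypothetical names invented here.

```python
# A fake three-page site keyed by URL (made-up data); each page lists some
# results and optionally links to the next page.
SITE = {
    "/search?page=1": {"results": ["post A", "post B"], "next": "/search?page=2"},
    "/search?page=2": {"results": ["post C"], "next": "/search?page=3"},
    "/search?page=3": {"results": ["post D"], "next": None},
}

_failed_once = set()


def fetch(url):
    """Pretend to load a page; page 2 fails on its first attempt to exercise the retry path."""
    if url == "/search?page=2" and url not in _failed_once:
        _failed_once.add(url)
        raise ConnectionError("temporary failure")
    return SITE[url]


def run_robot(start_url, keyword, max_retries=2):
    """Follow 'next' links from start_url, collecting results that contain keyword."""
    found, url = [], start_url
    while url is not None:                      # loop: one iteration per page
        for attempt in range(max_retries + 1):  # error recovery: retry a failed load
            try:
                page = fetch(url)
                break
            except ConnectionError:
                if attempt == max_retries:
                    raise
        for result in page["results"]:
            if keyword in result:               # branch: keep only matching results
                found.append(result)
        url = page["next"]                      # navigate to the next page
    return found


if __name__ == "__main__":
    print(run_robot("/search?page=1", "post"))
```

A real robot would replace `fetch` with page loads and form submissions, but the loop, branch, and retry skeleton stays the same.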

Ideal for: OpenKapow is ideal for serious web scraping. It takes basic knowledge of procedural programming and web markup to use.

Examples: Here is a robot that logs into Gmail and outputs your email in XML. Here’s another one that searches deep into TechCrunch’s posts for a keyword.

Comments

nice writeup Nick! - I wonder why any of these services will be consumer-friendly.

 

I am curious to see where this stuff leads to. Surely at the moment this concept is very much in play mode, meant primarily for geek users.

-Zaid

 

agreed! very well-written and informative post… nice work Nick!

that OpenKapow is very impressive… i’m going to have to use it in some of my apps… being able to submit forms and analyze data will come in very handy and is above my current elementary capabilities as a programmer… can’t believe i wasn’t aware of it sooner… could have used some of its functionalities on a project i just spent way too long working on… oh well… at least I know for the next one…

 

Damn that’s swell timing…

How’d you know I was just looking for this mash up summary?

See ya’ - - I’m gonna mess with this stuff and see if my stupid little idea amounts to anything.

 

These are great! Thanks for covering them. Teqlo and Proto look like pretty cool apps from the examples you provided.

 

Great breakdown Nick!

 

Shouldn’t they have called it Tubes rather than Pipes? I heard the internet was a series of tubes.

 

Hi Nick,

Thanks for including Teqlo in the write-up.

My name is Rod Boothby, and I work at Teqlo. Just to let you know, we will be adding a series of features to let people add more widgets into the system. There will be facilities for both highly capable programmers and more typical end users.

You might also be interested in checking out our new RSS widgets. Right now, you can just build a simple RSS feed reader. However, we are going to add a series of fun follow on widgets that will turn the basic RSS reader into something really powerful.

Thanks again,

Rod

 

Nick:

This is a good piece of work. I am the lead investor in Teqlo. According to your story lead, I suppose I’d have to be with a last name like “Rip”. ;-)

One interesting side note is that these are all “technologies” or perhaps a couple are tools businesses. The ultimate commercial uses of these initiatives will probably be clearer in a few quarters. As such, several are actually complementary technologies. For example, at Teqlo we currently use OpenKapow with one of our demos and probably could use Pipes and Dapper, too. These are all just bricks in the road toward making the web a platform.

Thanks again for the notice.

Peter Rip

 

Nick,
Thanks for the post. Considering that Peter and Rod already chimed in, I guess I should add my .02 cents. :)

The really exciting thing about these 5 companies is that we are all able to layer on to each other, as opposed to it being an either/or. We are already using OpenKapow and we have also developed a Pipes widget that we want to package up for our users. I have talked with Proto and see a really valuable scenario where they componentize desktop apps that then mashup with our web service components.

As a collection of companies we are paving a path, to extend Peter’s metaphor, where end users can program the web to fit their needs.

 

Hi Nick,

This is a great summary, helpful and puts each of these web remixers into its own category. I have not played with OpenKapow yet, will do it now.

Alex

 

These are all cool new services. At Feed Digest we have been doing a subset of this (although certainly not to the breadth of, say, Pipes) for almost two years now, with 25,000 users (many paying, and most non tech savvy). So, to those who ask if it’s possible to be user friendly and still offer powerful mixing features.. totally! Don’t stop trying :)

However, we certainly don’t mind not being in these listings (and my request to not be profiled still stands) as we’re not looking for more users at this time due to big changes in direction and target market brewing right now. But I’ll be in touch when the really exciting stuff kicks off in a couple of months’ time!

As an aside, I would also like to recommend the fine FeedRinse (http://feedrinse.com/) by Electric Pulp. They only do a subset of what we do, but they do it extremely well, and are a great service to use if you want to mix feeds together. I notice they recently went all free too.. so nothing to pay (unlike with us!)

 

In addition to XML, Pipes also produces JSON output, using _render=json, and an optional callback wrapper, _callback=foo. What this means is that any RSS feed can be quickly converted to JSON and rendered directly into the browser window without requiring a proxy. Suddenly, everything is mashable.

Here’s an example for you: TechCrunch, rendered by Pipes and some Javascript into a portable little badge.
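A rough sketch of what those two parameters look like in practice, in Python. Only `_render=json` and `_callback=foo` come from the comment above; the base URL, the `_id` parameter, the pipe id, and the sample response body are assumptions made for illustration. In a browser the script tag does the unwrapping for you; the `unwrap_callback` helper just shows the shape of the wrapped response.

```python
import json
from urllib.parse import urlencode

# Assumed values: the base URL and the "_id" parameter name are not given in
# the text above, and the pipe id is a placeholder.
BASE_URL = "http://pipes.yahoo.com/pipes/pipe.run"
PIPE_ID = "EXAMPLE_PIPE_ID"


def pipe_url(pipe_id, callback=None):
    """Build a request URL asking for JSON, optionally wrapped in a callback."""
    params = {"_id": pipe_id, "_render": "json"}
    if callback:
        params["_callback"] = callback
    return BASE_URL + "?" + urlencode(params)


def unwrap_callback(body, callback):
    """Strip a 'callback( ... )' wrapper and parse the JSON inside it."""
    text = body.strip().rstrip(";")
    prefix = callback + "("
    if not (text.startswith(prefix) and text.endswith(")")):
        raise ValueError("response is not wrapped in the expected callback")
    return json.loads(text[len(prefix):-1])


if __name__ == "__main__":
    print(pipe_url(PIPE_ID, callback="foo"))
    # A made-up response body in the wrapped form.
    sample = 'foo({"count": 1, "items": [{"title": "Hello"}]});'
    print(unwrap_callback(sample, "foo"))
```

Because the response is a function call rather than bare data, a page on any domain can load it with a script tag, which is why no proxy is needed.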

 

Rod, probably getting too specific but I would love to see something that takes a rss feed (from sites like designspone or ebay) and turns it into a grid view of images from each rss entry. Much easier than scrolling through an image heavy web page! Google reader has grid view for a few of their products not their google reader. Slide has a nice tool but not exactly what I am looking for.

 

Thanks for a great article. I was playing around with Pipes and Dapper the other day and wondered how some of these apps could be used together. And — voila — here’s the lowdown on Techcrunch.

 

Ventureblogalist: I imagine other people do it, but FeedDigest will do what you’re saying with feeds that have images associated with each post separately (or as enclosures). This works with Flickr and many Yahoo feeds, for example. A very quick example would be this digest of a Flickr search for ‘yellow’ - http://app.feeddigest.com/digest3/EEBX8FJXXE.html .. But since you can recode the template however you want, it can look however you want and process whatever feeds you want too.

 

Hey Nick, thanks for the roundup. It really is good times for this market segment, and I think all of us are paving the way to a new, more empowering kind of Web. We see it every day, as our users go and use the thousands of available dapps to realize ideas that are just amazing in their creativity. Recently, we were happy to co-sponsor a contest with Proto for building apps with Proto/Dapper, which resulted in some very cool apps.

In addition, big businesses take notice and start understanding the potential of having any web source as a database to be leveraged. From the side of content providers, these developments are very disruptive as they have the potential to significantly alter their business model, and at the same time provide them with a reach to new audiences that they’ve never tapped before.

BTW, those of you who use RSS and Firefox would likely be very interested in an upcoming release we’ll be shipping in a few days..

 

Great introduction to these tools! I have been reading up on them but haven’t come around to actually using them. Maybe now is the time to start.

 

Orchestr8 AlchemyPoint is another mashup development platform currently in development, probably most similar to OpenKapow. Users can mash up web content with RSS/ATOM feeds and enterprise data sources (SOAP, etc.) There’s a neat video here that shows users mashing up Internet content with AlchemyPoint. Neat stuff.

 

Nick:

I was actually disappointed with this article, especially with the lack of in-depth information. This was a very basic summary comparison and to be honest I got more information from the comments. In fact, I read TechCrunch.com more for its comments recently than its articles.

I would also appreciate a basic non disclosure statement from the author and one from techcrunch indicating any conflicts of interest, given TechCrunch.com’s reputation.

Back to the article, I know this is a startup blog with more than one author, but you forgot to include 10 other startups that are in beta working on feed scraping or data mashing.

You could have also posted a brief summary of feed scraping or data mashing history on how the concept of data mashing has existed for years already with the enterprises. The enterprises would create their own custom data mashups, specifically financial related companies.

Granted most of these new feed scraping, data mashing, etc websites are consumer oriented, but you could have gone the distance for more information.

For example, openkapow.com has been in this business for years now and they have really simplified the process of data mashing for enterprises or larger companies. The RoboSuite by Kapow technologies is six figures, while openkapow.com is free for everyone to see your work and with limited controls. Rumor has it that Kapow Technologies is working on additional consumer oriented data mashing products for the general consumer.

You could have also mentioned some of the obstacles in this industry like data copyright violations, hence the scraping.

I am still looking for a really good general consumer oriented data mashing/feed scraping product, so if anyone has come across one that is not going to cost me a kidney, please post ……

Nick keep up the good work and you are really the only author I read on TechCrunch.com.

 

ditto everyone else — this type of analysis (rather than simple reporting or regurgitation) is what the blogosphere needs more of (lo and behold! a blog post with a summary table!). good work, Nick.

 

Nick -
Nicely done. It is great to see so much interest in “re-mixing” data and services.

I’m the CEO at Proto, and we think in the past year or so that the pieces of the mashup puzzle have fallen into place to put fast, practical, commercial mashup building squarely within the reach of power users. The availability of data and services as well as tools to create new services and feeds are outstanding.

Thanks for the post,
Byron

 

As everyone else has said, great story…

For several months now I’ve been playing around with my own set of tools for manipulating RSS & Atom data and there’s certainly a lot that can be done there. The question that’s been burning in the back of my mind though is “just how legal is it to re-mix data from a source?” Most sources have pretty stiff Terms Of Use statements that it seems would be violated by a lot of these re-mixers.

The Yahoo! Pipes project really surprised me when I saw it because it seems like they directly violate their own TOU statement. But I guess if it’s your data you can do whatever you want with it.

In future posts, I’d love to see more investigation into the legality of all of this.

-Steve

 

Note that the first link to Proto should be to http://www.protosw.com/ not http://www.proto.com/

Yes, we do actually have content on the front page of our website.

 

Thanks for the nice write up on openkapow. You will see further improvements coming to openkapow going forward. Enjoy.

 

Nick.

Thanks for putting together a very useful summary. This kind of summary does a great service in communicating the value proposition of these emerging applications.

Though it’s at an early stage, Ajuby (http://ajuby.openapp.org/ ) is positioned as an open source mash-up builder. It’s not as sexy as some of the examples you mentioned, but some of our current users are using it in an enterprise IT integration context (old-fashioned mashups :) ).

Thanks again for posting this summary. I am sure this list will grow with more interesting ideas.

Brij

 

Hi Nick,

Since Kent Brewster mentioned converting RSS feeds into JSON and rendering directly into a browser, how about an article on JSON? It seems like the future perfect language for these apps. I’m studying Chinese in Hong Kong right now and have been bowled over by the number of Chinese programmers who’re learning JSON.

Douglas Crockford, the off-the-charts genius who wrote JSON, is a Chinese chess enthusiast. He has a very funny site with dual text explainers in English and Chinese.

http://www.crockford.com/

 

Who supports microformats on a HTML page?

 

re: feedDigest

I tried an ebay feed (sent exact rss to your customer support) and I got an error message.

 

Great! I’d been looking for a way of mixing RSS feeds for ages now. I used Yahoo Pipes over the weekend, it looks rather powerful. I also like Ning.com because that lets you pipe together feeds and produce a kind of MySpace page so to speak.

 

You guys should know this new product Nenest (http://www.nenest.com), which makes it easy to host your data online, then your visitors are able to get RSS feeds, Digg, del.icio.us, Sphere, etc. You do not need do programming and database design. Also you can import an existing database on the fly.

 

Isn’t ponyfish.com missing here? It’s a great tool to create and filter RSS feeds.

 

Nobody talks about ‘macro.scopia’, http://macro.scopia.es? I think it’s a great web mash-up and it doesn’t appear here! You can create tree-maps or gmaps from gcalendars, feeds, or gspreadsheets, features I don’t see in the apps you mentioned…

Examples:
http://macro.scopia.es/html/cpanel/exec.html?230
And the way it was made
http://macro.scopia.es/html/cp.....w.html?230

 

The site is really slow right now, especially with the player constantly trying to load itself. http://www.allvideotools.com

 

John Doe wrote: “I am still looking for a really good general consumer oriented data mashing/feed scraping product, so if anyone has come across one that is not going to cost me a kidney, please post ……” => Have a look at http://www.iopus.com/imacros/ and http://forum.iopus.com/

iMacros Scripting Edition offers full-blown visual website extraction features. What makes iMacros different is the Scripting Interface. It allows integration with any Windows scripting or programming language. This makes it very powerful. Also, it works with more AJAX websites than any of the tools in this review. It even works with Flash or Java applets. IMHO iMacros can easily compete with Dapper and Kapow for a tiny fraction of the price. And their support is very responsive!

 

#7 Josh, There is already an application called Tubes and the name is trademarked. So obviously they were beaten to the punch. http://www.tubesnow.com

 
