One of the things that makes me very happy: receiving a confidential document in my email from a trusted source. It could be a merger agreement. Or an internal executive-only email memo. Or a PowerPoint pitch for a new product. Whatever it is, if the people who created it don’t want the world to see it, there’s a good chance it will make a great post.
For years lawyers and accountants have said that they specially mark each document sent to individuals, making it unique and trackable. If a document gets forwarded outside of the group, a quick analysis of the unique marking can tell you which individual forwarded it. In theory it’s a great system, but it is a manual process, and it is extremely time-consuming to coordinate and keep track of all the changes. Because unique documents are so hard to make and track, most of the time when someone said documents were unique they actually weren’t; people relied on the fear factor and hoped the docs would be kept confidential.
Bad news (for me): creating and tracking these documents just got a whole lot easier, and forwarding them more dangerous. New startup Orbious will make the entire process as easy as hitting a couple of keys via a downloaded client for word processing and email applications. It uses a thesaurus to create unique versions of the document for every recipient, and then keeps track of who got what. Here’s an example:
Consider the following sentence taken from a confidential circular that was distributed to staff during a very public and aggressive company takeover (a hypothetical but plausible situation):
Salaries will be paid as usual on the 15 of September.
Using DocTracker, the following alternative sentences would be generated to form part of the “signature” for distribution of this circular:
Salaries will remain paid as usual on the 15 of September.
Salaries will be paid like usual on the 15 of September.
Salaries will be paid as expected on the 15 of September.
Salaries will be paid as normal on the 15 of September.
With only the above information, mappings could be formed for up to 5 recipients and the circular is then able to be tracked indefinitely. The mapping process remains confidential (i.e. the recipient need not know that his/her document was modified).
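To make the mapping idea concrete, here is a minimal Python sketch. It is not Orbious's actual implementation, which isn't public; the substitution table, the recipient names, and the function names `make_variants`, `assign`, and `trace` are all assumptions made for illustration. It also combines the two substitution points from the example, which is an extension beyond what the quoted text describes.

```python
from itertools import product

ORIGINAL = "Salaries will be paid as usual on the 15 of September."

# Hypothetical substitution slots taken from the example sentences above.
# Each slot lists interchangeable wordings; the first entry is the original.
SLOTS = [
    ("will be paid", ["will be paid", "will remain paid"]),
    ("as usual", ["as usual", "like usual", "as expected", "as normal"]),
]


def make_variants(text, slots):
    """Return every wording produced by combining one choice per slot."""
    variants = []
    for choice in product(*(alternatives for _, alternatives in slots)):
        variant = text
        for (phrase, _), replacement in zip(slots, choice):
            variant = variant.replace(phrase, replacement, 1)
        variants.append(variant)
    return variants


def assign(recipients, variants):
    """Give each recipient a distinct variant; fail if there are too few."""
    if len(recipients) > len(variants):
        raise ValueError("not enough distinct variants for all recipients")
    return dict(zip(recipients, variants))


def trace(leaked_text, mapping):
    """Return the recipient whose variant appears verbatim in the leak."""
    for recipient, variant in mapping.items():
        if variant in leaked_text:
            return recipient
    return None


variants = make_variants(ORIGINAL, SLOTS)
mapping = assign(["alice", "bob", "carol", "dave", "erin"], variants)
```

The number of variants is the product of the alternatives at each slot (2 × 4 = 8 here), so covering more recipients means finding more places to substitute. Note that `trace` only matches a verbatim copy: a leak that paraphrases the sentence returns `None`.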
Orbious is in private beta; sign up on the home page to get access.

Michael,
If the objective is to forward the email notice on to others without having somebody know that it was you who forwarded it, why couldn’t the second sender simply use Orbious to change it again?
Hey, I’d be grateful if you would take a look at my living textbook: http://www.pass-ed.com/Living-Textbook.html
blah.
First problem: it depends on the copy/paste of large enough chunks of the document to include the modified words.
If the system is actually used widely enough to be a threat then careful recipients will be sure to relate the info “in their own words” - which appears to be what happens in most cases anyway.
Second problem: documents that are important are often approved by (or written by) lawyers. Adding or changing words would be potentially disastrous.
Justin makes a great point about the legal ramifications of modifying company documents on the fly. I immediately thought of that myself.
Beyond that, imagine emails sent out that start containing strange wordings - where Orbious moved stuff around - and it ends up sounding like a memo from our friendly offshore customer service agent. =^) Could this actually create new barriers to effective communication?
Very interesting tool. Sounds like they have some issues to address, but perhaps they already have. I will check it out.
Actually this makes things very simple. Now instead of begging numerous contacts for data you just have to hack Orbious and you have a TON of data from dozens and dozens of organizations. Sweet!
No way I would let one of my companies use this - uploading confidential information to a third party company to ensure the security of the information? No, I think not.
I hope nobody has invested a ton of money in this…
Rob
The workaround to getting tracked seemed so obvious that I checked the calendar to make sure it wasn’t April 1. I think the site will be of great use to sploggers, however.
I have a hard time seeing how having a number of “sound alike” versions of a document floating around is much of a solution. Anyone remember the comma that cost $1M (Canadian?) The contract was in both English and French, but the dispute hinged on the placement of a single comma in the English version. NYTimes: http://urltea.com/15b1. I think this would especially be a problem for legal documents.
Justin is right. You can paraphrase it yourself and then send…
Aydin.
You are kidding, right? In the case of documents, why not simply include a steganographic signature in one of the images in the document?
In the case of office documents that don’t have images, there is so much wasted space (thanks to the OLE compound file format) that you can put digital tracking signatures in a lot of the free space inside the binary format of the document.
This is a simple enough coding feat to accomplish and is far more traceable than tweaking a bunch of words around.
It will be orbious to anyone which document you forwarded. Orbiously, running a spell checker might hide your identity.
I don’t get it. It’s not Orbious to me!
Using the example listed, what would happen if the mail had to go to 20 people? Are there 20 different versions? And what happens when the intended result becomes the unintended? Will this AU company stand up in a US court?
These 2 example sentences could mean completely different things:
Salaries will be paid like usual on the 15 of September.
Salaries will be paid as expected on the 15 of September.
Usual might mean usual but “expected” could mean something different between two people.
I can’t imagine ANY lawyer wanting to use this tool.
Just use the same service when forwarding it on. This would really disrupt the tracking. There would be no way to know if certain words got replaced again or not.
Vijay,
You could just photoshop any image out of the document after scanning it. Paraphrasing is the ONLY way a system like this could work.
I don’t see any demand for anything like this. If you rely on a system like this to keep confidential documents confidential, you have a larger problem on your hands.
Justin is right.

Interesting comments. As a former corporate lawyer, I love it. This is the kind of thing law firms would pay a large licensing fee for, particularly if it became an industry standard.
What a dumb idea. And sometimes by changing words (even those with the same meaning), the gist could be lost.
Simply tattoo the document to a Bird of Paradise, the plumes and markings of which are unique to each individual bird. I’m going to call my app confibirdial.com and make ten hundred thousand dollarses.
Elegant and clever… But as the comments demonstrate, not clever enough. The only way to keep track of confidential documents is still Stasi-like access control, i.e. recording and monitoring all user activity on your network, including email and transfers to external drives such as USB flash drives.
With all the sarcastic comments, you would think TechCrunch was giving away more free Oomas. =^)
In defense of this badly harassed program, since Arrington was a former attorney, it is unlikely he would endorse it if he thought this solution was total crap.
My guess is that there are probably good responses for all the issues we have raised. But it is still fun watching everyone team up like this, even if it is to bash someone else. I feel very close to my TC buddies as we take turns hurling uneducated insults at this company’s startup.
I haven’t felt so much bonding since the 3D Max - Virtual Email program with the bikinis and sharks! =^)
Michael,
Sounds promising enough … at the very least, Jack Ryan would be, I am sure, a big fan of the model
http://en.wikipedia.org/wiki/Canary_trap
Now that’s a definite sign of good blogging, publicizing a service which could make it harder to get the information you need. I’m definitely impressed…
Besides the legal implications, you also have the risk of sounding like an idiot or at the very least being incredibly inconsistent.
Maybe it’s just me, but I wouldn’t trust an automated thesaurus to be able to craft messages coherently. Seriously, who takes so little stock in their writing skills that they’re willing to let a program hack it apart? And what happens when my recipients start to notice that over time the messages I send start to sound like they are coming from multiple authors? If anything, I would think that would compromise the sense of trust amongst the involved parties.
And what happens should those e-mails be subpoenaed?
And what about docs sent as PDFs?
For all those so concerned about the wording issue, I checked their website and apparently you have the option to authorize each wording change yourself. If you only make one wording change in the document that could be easy and effective.
Everyone here thinks they could avoid this issue, but you will probably not know they are using it until they bust you.
I am sure George Bush is one of their beta testers. I bet he is busily trying to figure out who in his cabinet is sabotaging his Presidency.
I wonder if Orbious can help him figure out that it is himself?
i think this idea is pretty decent, but i wouldn’t rely on a thesaurus to do the tweaking. why not Capitalize random letteRs throughout the dOcument? or change the whitespace or punctuation in a slight but non-meaning-changing way? double space between some words or something…or even make the lines wrap after a different column length.
there’s lots of ways to change the medium but maintain the message, thesaurus seems like one of the most complicated and easiest to spot.
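A minimal Python sketch of the whitespace variant of this suggestion: it hides a numeric recipient ID in the gaps between words, using a double space for a 1 bit and a single space for a 0 bit. The function names, the 8-bit ID width, and the assumption that the input is a single line with single spaces are all illustrative choices, not anything Orbious is known to do.

```python
import re


def embed_id(text, recipient_id, bits=8):
    """Encode recipient_id in the first `bits` word gaps (double space = 1)."""
    words = text.split()
    if len(words) - 1 < bits:
        raise ValueError("text has too few word gaps to hold the ID")
    if not 0 <= recipient_id < 2 ** bits:
        raise ValueError("recipient_id does not fit in the given bit width")
    pieces = [words[0]]
    for i, word in enumerate(words[1:]):
        bit = 0  # gaps beyond the first `bits` stay single-spaced
        if i < bits:
            bit = (recipient_id >> (bits - 1 - i)) & 1
        pieces.append(("  " if bit else " ") + word)
    return "".join(pieces)


def extract_id(marked, bits=8):
    """Recover the ID by reading the width of the first `bits` gaps."""
    value = 0
    for gap in re.findall(r" +", marked)[:bits]:
        value = (value << 1) | (1 if len(gap) == 2 else 0)
    return value
```

The meaning of the text is untouched, but so is the weakness: anything that normalizes whitespace (retyping, many email clients, HTML rendering) erases the mark.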
Hi techcrunch
As an avid techcrunch reader, just wanted to say thanks Michael for the blurb.
For the readers (with valid criticisms) you can see my rebuttal comments at
http://www.orbious.com/faq.html
cheers
dave
Modifications seem a little rough.
@gilltots, that’s basically what steganography is: hiding information in other information while keeping the perception of it the same. My point was that it’s easiest to do in images (if a couple of pixels are off by one shade of red, no one would be able to notice). It can also be done, as you suggested, by other seemingly innocuous modifications at the text level.
Orbiously, the thesaurus level stuff seems a little excessive to me.
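For the image case, here is a minimal Python sketch of least-significant-bit marking, using a plain list of `(r, g, b)` tuples in place of a real image file. The function names and the 16-bit ID width are assumptions for illustration; a real tool would read and write pixels through an imaging library.

```python
def embed_bits(pixels, recipient_id, bits=16):
    """Store recipient_id in the lowest red-channel bit of the first pixels."""
    if len(pixels) < bits:
        raise ValueError("image too small to hold the ID")
    if not 0 <= recipient_id < 2 ** bits:
        raise ValueError("recipient_id does not fit in the given bit width")
    marked = list(pixels)
    for i in range(bits):
        bit = (recipient_id >> (bits - 1 - i)) & 1
        r, g, b = marked[i]
        # Clear the lowest bit of red, then set it to the ID bit.
        marked[i] = ((r & ~1) | bit, g, b)
    return marked


def extract_bits(pixels, bits=16):
    """Read the ID back from the lowest red-channel bits."""
    value = 0
    for r, _, _ in pixels[:bits]:
        value = (value << 1) | (r & 1)
    return value
```

Each marked pixel changes by at most one shade of red. The mark only survives lossless copies, though: re-saving as a lossy format, taking a screenshot at a different scale, or printing will generally destroy it.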
This sort of technique has long been done with certain high-level documents, usually leaked by officials to journalists - it’s called “salting”. The typical way of dealing with it is to paraphrase the document and destroy the original.
Generally, if people are leaking, finding out who did it requires getting the original from the leakee, and they sure have an incentive not to help.
Some especially Machiavellian leakers have supposedly exchanged their salted (paper) document with a rival’s, and then leaked the rival’s document! (Or at least that’s a good story.)
Steganography doesn’t work when you print the document, but Orbious’s technique sounds like it would survive printing.
Microsoft has already solved this problem - http://www.microsoft.com/rms
I agree with Mike. This software’s popularity could take off very quickly.
If this company wanted to get some immediate sales traction they should contact every single private equity firm and sell them the software. As recent history has shown, there is an incredible amount of insider trading happening on Wall Street.
….. btw - it looks like the Bush Administration may have been a beta customer. Nothing that comes out of the oval office has a consistent theme:)
You know what, Mike? This attitude makes me very SAD. You are…
1) Stifling internal communications.
Even when executives want to be frank and open with the people they lead, it’s people like you that force them to be less-so.
2) Causing potentially significant damage to companies’ bottom lines (and, by this, harming their employees).
By spilling the beans on mergers or new products, there’s a very real chance that you could be scuttling them, making them harder to pull off, and basically causing a lot of people a lot of pain.
Is being First worth it? Does inciting people to violate their contracts, make their colleagues uncomfortable, and deal with extra stress really make you happy? Does it make the world a better place? Or does it make you basically akin to Valleywag — anything for a headline and a buck?
Basically, you are encouraging people to violate their colleagues’ trust. Orbious may not be a great service, but you know what? It’s because of people like you that companies feel compelled to try this sort of thing and treat their employees like criminals and liars. Sure, if it weren’t TechCrunch, it’d be another site that malcontents might leak to. But that doesn’t give you clean hands. It’d be like saying, hey, if I don’t buy this stolen property, someone else will! I’m not the one who stole it!
Yeah. Just great.
How about reporting on stuff that’s public? Surely you can offer some insightful commentary, fact-checking, etc? Must you depend upon deceit?
I think the key point here is ‘from a trusted source’ - why wouldn’t the trusted source just say ’skype me for some great gossip about ‘ ?
Personally, I wouldn’t trust *anyone* who forwarded on confidential legal documents - would you want them to forward on *yours* ?
All opinions expressed are my own, etc, etc.
Wrote this out as an idea 6 months ago — could have been any of you
http://www.techquilashots.com/.....oyee-leak/
Interesting! However, changing text could be risky, especially with legal and contractual documents.
Also, a user could run the same or similar software to make yet another unique version of the document before forwarding it, so that he or she cannot be tracked.
This is a really old idea. Tom Clancy discussed it in Patriot Games (1987), and it wasn’t a new idea then. This is an interesting concept, but I think the flaws mentioned above reduce its usefulness: somebody would have to paste one of the sentences that was reworded, they’d have to paste it exactly, and the editing is non-trivial for the juicy bits that are most likely to be pasted. It’s hard to reword “Q62007 EBITDA: $14.4M”.
I wonder: have they done a patent search on the concept? It would be interesting to know if they violate any.
It wouldn’t surprise me to see good uptake of this service as marketing folks look for solutions. It also wouldn’t surprise me to see some egg-on-face oopses, a-la ‘confidential’ data being leaked because somebody thought blacking it out in a DOC actually hid the information.
This has been done in politics and government for ages. Surprised it’s taken so long to become “software-ized”…
Wouldn’t it be easier to just use a secure 3rd party hosting site like draftspace.com (admittedly shameless self plug)? I know some of our clients (funds of various sorts) use our secure data site services to report to their LPs about returns and other confidential materials that they don’t want being forwarded around, and we also let them use the same secure platform to have their portfolio companies report to them.
The biggest concern with confidentiality isn’t being able to prosecute whoever broke confidence and leaked the document; it’s about preventing the leak in the first place. Orbious doesn’t seem to provide any real protection against that, just a cute way to catch the offender after the fact.
Ask Yahoo whether they care who leaked their original Facebook presentation. Maybe, but I’m sure they’d much rather it had never happened in the first place.
The doc tracker seems to be a very interesting idea, with a potential to benefit government and economic sectors.
With respect to Jack’s comment, I am of the belief that if the doc tracker system was in place the number of leaked documents would be reduced because the leak would be traceable. Most people are not going to risk their reputation and possibly their careers if they know that their actions can be traced.
Obviously, if someone knows that there is a good chance that they are going to be identified and held accountable for their actions, they are going to think twice about leaking confidential documents in the first place.
Darn. I thought this would be something else. ISO standards delivered by some standards retailers as PDFs have each page individually serial numbered with a microprint number and barcode. Now, if there were a way to get a hash out of a PDF markup and serialize it, then store that data at a depository (in other words, the content isn’t transmitted, just a unique ID for the page); it would be darn near legally binding.
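One way to picture the hash-and-depository idea is the Python sketch below. It assumes each recipient's copy already carries its own serial number in the page markup, so the bytes of every page differ per recipient; the in-memory `registry` dictionary stands in for the depository, and the function names and sample markup are hypothetical. Only the hash is stored, never the page content.

```python
import hashlib

# Hypothetical depository: page hash -> (recipient, page number).
registry = {}


def register_copy(recipient, pages):
    """Record a hash for every page of one recipient's serialized copy."""
    for number, markup in enumerate(pages, start=1):
        digest = hashlib.sha256(markup).hexdigest()
        registry[digest] = (recipient, number)


def lookup(markup):
    """Return (recipient, page number) for a page, or None if unknown."""
    return registry.get(hashlib.sha256(markup).hexdigest())


# Made-up page markup; the serial number is what makes each copy unique.
register_copy("alice", [b"page one, serial 0001", b"page two, serial 0001"])
register_copy("bob", [b"page one, serial 0002", b"page two, serial 0002"])
```

A lookup only succeeds on a byte-identical page, so this identifies forwarded files rather than retyped or re-rendered content.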
I’d like to second Jack Studer’s remarks. Using a hosted service (e.g., shameless self-promotion, http://www.brainloop.com) you can create a closed-loop environment that connects all authorized users and applies persistent protection capabilities to keep information under control at all times: at rest, in transit, and when it’s on users’ desktops. This holds regardless of whether those users are in your company or are your external auditors or management consultants.
As an additional benefit of the closed-loop architecture you will get an exhaustive audit trail documenting who saw which document version at what time, which is exactly what you need when things go wrong and forensic investigation has to zero in on the leaks.
I couldn’t understand some parts of this article, but I guess I just need to check some more resources regarding this, because it sounds interesting.