  What issues does the Private Data Protector solve?  
 

 

Look at the article below to understand which problems our product solves:

You can find the original here: http://www.schneier.com/crypto-gram-0308.html

Hidden Text in Computer Documents

In the beginning, computer text files were filled with weird formatting commands. (Anyone remember WordStar's dot commands?) Then we had WYSIWYG: What You See Is What You Get. Or, more accurately, what you see on the screen is what you get on the printer. In the beginning, what you saw on the screen was what was actually in the digital file. With WYSIWYG, what you saw on the screen was not in the digital file; formatting commands remained hidden from view, and the screen looked like the printed page.

WYSIWYG was a huge improvement, because it enabled writers to more easily format documents and see the results of that formatting. But it also brought with it a new security vulnerability: the leakage of information not shown on the screen (or on the printed document). Most of the time it's completely benign formatting information, but sometimes it's actual text. And because the user sees what the printed page looks like, he never even knows that this text is in the file. But someone who is even a little bit clever can recover the text, with embarrassing or even damaging results.

Three examples:

Last month, Alastair Campbell, Tony Blair's Director of Communications and Strategy, was in the hot seat in British Parliament hearings explaining what roles four of his employees played in the creation of a plagiarized dossier on Iraq that the UK government published in February 2003. The names of these four employees were found hidden inside of a Microsoft Word file of the dossier, which was posted on the 10 Downing Street Web site for the press. The "dodgy dossier," as it became known in the British press, raised serious questions about the quality of British intelligence before the second Iraq war.

Last year, during the manhunt for the DC sniper, a letter was left for the police by the sniper that included specific names and telephone numbers. Perhaps in order to persuade the panicking public that the police were in fact doing something, they allowed the letter to be published -- in redacted form -- on the Washington Post's Web site. Unfortunately, they implemented the redactions by the completely pointless method of placing black rectangles over the sensitive text in the PDF. A simple script was able to remove these boxes and recover the full PDF.

And three years ago in Crypto-Gram, I told the story of a CIA document that the New York Times redacted and posted as a PDF on its Web site. The document concerned an old Iranian plot, and contained the names of the conspirators. The New York Times redacted the document in the same reversible way that the Washington Post did.

So much for examples. How pervasive is this problem? In a recent research paper, S.D. Byers went out on the Internet to see what sorts of hidden information he could find. He concentrated on Microsoft Word, because Word documents are notorious for containing private information that people would sometimes rather not share. This information includes the names of people who wrote or edited the document (as Blair's government discovered), information about the computers and networks and printers involved in the document, text that had been deleted from the document at some prior time, and in some cases text from completely unrelated documents.

Byers collected 100,000 MS Word documents, at random, from the Web. He built three scripts to look for hidden text, and found it in all documents. Most of it was uninteresting -- the name of the author -- but sometimes it was very interesting. His conclusion was that this problem is pervasive.
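
To give a rough idea of how little effort such a search can take, here is a minimal Python sketch. It is not Byers's method (his scripts are not reproduced here); it simply pulls every run of printable characters out of a binary file, much like the Unix "strings" tool. The function names and the six-character minimum are assumptions chosen for illustration. It assumes an older binary format such as .doc, where text may be stored as single bytes or as UTF-16.

```python
import re

# Runs of 6 or more printable ASCII characters, stored either as single
# bytes or as UTF-16LE (each character followed by a zero byte).
# The minimum length of 6 is an arbitrary choice to cut down on noise.
ASCII_RUN = re.compile(rb"[\x20-\x7e]{6,}")
UTF16_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){6,}")


def extract_strings(data: bytes) -> list[str]:
    """Return printable strings found anywhere in a binary blob."""
    found = [m.group().decode("ascii") for m in ASCII_RUN.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in UTF16_RUN.finditer(data)]
    return found


def extract_strings_from_file(path: str) -> list[str]:
    """Hypothetical convenience wrapper: scan a file on disk."""
    with open(path, "rb") as fh:
        return extract_strings(fh.read())
```

Comparing this output with what the word processor shows on screen is one simple way to spot leftover names, file paths, or deleted text. Compressed formats would need to be unpacked before a scan like this finds anything.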

MS Word was the subject of Byers's paper, but other data files can leak private information: Excel, PowerPoint, PDF, PostScript, etc. There's no excuse for the companies that own those formats not to create a program that scrubs hidden information from these files. And certainly there's a business opportunity for some third party to create such a scrubber program, but they should be outside the U.S., because it might be a violation of the DMCA to do it. Microsoft's closed proprietary file formats make it harder to write such a scrubber, and unless Microsoft makes some additional changes in its software (e.g. usage and default values), scrubbers will remain an imperfect solution.

Oh, and the press uses techniques like this to unredact stuff all the time. I believe they don't mention it much because they're afraid they'll lose access to all that leaked information.

by Bruce Schneier
Founder and CTO
Counterpane Internet Security, Inc.
schneier@counterpane.com
<http://www.counterpane.com>

 


                 Copyright © 2002 - 2006, SautinSoft. All rights reserved.