Cutting The Digital Fat

Unnecessary copies and older versions of documents clog up employee hard drives and make discovery exercises longer and more expensive. One vendor's software aims to help companies get smarter -- and more aggressive -- about deleting digital fat.

Andrew Conry Murray, Director of Content & Community, Interop

November 10, 2008


E-discovery forces companies to re-examine their retention and disposition practices. Companies that don't get rid of content as soon as possible will spend more time and money sorting through piles of information, much of it irrelevant, than companies with vigorous disposition policies and processes.

Of course, implementing a vigorous disposition regime is a challenge. A company called NextPage has an interesting approach to that challenge.

NextPage tackles employee hard drives, shared drives, and SharePoint. These areas are often filled with unnecessary copies of existing files, including older versions of finished documents. All these copies and versions add to the pile of information that has to be searched, both by software tools and investigators, in a discovery exercise.

The company sells software that makes it easier to identify the final versions that need to be preserved while eliminating older versions and duplicates. The software consists of agents that reside on employees' machines, plus server software that lets administrators set policies, monitor documents, and take actions such as deleting files or saving the most recent version to a different repository, such as a content management system.

Here's how it works. The NextPage agent tags new Office documents (Word, Excel, and PowerPoint) as they are created by employees. NextPage calls this tag a "digital thread." It's a metadata stamp that includes a unique identifier, and it remains with the document from creation to deletion. If the agent can't tag a document, it creates a unique hash value for the document instead.
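To make the tag-or-hash idea concrete, here is a minimal sketch in Python. It is not NextPage's code: the Document class, the thread_id metadata key, and the identify function are all hypothetical names invented for illustration, and the choice of a UUID and SHA-256 is an assumption. It only shows the general pattern of stamping an identifier into metadata when possible and falling back to a content hash when not.

```python
import hashlib
import uuid
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a file seen by an agent (not a NextPage type)."""
    name: str
    content: bytes
    taggable: bool = True                      # assumption: metadata can be stamped
    metadata: dict = field(default_factory=dict)


def identify(doc: Document) -> str:
    """Return an embedded tag if the file can carry one, else a content hash."""
    if doc.taggable:
        # Reuse an existing tag so the identifier survives edits and renames.
        if "thread_id" not in doc.metadata:
            doc.metadata["thread_id"] = str(uuid.uuid4())
        return "tag:" + doc.metadata["thread_id"]
    # Fallback: hash the bytes. A content hash changes when the content changes.
    return "hash:" + hashlib.sha256(doc.content).hexdigest()
```

One design point the sketch makes visible: an embedded tag stays the same when the file is edited or renamed, while a content hash only matches byte-identical copies. That would make a hash useful for spotting exact duplicates but less useful for following a document through revisions.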

Once a document is tagged, NextPage can follow the document through its life cycle as it is edited, shared, renamed, and so on. It also can see when a file is attached to an e-mail.

If Employee A e-mails a tagged document to Employee B, and they both have the NextPage agent, the system knows a copy resides on Employee B's hard drive, and will subsequently track any changes that Employee B makes to that file while also associating it with the original file from Employee A.

This is the "thread" that follows the document throughout the enterprise. If a tagged file is sent to a user without the agent, the software notes that it was sent, but won't be able to follow additional changes to the document. However, if that document then comes back to the original sender, the system will pick up where it left off.
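The tracking behavior described above can be sketched as a small registry. Again, this is an illustrative assumption rather than the vendor's design: ThreadRegistry, record_edit, and record_email are hypothetical names, and a real system would track far more than machine names. The sketch shows the one rule that matters here: copies on machines with the agent are followed, while a send to a machine without the agent is logged but not followed.

```python
from collections import defaultdict


class ThreadRegistry:
    """Hypothetical server-side record of where copies of a tagged file live."""

    def __init__(self, machines_with_agent):
        self.machines_with_agent = set(machines_with_agent)
        self.copies = defaultdict(set)    # thread_id -> machines known to hold a copy
        self.events = defaultdict(list)   # thread_id -> ordered event log

    def record_edit(self, thread_id, machine):
        # Edits are only visible on machines that run the agent.
        if machine in self.machines_with_agent:
            self.copies[thread_id].add(machine)
            self.events[thread_id].append(("edit", machine))

    def record_email(self, thread_id, sender, recipient):
        # The send itself is always noted.
        self.events[thread_id].append(("sent", sender, recipient))
        if recipient in self.machines_with_agent:
            # The recipient's agent can report the new copy, so tracking continues.
            self.copies[thread_id].add(recipient)
        # Otherwise later changes stay invisible until the file comes back
        # to a machine that has the agent.
```

With a registry built for machines "alice" and "bob", an e-mail from alice to bob adds bob to the known copies, while an e-mail from alice to "carol" only adds a "sent" event, and any edit carol makes is never recorded.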

People Problems
The technology itself seems fairly straightforward. More problematic is how the system would actually be used in an enterprise.

NextPage says its customers tend to have humans make decisions about the final disposition of content. That's not a surprise -- lots of companies aren't comfortable with automated deletion of content. Human involvement may take the form of a list of older files that gets presented to a user with instructions for getting rid of those files.
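A human-in-the-loop review list like the one described above might be assembled along these lines. This is a sketch under stated assumptions: FileCopy and review_lists are hypothetical names, and treating the most recently modified copy as the one to keep is a simplification, since the newest copy is not necessarily the final version. Nothing is deleted; the function only produces per-user lists for people to act on.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileCopy:
    """Hypothetical record of one tracked copy or version of a document."""
    thread_id: str
    path: str
    owner: str
    modified: datetime


def review_lists(copies):
    """Keep the newest copy per thread; list every older copy under its owner."""
    newest = {}
    for c in copies:
        if c.thread_id not in newest or c.modified > newest[c.thread_id].modified:
            newest[c.thread_id] = c
    per_owner = {}
    for c in copies:
        if c is not newest[c.thread_id]:
            # Candidate for disposition, pending a human decision.
            per_owner.setdefault(c.owner, []).append(c.path)
    return newest, per_owner
```

Returning the lists grouped by owner mirrors the workflow in the text: each user gets their own set of older files along with instructions, and someone still has to follow up on whether they acted.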

But human involvement comes at a price. Employees are reluctant to get rid of older data, regardless of how infrequently they may access it once a project is completed. This means project managers and/or records managers will have to invest time and effort getting users to actually pull the trigger, and following up to make sure users haven't ignored or attempted to subvert disposition requests. The NextPage software can tell if users have saved copies of tagged files to removable media, including disks and thumb drives.

As for implementation, NextPage says its customers typically come from the CIO, CSO, or general counsel's office. These offices generally have the clout to drive a new policy in the organization, but as mentioned, enterprises shouldn't be surprised to find it takes some effort to get users accustomed to getting rid of their files.

Another potential issue is the sheer complexity involved in document tracking itself. As documents get passed around among collaborators and iterations pile up, things will get ugly fast, and the digital threads risk becoming a tangled ball of string. Potential customers should be sure they are comfortable with the management interface for tracking documents.

However, for companies that face multiple lawsuits every year, there's real value in reducing the content haystacks that must be searched during discovery. And while retention and disposition are more easily managed in free-standing content repositories such as e-mail archives and content management systems, there aren't a lot of good options in user land. NextPage is worth a look.


About the Author

Andrew Conry Murray

Director of Content & Community, Interop

Drew was formerly editor of Network Computing and is currently director of content and community for Interop.
