Data Domain's DD120 Brings De-Duping Down To Branch Offices


Howard Marks, Network Computing Blogger

February 19, 2008

3 Min Read

In the four short years since Data Domain introduced its original DD240 appliance, hardware data de-duplication in the data center has evolved from interesting technology to accepted, if not yet standard, practice. While big enterprise data centers with petabytes of data and hundreds of terabytes of nightly backups are still more interested in raw speed than storage efficiency, most of us could improve our backup infrastructure significantly with de-duplication. With the new DD120, Data Domain brings the cost of data de-duplication down to the point where it makes sense for branch offices.

Using a replicating backup appliance at remote offices lets you use your existing backup software to view and manage backups across the organization and automatically replicate those backups to the data center. These appliances minimize bandwidth usage by de-duplicating the data before replicating it and, unlike some remote backup solutions, they also provide a local copy of the data for fast restores.

The 1U DD120 uses three 250-GB drives to deliver about 373 GB of space for backups before de-duplication. While de-duplication ratios vary by data type, backup schedule, and phase of the moon, users running a weekly full, daily incremental schedule should see the DD120 hold more than 5 TB of backup data. Since Data Domain's software de-dupes in real time, no disk is "wasted" holding backup data while the appliance de-dupes. Data Domain claims the DD120 can ingest data at 150 GB an hour (using CIFS, NFS, or NetBackup's OST protocol), which means you should be able to run 1 TB or more of backups in an eight-hour window, even with the usual fudge factor to allow for vendor hyperbole.
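If you like to see the arithmetic spelled out, here's a quick back-of-the-envelope sketch in Python. The implied de-dup ratio and the 15 percent fudge factor are my own assumptions based on the numbers above, not figures Data Domain publishes:

```python
# Back-of-the-envelope math for the DD120 numbers above.
raw_capacity_gb = 373          # usable space before de-duplication
claimed_logical_tb = 5         # backup data the box should hold
implied_ratio = claimed_logical_tb * 1024 / raw_capacity_gb
print(f"Implied de-dup ratio: {implied_ratio:.1f}:1")   # ~13.7:1

ingest_gb_per_hour = 150       # claimed rate over CIFS/NFS/OST
window_hours = 8
fudge = 0.85                   # assumed allowance for vendor hyperbole
window_tb = ingest_gb_per_hour * window_hours * fudge / 1024
print(f"Realistic 8-hour window: ~{window_tb:.1f} TB")  # ~1.0 TB
```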

In addition to de-duping data locally, the DD120 avoids sending data across the WAN that duplicates data already backed up from another remote office. When a DD120 replicates to one of its Data Domain big brothers at headquarters, it first sends the hashes for its new data. The box at headquarters then sends back a list of the blocks it hasn't seen, so only those blocks cross the line. As a result, the 50-GB sales literature folder that's on every remote office's file server gets sent across the line only once. You can also schedule and throttle replication traffic.
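For the curious, here's a minimal Python sketch of that hash-exchange scheme. The fixed 4-KB chunk size, the SHA-1 hashing, and the function names are illustrative assumptions on my part, not Data Domain's actual protocol:

```python
# Minimal sketch of hash-exchange replication: send hashes first,
# then ship only the chunks headquarters hasn't already seen.
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size, for illustration only

def chunk_hashes(data: bytes) -> dict[str, bytes]:
    """Split data into fixed-size chunks, indexed by digest."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    return {hashlib.sha1(c).hexdigest(): c for c in chunks}

def replicate(branch_data: bytes, hq_store: dict[str, bytes]) -> int:
    """Replicate to HQ; return chunk bytes actually sent over the WAN."""
    local = chunk_hashes(branch_data)
    # Step 1: branch sends only the hashes (cheap); HQ replies with misses.
    unknown = [h for h in local if h not in hq_store]
    # Step 2: branch sends just the missing chunks.
    for h in unknown:
        hq_store[h] = local[h]
    return sum(len(local[h]) for h in unknown)

hq: dict[str, bytes] = {}
sales_lit = b"sales literature " * 10_000
print(replicate(sales_lit, hq))  # first office: chunks cross the WAN
print(replicate(sales_lit, hq))  # every later office: 0 chunk bytes sent
```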

I must say I was disappointed that Data Domain decided to use 250-GB drives in the DD120. New nearline 500-GB drives cost just $70 more each than their 250-GB equivalents at NewEgg. Assuming the usual 3- or 4-to-1 manufacturer's markup, doubling the DD120's capacity would mean an additional $1,000 or so on the MSRP. I for one would rather pay $13,000 for an 800-GB appliance than $12,000 for a 373-GB one. After all, some remote offices, say an insurance adjuster's office full of digital photos, could hold a lot of data even if there isn't a high rate of change.
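For what it's worth, the markup math works out like this; the $70 street-price delta and the 3-to-4x markup are, as noted, my working assumptions:

```python
# Quick check of the upgrade math above.
drives = 3
delta_per_drive = 70                       # 500 GB vs. 250 GB, at NewEgg
for markup in (3, 4):
    print(f"{markup}:1 markup adds ~${drives * delta_per_drive * markup}")
# 3:1 adds ~$630, 4:1 adds ~$840 -- call it an extra $1,000 on the MSRP
```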

Now that Data Domain has tuned the new version of its software for de-duping small files, that insurance adjuster could save those photos to the Data Domain appliance directly, storing, de-duping, and replicating the data to headquarters all in one fell swoop.

Replicating de-duped backups is one good way to protect data at remote sites, and the DD120 makes it affordable for many.


About the Author

Howard Marks

Network Computing Blogger

Howard Marks is founder and chief scientist at Deepstorage LLC, a storage consultancy and independent test lab based in Santa Fe, N.M., concentrating on storage and data center networking. In more than 25 years of consulting, Marks has designed and implemented storage systems, networks, management systems and Internet strategies at organizations including American Express, J.P. Morgan, Borden Foods, U.S. Tobacco, BBDO Worldwide, Foxwoods Resort Casino and the State University of New York at Purchase. The testing at DeepStorage Labs is informed by that real-world experience.

He has been a frequent contributor to Network Computing and InformationWeek since 1999 and a speaker at industry conferences including Comnet, PC Expo, Interop and Microsoft's TechEd since 1990. He is the author of Networking Windows and co-author of Windows NT Unleashed (Sams).

He is co-host, with Ray Lucchesi, of the monthly Greybeards on Storage podcast, where the voices of experience discuss the latest issues in the storage world with industry leaders. You can find the podcast at: http://www.deepstorage.net/NEW/GBoS
