By Mike Sparkes, Backup and Recovery Services Market Manager,
EMC Australia and New Zealand
In January, EMC announced 41 new products, four of which were Data Domain disk-based, deduplication-enabled backup solutions. The fourth of these was the Data Domain Archiver, which is a whole new category of product and a subject for discussion at a later date. The first three were newer, faster models in the Data Domain portfolio. It has become almost routine to announce that the latest releases are twice as fast as, and have twice the capacity of, the systems that came before, but perhaps we are getting too blasé about this.
It was only in May 2010 that we announced the previous DD880 and Global Deduplication Array (GDA) systems, the fastest ever at the time. Now here we are, half a year later, with the DD860, the DD890 and a new GDA, the fastest backup appliances in the industry, with the biggest offering performance more than seven times greater than its nearest competitor!
For many years, we at EMC have been evangelising the benefits of in-line deduplication. This approach does all the processing on incoming data before any of it is written to disk. The boundaries between the variable-length blocks are determined, each block is given its unique fingerprint, and that fingerprint is then compared with the index of previously written blocks. Only if a block is found to be unique is it finally committed to disk. Performance therefore depends heavily on processing speed and memory and is almost isolated from the performance of the underlying disk. This has allowed Data Domain systems’ performance to ramp up many times faster than disk array speeds have been improving.
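The steps described above (find block boundaries, fingerprint each block, check the index, write only what is new) can be illustrated with a toy sketch. This is not Data Domain's implementation. The names `split_chunks` and `InlineDedupStore`, the rolling-sum boundary rule, the chunk size limits and the choice of SHA-256 as the fingerprint are all assumptions made for illustration, and a Python dictionary stands in for both the index and the disk.

```python
import hashlib


def split_chunks(data: bytes, window: int = 16, mask: int = 0x3F,
                 min_size: int = 32, max_size: int = 1024):
    """Yield variable-length chunks of `data`.

    Illustrative boundary rule (an assumption, not a vendor algorithm):
    cut where the sum of the last `window` bytes has all `mask` bits set,
    subject to minimum and maximum chunk sizes.
    """
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i - start >= window:
            # Drop the byte that has just left the sliding window.
            rolling -= data[i - window]
        size = i - start + 1
        at_boundary = size >= min_size and (rolling & mask) == mask
        if at_boundary or size >= max_size:
            yield data[start:i + 1]
            start = i + 1
            rolling = 0
    if start < len(data):
        yield data[start:]


class InlineDedupStore:
    """Toy in-line deduplicating store; a dict stands in for index and disk."""

    def __init__(self):
        self.index = {}          # fingerprint -> stored chunk
        self.bytes_in = 0        # bytes presented by the backup stream
        self.bytes_written = 0   # bytes actually committed

    def write(self, data: bytes) -> list:
        """Deduplicate `data` before storing it; return its list of fingerprints."""
        recipe = []
        for chunk in split_chunks(data):
            fingerprint = hashlib.sha256(chunk).hexdigest()
            self.bytes_in += len(chunk)
            if fingerprint not in self.index:
                # Only chunks not seen before are committed.
                self.index[fingerprint] = chunk
                self.bytes_written += len(chunk)
            recipe.append(fingerprint)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Rebuild the original data from its fingerprints."""
        return b"".join(self.index[fp] for fp in recipe)


if __name__ == "__main__":
    backup = bytes((i * 31 + 7) % 251 for i in range(5000))
    store = InlineDedupStore()
    first = store.write(backup)
    after_first = store.bytes_written
    store.write(backup)  # a second, identical backup
    print("new bytes written by second backup:", store.bytes_written - after_first)
    print("restore matches:", store.read(first) == backup)
```

Because the duplicate check happens before anything is stored, the second identical backup commits no new bytes. All of the work is hashing and index lookups, which is why this style of design leans on processor and memory speed rather than on the disks.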
The result is that the ‘in-line’ vs ‘post-processing’ debate is over. Data Domain’s in-line performance is so superior that there is no reason not to use it. In a backup and recovery environment, performance is king. Until Stephen Hawking can work out how to bend the space-time continuum at will, we are restricted to 24 hours in a day, or even less in a conventional backup window. So the faster we can process backup data, the more data we can process. The latest GDA can now put away up to 26.3 TB/hr. That’s over 210 terabytes in an 8-hour window, or over 630 terabytes in 24 hours. These are theoretical maxima and only achievable in ideal circumstances, but you know what they say: “A few hundred terabytes here, a few hundred terabytes there and pretty soon you are talking about a lot of data”.
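The backup-window arithmetic above is easy to reproduce for other window lengths or data volumes. The following is a minimal sketch; the function names `terabytes_in_window` and `hours_needed` are hypothetical, and it assumes the quoted peak rate is sustained for the whole window, which real environments rarely achieve.

```python
def terabytes_in_window(rate_tb_per_hr: float, window_hours: float) -> float:
    """Data that fits in a backup window at a sustained ingest rate."""
    return rate_tb_per_hr * window_hours


def hours_needed(dataset_tb: float, rate_tb_per_hr: float) -> float:
    """Time needed to ingest a data set at a sustained ingest rate."""
    if rate_tb_per_hr <= 0:
        raise ValueError("rate must be positive")
    return dataset_tb / rate_tb_per_hr


if __name__ == "__main__":
    peak_rate = 26.3  # TB/hr, the theoretical maximum quoted in the article
    print(f"{terabytes_in_window(peak_rate, 8):.1f} TB in 8 hours")    # 210.4 TB
    print(f"{terabytes_in_window(peak_rate, 24):.1f} TB in 24 hours")  # 631.2 TB
```

Substituting a measured sustained rate for the peak figure gives a more realistic estimate of what fits in a given window.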