It is normal for many environments to run applications in the cloud. For some environments, the new normal is running applications in multiple clouds, as well as running applications in their own data center or private cloud. If the new normal includes a data center plus multiple cloud environments, then we need to know how to protect applications running in those environments. Using different backup applications for the three different environments can get complicated, and complication equals risk.
It is this new world of multiple clouds and data centers that Datos IO claims to address with its latest release of RecoverX. If you’re not familiar with Datos IO, be sure to check out our previous briefing note on its product. In brief, it is a data protection product designed for extreme scalability and cloud suitability. Its initial release targeted non-relational databases such as Cassandra and MongoDB, since few products were going after that growth market.
With some time under its belt, RecoverX is now stretching its wings to protect other applications in the cloud and the data center. RecoverX 2.0 includes data protection and data mobility for relational databases. The product allows you to back up relational databases running in any cloud vendor or in your data center, and then to recover or migrate that application to any cloud vendor or to your own data center.
Datos IO is quick to mention that a key element of its products is semantic deduplication. Most deduplication is block-based, and most backup dedupe is done by first unpacking the backup format and then applying a block-based deduplication algorithm. The challenge with block-based deduplication is that it is very CPU-intensive, so deduplication vendors must strike a balance between deduplication ratios and performance. The smaller the block size the deduplication algorithm looks at, the better the deduplication ratio will be, but the more hashes it must compute. A 512-byte block size would render great deduplication ratios, but it would require the creation of 16 times more SHA-256 hashes than the typical block size of 8 KB that is used by most deduplication products.
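The trade-off between block size and hashing work can be sketched in a few lines. This is a minimal illustration of fixed-size block deduplication in general, not RecoverX's or any vendor's implementation; the function name `block_dedupe_stats`, the 1 MiB sample buffer, and the two block sizes compared are assumptions chosen for the example.

```python
import hashlib


def block_dedupe_stats(data: bytes, block_size: int) -> tuple[int, int]:
    """Return (hashes_computed, unique_blocks) for fixed-size block dedupe."""
    seen = set()
    hashes_computed = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # One SHA-256 digest per block: this is the CPU cost being traded off.
        seen.add(hashlib.sha256(block).digest())
        hashes_computed += 1
    return hashes_computed, len(seen)


# Hypothetical sample: 1 MiB of zero bytes (illustrative data only).
data = bytes(1024 * 1024)

large_block_hashes, _ = block_dedupe_stats(data, 8 * 1024)  # 8 KB blocks
small_block_hashes, _ = block_dedupe_stats(data, 512)       # 512-byte blocks

# Ratio of hashing work between the two block sizes.
hash_ratio = small_block_hashes // large_block_hashes
print(large_block_hashes, small_block_hashes, hash_ratio)
```

For the same amount of data, the number of hashes grows in inverse proportion to the block size, which is why vendors cannot simply shrink the block to chase better ratios.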
Semantic deduplication looks at the change vectors themselves, such as a single element in a single table being changed from 5 to 7. Storing that change with a semantic deduplication system might require only a few bytes of storage. But if you made that change, backed it up, and then submitted the backup to a block-based deduplication system, it would store at least 8 KB, because the entire block containing the change is new to it. This is why Datos IO advertises a 10x advantage in bidirectional data-movement efficiency for semantic deduplication over the alternatives.
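A rough sketch of that size difference follows. It is not how RecoverX encodes changes, which the source does not describe; the JSON change record, the table and column names, and the helper functions `semantic_change_size` and `block_change_size` are all hypothetical, and a compact binary encoding would be smaller still than the JSON used here.

```python
import json

BLOCK_SIZE = 8 * 1024  # the 8 KB block size discussed in the text


def semantic_change_size(table: str, key: int, column: str, new_value: int) -> int:
    """Bytes needed to store one cell-level change as a compact JSON record."""
    record = {"table": table, "key": key, "column": column, "value": new_value}
    return len(json.dumps(record, separators=(",", ":")).encode("utf-8"))


def block_change_size(changed_offsets: list[int], block_size: int = BLOCK_SIZE) -> int:
    """Bytes a block-based system stores: every block touched by a change."""
    dirty_blocks = {offset // block_size for offset in changed_offsets}
    return len(dirty_blocks) * block_size


# Hypothetical change: one value in one row goes from 5 to 7.
semantic_bytes = semantic_change_size("orders", 42, "quantity", 7)

# The same change lands at some byte offset inside the backup stream
# (the offset is an arbitrary assumption).
block_bytes = block_change_size([100_000])

print(semantic_bytes, block_bytes)
```

Even with a verbose text encoding, the change record is tens of bytes, while the block-based approach stores a full 8 KB block for the same single-value edit.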
Both the data protection and data mobility features employ source-side application listeners that perform their deduplication before transmitting backups or migration jobs across the wire. Like the previous version of RecoverX, this release uses an architecture without media servers; backups are sent straight to a scale-out cluster running the RecoverX software. Datos IO announced that RecoverX 2.0 also increases the supported size of the RecoverX cluster from 3 to 5 nodes. Datos IO designed the product to be scalable to many more nodes, but this is what it has tested and is supporting today.
Finally, RecoverX 2.0 includes data protection for big data file systems, such as Hadoop and HDFS. It backs these systems up in much the same way it handles large databases, using a listener that provides incremental-forever backups. The product also supports semantic deduplication for these file systems, but currently only at the file level. Datos IO’s representatives did say that the company is examining the possibility of looking for changes within a file the way RecoverX looks for changes within a database.
RecoverX is a very different data protection product written for a very different world than the one that existed before the cloud. It is unique in its support for the cloud without ignoring the data center. This combination of features, which allows a customer to protect applications and scale-out file systems no matter where they reside and to recover or migrate those applications anywhere else they might reside, is quite powerful.