Because of the capacity requirements involved, cloud storage is the most difficult cloud use case for enterprises to justify. Paying for petabytes of capacity on a recurring basis gets very expensive, especially for backup and archive data that will sit inactive for most of its life cycle. Even as cloud providers continually drive down the raw cost of cloud storage, the cloud looks like a perfect fit for data efficiency technologies like deduplication. StorReduce is one of the first companies to offer “bring your own” storage efficiency for the cloud.
The Cloud Conundrum
The cloud compute use case is temporary. An enterprise can scale the number of processors required for an application up or down as needed. The cloud storage use case is more permanent. Data essentially has to stay in the cloud or be constantly moved back and forth, and you are charged for that storage whether the data is in use or not. In the case of backup and archive, moving the data to the cloud permanently and rarely accessing it is the point. There is still clear value: cloud storage creates an off-site repository and limits the growth of on-premises storage. But it does so at a much higher ongoing cost than renting processors. Capacity can’t really be rented; it’s more like a mortgage.
The StorReduce Solution
StorReduce is software installed as a virtual machine in the cloud of the organization’s choice. It deduplicates data blocks inline as they are written to storage. Depending on the level of redundancy within a data set, it can reduce storage consumption by as much as 95%. It presents cloud storage interfaces such as Amazon S3 and OpenStack Swift, and a single instance can manage up to 10PB of storage.
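To make the inline deduplication idea concrete, here is a minimal sketch of block-level dedup with a content-addressed index. StorReduce’s actual chunking and index design are not public, so the fixed block size, the `DedupStore` class and the in-memory dict are all illustrative assumptions, not the real implementation.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only

class DedupStore:
    def __init__(self):
        self.index = {}    # block hash -> stored block bytes
        self.objects = {}  # object name -> ordered list of block hashes

    def put(self, name, data):
        """Split data into blocks; store only blocks not seen before."""
        hashes = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.index:  # unique block: store it once
                self.index[digest] = block
            hashes.append(digest)
        self.objects[name] = hashes

    def get(self, name):
        """Reassemble an object from its block references."""
        return b"".join(self.index[h] for h in self.objects[name])

store = DedupStore()
payload = b"A" * BLOCK_SIZE * 10       # highly redundant data
store.put("backup-1", payload)
store.put("backup-2", payload)         # second copy adds no new blocks
print(len(store.index))                # 1 unique block stored in total
```

The point of the sketch is the ratio: forty kilobytes of logical data across two objects collapses to a single stored block, which is the mechanism behind the 95% reduction claim on redundant data sets.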
Backup Migration to the Cloud
While most backup solutions can replicate a copy of their backups to the cloud, many cannot maintain a deduplicated copy there. Either the data must be fully rehydrated before it is stored, or the deduplication index has to be kept on premises, meaning the index itself would have to be recovered in the event of a disaster. With StorReduce, data can be sent to the cloud and stored in a deduplicated state permanently. While the rate of reduction will depend on the data type and the use case, StorReduce expects that an average customer will see a 95% data efficiency rate.
Making Cloud Mirroring Practical
Beyond the obvious reduction in cloud storage cost, StorReduce can also enable much more efficient synchronization of data between cloud providers, since the solution can run in multiple clouds at the same time. The two instances of StorReduce communicate with each other and ensure that only net-new data is transferred to each site. Since many providers charge for data transferred as well as for capacity consumed, this reduction could represent significant savings and finally make cloud mirroring feasible.
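The “only net-new data” exchange can be sketched as a hash comparison: the sites exchange block fingerprints first, then ship only the blocks the remote side is missing. The function names and data structures below are hypothetical stand-ins, not StorReduce’s actual protocol.

```python
import hashlib

def block_hashes(blocks):
    """Fingerprint a set of blocks: hash -> block bytes."""
    return {hashlib.sha256(b).hexdigest(): b for b in blocks}

def sync(source_blocks, dest_index):
    """Ship only the blocks the destination does not already hold."""
    src = block_hashes(source_blocks)
    missing = set(src) - set(dest_index)      # compare fingerprints, not data
    transferred = {h: src[h] for h in missing}
    dest_index.update(transferred)            # apply at the destination
    return transferred

site_a = [b"alpha", b"beta", b"gamma"]              # source cloud
site_b_index = block_hashes([b"alpha", b"beta"])    # already mirrored

shipped = sync(site_a, site_b_index)
print(len(shipped))   # 1 -- only "gamma" crosses the wire
```

Because egress is billed per byte transferred, sending fingerprints (a few dozen bytes each) instead of whole objects is where the mirroring savings come from.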
Low-Cost Clones
Another use case is creating low-cost clones of data for dev and test. A copy of the primary data set can be created with no initial increase in capacity consumption, since the clone is entirely redundant with the original. Then, as the test and development teams make changes, only those changes are stored as unique data.
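In a deduplicated store, a clone is just a copy of the block reference list; new capacity is consumed only when a block is modified. This minimal sketch shows that behavior under assumed structures (a flat hash index and a hypothetical `store_block` helper), not StorReduce’s real interface.

```python
import hashlib

index = {}  # global block store: hash -> block bytes

def store_block(b):
    """Store a block once; return its content hash as a reference."""
    h = hashlib.sha256(b).hexdigest()
    index.setdefault(h, b)
    return h

# Production data set: 100 unique blocks
prod = [store_block(b"block-%d" % i) for i in range(100)]
unique_before = len(index)

clone = list(prod)            # clone = copy of references, zero new capacity
assert len(index) == unique_before

clone[0] = store_block(b"dev change")  # only the modification is stored
print(len(index) - unique_before)      # 1 new block for the whole clone
```

A hundred dev/test modifications against a hundred-block data set would still only add a hundred blocks, which is why cloning is cheap relative to a full copy.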
While it is still early days for StorReduce, the company is reporting some compelling initial customer results. The first use cases (backup, archive and clones) all operate on secondary copies, so experimenting with the solution is relatively safe, and of course the potential payback is significant.
The key for StorReduce will be to fine-tune the software to scale index operations (the company currently claims 250K operations per second on a single AWS instance), to provide full high availability for the deduplication VM (on the roadmap) and to deliver linear scale-out capabilities (also on the roadmap).