You need more than a network share and deduplication to stand out in the disk backup crowd in 2017. Things have come a long way since disk backup targets first appeared 17 years ago. The table stakes may still be the same, but a few upstarts are placing bets that could have some vendors folding their cards if they don’t watch out.
In the beginning of the disk backup craze, a vendor needed a few things to compete in that space. First, a backup target needed to pretend to be tape (i.e., a virtual tape library, or VTL). All backup software products knew how to back up to tape, but some did not know how to back up to a file system. Even worse, products that could back up to a file system still behaved as if they were backing up to tape; they offered no special functionality for backing up to disk.
Today’s backup market couldn’t be more different. The focus now is on file and object interfaces, because these interfaces allow backup products to take advantage of the fact that their backups are stored on disk. For example, two important backup and recovery features that have emerged in the last few years are item-level recovery from a structured backup and instant recovery of a virtual machine. Both features require random-access disk to be feasible, so the VTL interface became much less interesting.
But disk backup targets have to do more than simply present themselves as a file or object system to work well with item-level and instant recovery features; they must be optimized for random access. Early disk backup targets were optimized for sequential access: they had to be very good at storing and deduplicating large streams of data during a backup, and at returning those large streams during a restore. Many backup products, however, are now designed for disk, or at least take disk into account, and no longer create large streams of data during a backup. With large sequential streams mattering less and the latest features demanding strong random-access support, products competing in today’s disk backup marketplace need to be optimized for random access.
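The difference is easy to see in miniature. Item-level recovery needs to fetch one small object directly, not replay an entire stream. Here is a minimal sketch (all names are illustrative, not any product’s actual API) contrasting a tape-style sequential restore with an indexed, random-access restore:

```python
# Sketch: restoring one item from a backup, two ways.
# All names here are illustrative, not a real product's API.

# A "sequential" backup: one big concatenated stream, like a tape image.
stream = b"".join([b"fileA-data|", b"fileB-data|", b"fileC-data|"])

def restore_sequential(stream, wanted):
    # Must scan the stream from the start until the item is found.
    for record in stream.split(b"|"):
        if record.startswith(wanted):
            return record
    return None

# A "random-access" backup: an index maps each item to its chunk.
chunks = {
    b"fileA": b"fileA-data",
    b"fileB": b"fileB-data",
    b"fileC": b"fileC-data",
}

def restore_random_access(chunks, wanted):
    # One index lookup, no scanning -- this is what item-level
    # recovery and instant VM recovery rely on.
    return chunks.get(wanted)

print(restore_sequential(stream, b"fileB"))     # scans from the start
print(restore_random_access(chunks, b"fileB"))  # direct lookup
```

Both calls return the same data; the difference is that the sequential restore does work proportional to the whole backup, while the indexed restore does work proportional to the one item requested.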
Disk targets designed more than 10 years ago also did not envision the popularity of cloud storage. Many customers now look to the cloud for offsite and long-term storage of backup images, so products that want to compete in today’s marketplace need to integrate these features. Customers should be able to back up to a disk target and then have those backups copied, immediately or eventually, to the cloud, especially to inexpensive cloud storage services like those from Amazon and Google.
Disk targets also need to offer more than just “backup storage.” Some customers want to use disk storage for long-term retention of reference data, so today’s disk products also need to offer an object interface such as S3. This lets customers use their disk target for more than just backups, but it also requires the storage to understand a bit more about the data stored on it. Products that offer integrated search can become a truly valuable part of the secondary storage infrastructure.
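What ties the object interface and integrated search together is metadata: if the store keeps tags alongside each object, it can answer questions about the data without reading the data. A toy sketch (the class and its methods are hypothetical; a real target would expose this through the S3 protocol, where user metadata travels in `x-amz-meta-*` headers):

```python
# Toy sketch of an object interface with integrated metadata search.
# Class and method names are illustrative, not a real product's API.

class ObjectStore:
    def __init__(self):
        self._objects = {}   # key -> bytes
        self._metadata = {}  # key -> dict of user-defined tags

    def put(self, key, data, metadata=None):
        # Analogous to an S3 PUT with user-defined metadata.
        self._objects[key] = data
        self._metadata[key] = metadata or {}

    def get(self, key):
        return self._objects[key]

    def search(self, **tags):
        # Integrated search: find objects by metadata alone,
        # without reading the stored data itself.
        return [k for k, meta in self._metadata.items()
                if all(meta.get(t) == v for t, v in tags.items())]

store = ObjectStore()
store.put("backups/db-2017-05.tar", b"...", {"type": "backup", "app": "oracle"})
store.put("archive/contracts.pdf", b"...", {"type": "reference", "dept": "legal"})
print(store.search(type="reference"))  # ['archive/contracts.pdf']
```

The point of the sketch is the `search` method: once the target indexes metadata, both backups and reference data on the same box become findable, which is what makes it more than “backup storage.”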
Disk backup and archive targets are finally being allowed to truly be disk. Not only do they no longer have to pretend to be tape; they have to be better at being disk, which means optimizing for random access. Unfortunately, a number of products will probably need to be re-architected to fit this new paradigm. Products that were designed from the beginning to behave like disk probably have an advantage.
Sponsored by Cloudian