Object storage is an enabling technology, one that improves scalability and performance at scale over traditional storage architectures, which use RAID and replication to protect data. Using erasure coding and data dispersion within an object storage system can also greatly increase the reliability of data while improving storage efficiency and reducing cost. But this combination of storage technologies can provide another significant advantage as well, one that’s especially important to cloud providers: data security.
Object storage is a data architecture that stores data as an indexed collection of data objects instead of in a hierarchical file system. Each object has a unique identifier (an object ID number), which is used to access the object via a flat-file index. This organization generates less metadata and greatly simplifies data access compared with the traditional file system retrieval process. The result is a data architecture that can be expanded far beyond what a typical file system can, while improving performance as it scales.
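As a rough illustration of the flat index described above, the following Python sketch stores objects in a single dictionary keyed by a generated ID. The class name `FlatObjectStore`, the use of a UUID as the object ID, and the metadata field are assumptions made for the example, not details of any particular product.

```python
import uuid


class FlatObjectStore:
    """Toy in-memory object store: one flat index, no directory tree."""

    def __init__(self):
        # The flat index maps object ID -> (metadata, data).
        self._index = {}

    def put(self, data, metadata=None):
        # Hypothetical ID scheme: a random UUID rendered as 32 hex characters.
        object_id = uuid.uuid4().hex
        self._index[object_id] = (dict(metadata or {}), bytes(data))
        return object_id

    def get(self, object_id):
        # A single lookup by ID; there is no path to walk.
        return self._index[object_id][1]


store = FlatObjectStore()
oid = store.put(b"quarterly report", {"content-type": "text/plain"})
```

Retrieval is one lookup by ID, which is the contrast with resolving a path through nested directories.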
If the object storage system uses erasure coding, these data objects are then parsed into multiple component blocks, which are expanded with additional information to create a larger set of data blocks. This parsing and expansion process is somewhat like a RAID parity calculation, in that only a minimum number of these blocks is required to accurately recreate the original data object. For example, a 5 of 9 erasure coding scheme means that each data object is parsed and expanded into 9 segments, requiring only 5 of them for recreation.
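To make the 5 of 9 idea concrete, here is a minimal Python sketch of a Reed-Solomon-style code. It is a teaching toy, not the scheme any vendor ships: it works over the small prime field of integers modulo 257, treats each group of 5 data bytes as points on a polynomial, and produces 4 extra points as the additional segments. The function names (`encode`, `decode`) and the choice of field are assumptions for illustration; production systems typically use optimized codes over GF(2^8).

```python
P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial through points [(xi, yi)], mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # Divide by den using Fermat's little theorem (P is prime).
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total


def encode(data, k, n):
    """Expand data into n segments; any k of them can rebuild it."""
    padded = data + bytes(-len(data) % k)  # zero-pad to a multiple of k
    segments = [[] for _ in range(n)]
    for start in range(0, len(padded), k):
        points = [(x + 1, padded[start + x]) for x in range(k)]
        for s in range(n):
            # Segments 0..k-1 hold the data bytes; the rest hold extra points.
            segments[s].append(_interpolate(points, s + 1))
    return segments


def decode(available, k, length):
    """Rebuild the data from a dict {segment_index: values} of >= k segments."""
    if len(available) < k:
        raise ValueError("not enough segments to rebuild the object")
    chosen = sorted(available.items())[:k]
    out = bytearray()
    for col in range(len(chosen[0][1])):
        points = [(idx + 1, values[col]) for idx, values in chosen]
        out.extend(_interpolate(points, x) for x in range(1, k + 1))
    return bytes(out[:length])


message = b"object storage with dispersal"
segments = encode(message, k=5, n=9)
survivors = {i: segments[i] for i in (1, 3, 4, 6, 8)}  # 5 of the 9 segments
recovered = decode(survivors, k=5, length=len(message))
```

In this sketch the object expands to 9/5 = 1.8 times its original size and tolerates the loss of any 4 segments.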
When implemented in a clustered, modular storage system, these segments can be distributed (dispersed) across multiple physical storage nodes in the environment, increasing resiliency even more by protecting against a localized outage. Companies like Cleversafe have evolved this combination of object storage, erasure coding and data dispersion into a storage system that is more scalable, more economical and more reliable than traditional RAID and replication methods.
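Whether dispersal actually protects against a localized outage depends on how many segments land at each site. The short Python check below assumes a simple round-robin placement and hypothetical site names; real systems use their own placement policies.

```python
from collections import Counter


def place_segments(n_segments, sites):
    """Round-robin placement: segment i goes to sites[i % len(sites)]."""
    return {i: sites[i % len(sites)] for i in range(n_segments)}


def survives_site_loss(placement, k):
    """True if losing any single site still leaves at least k segments."""
    per_site = Counter(placement.values())
    total = len(placement)
    return all(total - count >= k for count in per_site.values())


# 9 segments over 3 sites: losing one site leaves 6, enough for 5 of 9.
ok_three_sites = survives_site_loss(
    place_segments(9, ["site-a", "site-b", "site-c"]), k=5)

# 9 segments over 2 sites: losing the larger site leaves only 4.
ok_two_sites = survives_site_loss(
    place_segments(9, ["site-a", "site-b"]), k=5)
```

With three sites the object survives a full site outage; with two sites it does not, even though the erasure coding scheme is the same.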
Security without Encryption
These systems can also improve data security. Object storage systems that leverage erasure coding and data dispersion provide an intrinsic level of security because multiple data segments have to be accessed in order for data to be reproduced. In the 5 of 9 scheme described earlier, this would be 5 segments. When those segments are dispersed to separate physical storage modules in different locations, with fewer than 5 held at any one site, an intruder would need to break into more than a single site.
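The number of sites an intruder must compromise follows directly from how the segments are spread. The helper below is a small sketch with hypothetical layouts; it assumes the intruder targets the sites holding the most segments first.

```python
def min_sites_to_breach(segments_per_site, k):
    """Fewest sites an intruder must compromise to collect k segments."""
    collected = 0
    ordered = sorted(segments_per_site, reverse=True)
    for sites_breached, count in enumerate(ordered, start=1):
        collected += count
        if collected >= k:
            return sites_breached
    return None  # fewer than k segments exist in total


three_sites = min_sites_to_breach([3, 3, 3], k=5)  # 2 sites
nine_sites = min_sites_to_breach([1] * 9, k=5)     # 5 sites
```

Spreading the same 9 segments over more sites raises the number of separate break-ins required, from 2 to 5 in these two layouts.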
In another scenario, if a storage module were disposed of improperly it wouldn’t, by itself, present any danger of a data breach. When encryption is added to the technology mix, these object storage systems can provide an even greater level of security, one that’s significantly better than simply adding encryption to a traditional storage architecture.
Data security at the storage level, often called protecting “data at rest”, is typically provided by an encryption process. In a traditional storage system this means an encryption key is generated which, when applied in a conversion process to a data volume, renders it essentially unreadable. That key is also used to decrypt data when it’s read back from the storage system. In this way an attacker would need to have both access to the stored data and access to the encryption key to compromise data security.
In enterprise storage systems encryption keys are kept very secure, which usually involves a dedicated system for key management. However, from the standpoint of data security, the encryption keys represent a single point of failure, so to speak, since one key is often applied to a large amount of data. If the keys are compromised, unauthorized access can result, but there are other issues with using encryption keys as well.
Keys can be lost or corrupted, which can mean data is also lost. Employees can leave the organization and take the keys with them, either intentionally or by oversight. A similar scenario would be renters who leave and don’t turn in their house keys, forcing the owner to change the locks just to make sure they’re secure. Like the landlord, data owners in this situation may never feel secure until their data’s been re-encrypted, a process which involves reprocessing the entire data set with a new key. Obviously, this can be nearly impossible with petabyte-sized data sets common in cloud environments.
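A back-of-the-envelope calculation shows why re-encryption at this scale is so daunting. The Python sketch below assumes a sustained end-to-end rate of 1 GB per second for reading, re-encrypting and rewriting the data; that rate is a hypothetical figure chosen for illustration, not a measurement of any system.

```python
def reencrypt_days(dataset_bytes, throughput_bytes_per_s):
    """Days needed to reprocess a data set at a sustained throughput."""
    seconds = dataset_bytes / throughput_bytes_per_s
    return seconds / 86_400  # seconds per day


PB = 10**15
GB = 10**9

days_1pb = reencrypt_days(1 * PB, 1 * GB)    # roughly 11.6 days
days_10pb = reencrypt_days(10 * PB, 1 * GB)  # roughly 116 days
```

Under that assumption, a single petabyte ties up the system for more than a week and a half, and the time grows linearly with the size of the data set.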
What’s needed is a methodology that’s more secure than traditional encryption and that eliminates the shortcomings of encryption keys and key management. By combining encryption with object storage and information dispersal, companies are doing just that.
Object Storage with Encryption
Cleversafe, as an example, encrypts each data object with what it calls an “All or Nothing Transform”, a process that requires all the component segments of a data set to be known before the data can be deciphered. It uses a randomly generated key, which is itself encrypted and stored with the data. This encrypted ‘package’ is then run through a data dispersal algorithm and written to multiple physical nodes, which are spread out around the environment and often geographically distributed.
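The general all-or-nothing idea can be sketched in a few lines of Python. This is an illustration only and not Cleversafe’s implementation: the function names are hypothetical, and a SHA-256 counter-mode keystream stands in for a real cipher such as AES. The point it shows is that the random key is masked with a hash of the entire ciphertext, so the key cannot be recovered unless every byte of the package is available.

```python
import hashlib
import secrets


def _keystream(key, length):
    """Stand-in stream cipher: SHA-256 in counter mode (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def aont_package(data):
    """Encrypt with a fresh random key, then hide the key in the package."""
    key = secrets.token_bytes(32)
    ciphertext = _xor(data, _keystream(key, len(data)))
    digest = hashlib.sha256(ciphertext).digest()
    masked_key = _xor(key, digest)  # depends on ALL of the ciphertext
    return ciphertext + masked_key


def aont_unpackage(package):
    """Recover the key from the full package, then decrypt."""
    ciphertext, masked_key = package[:-32], package[-32:]
    key = _xor(masked_key, hashlib.sha256(ciphertext).digest())
    return _xor(ciphertext, _keystream(key, len(ciphertext)))


package = aont_package(b"customer records")
restored = aont_unpackage(package)
```

In a dispersed system, a package like this would then be erasure coded and spread across nodes. No key is stored anywhere outside the package itself.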
The result of this encryption-dispersal process is that multiple nodes must be compromised in order to get access to the data package described above; in a 5 of 9 erasure coding scheme, this means 5 separate nodes. Of course, if additional security is desired this number can be increased to 12 nodes, or 20, or more.
But even assuming an intruder is able to access the required number of nodes, they would also need to understand the information dispersal algorithm before they could reassemble the data package. And they would need to know how to reverse the encryption transform as well.
In comparison, compromising a traditional storage system that uses a typical data encryption process requires only that the storage system be accessed and the encryption key be known. And for traditional storage systems, RAID and replication can cause another security issue as well, because creating multiple replicated copies multiplies the “attack surface”.
By recording the encryption key and storing it with the data object, this system eliminates the entire key management process and all the potential problems it creates. With an essentially ‘keyless’ system, there are no keys to lose and none for a former employee to take away; there’s no need to re-encrypt data and no reason to worry about data security if such a re-encryption process is impossible. And, since these internal keys are randomly generated, the system can create a new key for each data object, increasing security over systems that use a single key for larger data volumes. In fact, systems like Cleversafe actually split large objects into multiple segments and encrypt each segment with a different key.
Object storage is addressing many of the issues that extreme environments, like those supporting cloud-based businesses, are imposing on enterprise IT organizations. These object-based systems can be made more scalable, more efficient and more economical than traditional scale-up or even scale-out storage architectures. When combined with data dispersal and keyless encryption technologies, object storage systems can also be made more secure than traditional encryption and reduce the overhead and uncertainty associated with key management as well.
Cleversafe is a client of Storage Switzerland