Cloud storage like Amazon S3 is scalable, economical and highly available. But one thing it is not is consistent. That lack of consistency is a primary reason many organizations struggle with cloud migration, because many production applications depend on consistency to ensure the accuracy of their data. IT professionals therefore face a dilemma: either be very careful about which applications they move to the cloud, or be prepared to pay extra for consistent cloud storage like Amazon’s EBS.
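To make the consistency problem concrete, here is a minimal sketch of the read-after-write anomaly that an eventually consistent store allows. This is a toy in-memory model with an artificial replication delay, not the actual S3 implementation; the class name and delay value are illustrative assumptions.

```python
import time

class EventuallyConsistentStore:
    """Toy model of an eventually consistent object store.
    Illustration only -- not how S3 actually works internally."""

    def __init__(self, replication_delay=0.05):
        self.replicas = [{}, {}]           # two replicas of the keyspace
        self.pending = []                  # (apply_at, key, value) not yet replicated
        self.replication_delay = replication_delay

    def put(self, key, value):
        # The write lands on one replica immediately...
        self.replicas[0][key] = value
        # ...and propagates to the other replica only after a delay.
        self.pending.append((time.monotonic() + self.replication_delay, key, value))

    def get(self, key):
        self._replicate()
        # Serve the read from the lagging replica to expose the stale-read window.
        return self.replicas[1].get(key)

    def _replicate(self):
        now = time.monotonic()
        still_pending = []
        for apply_at, key, value in self.pending:
            if now >= apply_at:
                self.replicas[1][key] = value
            else:
                still_pending.append((apply_at, key, value))
        self.pending = still_pending

store = EventuallyConsistentStore()
store.put("report.csv", "v2")
stale = store.get("report.csv")   # read immediately: replication has not caught up
time.sleep(0.1)
fresh = store.get("report.csv")   # after the delay, the new value is visible
```

An application that reads its own write during that window sees stale (here, missing) data, which is exactly the behavior transactional applications cannot tolerate.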
The CAP theorem is important for IT planners to keep in mind as they create a cloud strategy. It states that a distributed storage system can provide only two of three key guarantees – Consistency, Availability and Partition tolerance. On-premises NAS and high-performance cloud storage like EBS, for example, provide excellent consistency and partition tolerance but not multi-region availability, and there is a considerable cost premium for that level of service. Cloud storage like S3 provides excellent multi-region availability and partition tolerance while also being very cost efficient.
The reality of the CAP theorem means IT professionals need to understand the demands of their applications to determine what type of cloud storage they will need, and whether that requirement makes the theoretical cloud cost advantages less attractive. Transaction-heavy applications that need consistency will require a different type of storage than applications that do not. Typically, the more an application deals with structured data (databases), the more likely it is to require high levels of consistency from storage. The more a workload deals with unstructured data, the more appropriate it becomes for storage that provides a lower level of consistency.
A Way Around The CAP Theorem
There is a way around the CAP theorem, though. Most applications only need consistency when new data is added or existing data is modified. In other words, they only need consistency from the storage system for a short period.
Working around the CAP theorem requires a storage architecture with two personalities. First, it needs a consistent, high-performance front end that delivers consistent access and acknowledgment of data additions and changes. Second, it needs a cost-effective, available backend storage area that durably stores the less active data. The key ingredient is seamlessly linking these two components so that data moves between them based on its activity level.
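The two-personality architecture described above can be sketched in a few lines of code. This is a simplified model under stated assumptions: the class and method names, the activity threshold, and the in-memory "tiers" are all illustrative stand-ins for a real caching front end and an object-storage back end.

```python
import time

class TieredStore:
    """Sketch of a two-tier store: a consistent front tier for active data
    and a cheap, durable back tier for inactive data. Illustrative only."""

    def __init__(self, cold_after_seconds=60.0):
        self.front = {}                 # fast, consistent tier: key -> (value, last_access)
        self.back = {}                  # cost-effective tier (stand-in for object storage)
        self.cold_after = cold_after_seconds

    def write(self, key, value):
        # All writes land on, and are acknowledged by, the consistent front tier.
        self.front[key] = (value, time.monotonic())

    def read(self, key):
        if key in self.front:
            value, _ = self.front[key]
            self.front[key] = (value, time.monotonic())   # refresh activity timestamp
            return value
        if key in self.back:
            # Cold data is transparently promoted back to the front tier on access.
            value = self.back.pop(key)
            self.front[key] = (value, time.monotonic())
            return value
        raise KeyError(key)

    def demote_cold(self):
        # Move data that has been inactive past the threshold to the cheap tier.
        now = time.monotonic()
        for key in list(self.front):
            value, last_access = self.front[key]
            if now - last_access >= self.cold_after:
                del self.front[key]
                self.back[key] = value

store = TieredStore(cold_after_seconds=0.0)   # demote immediately, for demonstration
store.write("invoice.pdf", b"data")
store.demote_cold()                           # inactive data drops to the back tier
value = store.read("invoice.pdf")             # read promotes it back, transparently
```

The point of the sketch is the seam: the application only ever talks to one namespace, while the system shifts data between tiers based on activity.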
The realities of the CAP theorem seem to leave limited options for IT professionals looking to move applications to the cloud: either pay extra for consistent storage, or leave applications that demand high data consistency on-premises.
There is an alternative once IT professionals understand that even for applications that need data consistency, the requirement applies only to data that is active. The inactive portions of the data set, even if they sit within the same physical file, can be stored on less consistent storage; since that data is by its very nature not changing, consistency no longer matters. The key is to find a solution that can move data between the two storage types.
In our ChalkTalk Video, Storage Switzerland and Avere Systems walk through the CAP theorem and provide ways to work around its limitations.