Amazon Web Services (AWS) supports a wide range of use cases, from highly transactional applications to analytics processing, as well as backup and archiving. As vendors and data centers decide how they will use AWS, they also need to decide how they will store data in the Amazon infrastructure. Whatever the use case, these applications need to store data, but the performance and retention requirements of that storage vary between them.
To address this variety of storage requirements, Amazon offers a portfolio of storage options that lets customers pick the right storage architecture for each specific workload. The choices range from high performance to low cost, and many applications will use several of the available options over their life cycle.
The AWS Storage Portfolio
The first type is Amazon Elastic Block Store (EBS), a persistent, high-performance block storage service designed for applications that need their data to remain intact even after the instances running them are stopped or deleted. An EBS volume is directly attached and dedicated to its instance.
The second type is the Amazon EC2 Instance Store. Instance store is ephemeral: once the compute instance using it is shut down or deleted, its capacity is released and the data in it is gone. It is typically used for container compute instances and other temporary workloads.
Elastic File System (EFS) is Amazon’s third type of storage offering. It is essentially a Network Attached Storage (NAS) solution in the cloud: a shared file system with full file system semantics that presents the traditional NFS protocol to cloud-hosted applications that need it. EFS offers strong performance and scales elastically to petabytes of capacity.
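Because EFS presents standard NFS, a Linux instance attaches to it with an ordinary mount command. The sketch below follows the pattern in Amazon's documentation; the file system ID, region, and mount point are hypothetical placeholders.

```shell
# Mount an EFS file system over NFSv4.1 (file system ID and region are
# placeholders -- substitute your own values).
sudo mkdir -p /mnt/efs
sudo mount -t nfs4 -o nfsvers=4.1 \
    fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/efs
```

Once mounted, applications read and write /mnt/efs exactly as they would any local POSIX file system, which is what makes EFS suitable for lifting and shifting applications that expect traditional file access.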
Amazon has two object storage offerings: Amazon Simple Storage Service (S3) and Amazon Glacier. Amazon S3 is ideal for applications that need to create and access large amounts of unstructured data, and it supports advanced metadata tagging so users can find data faster in the future. Amazon Glacier is essentially a data vault: if the customer is willing to commit to less frequent access, Glacier offers extremely competitive pricing.
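To make the metadata-tagging point concrete, here is a minimal stdlib-only sketch of the idea: each object carries user-defined key/value metadata, and data is later located by filtering on those tags rather than by remembering object keys. The object keys and tag names are invented for illustration; with the real service this maps to boto3's `put_object(..., Metadata={...})` and reading the metadata back on retrieval.

```python
# Illustrative sketch (no AWS calls): user-defined metadata attached to
# object keys, then used to find data later. Keys and tags are made up.
objects = {
    "logs/2017/app.log": {"department": "engineering", "retention": "90d"},
    "reports/q1.pdf":    {"department": "finance",     "retention": "7y"},
    "reports/q2.pdf":    {"department": "finance",     "retention": "7y"},
}

def find_by_metadata(store, tag, value):
    """Return the keys of all objects whose metadata matches tag=value."""
    return sorted(key for key, meta in store.items() if meta.get(tag) == value)

print(find_by_metadata(objects, "department", "finance"))
# ['reports/q1.pdf', 'reports/q2.pdf']
```

The value of tagging at write time is exactly this: a search over metadata instead of a crawl over object contents, which matters once the bucket holds billions of objects.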
Getting Data To Amazon
Amazon also has a variety of ways to move data to the service. Again, which path the customer uses to get data to the Amazon cloud is largely dependent on the use case. If the organization is looking to move applications to the cloud, then more than likely they will want to move all of that application’s data in one step. The problem is, of course, the latency and bandwidth of the Internet connection.
To overcome the Internet problem, Amazon offers Snowball and Snowmobile: appliances (or, in the case of Snowmobile, a truck) that are sent to the customer's data center so all the data can be copied onto them. The Snowball is then shipped, and the Snowmobile driven, to an Amazon data center where the data is copied into the AWS infrastructure.
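A back-of-the-envelope calculation shows why shipping data often beats sending it over the wire. The link speed and utilization figures below are illustrative assumptions, not AWS specifications.

```python
# Rough estimate (illustrative numbers): how long does it take to push a
# data set over a network link, assuming a given average utilization?
def transfer_days(terabytes, link_mbps, utilization=0.8):
    """Days to upload `terabytes` (decimal TB) over a `link_mbps` link."""
    bits = terabytes * 8 * 10**12                      # TB -> bits
    seconds = bits / (link_mbps * 10**6 * utilization) # effective throughput
    return seconds / 86400                             # seconds -> days

# 100 TB over a dedicated 1 Gbps link at 80% utilization:
print(round(transfer_days(100, 1000), 1))  # -> 11.6 (days)
```

At that rate a 100 TB migration monopolizes a 1 Gbps link for nearly two weeks, while a petabyte would take months; a shipped appliance turns the problem into a few days of local copying plus transit time.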
For smaller data sets, customers can use third-party software solutions that provide optimized transfer across the Internet connection. For streaming data, Amazon offers Kinesis Firehose, which allows high-speed, continuous streaming of data into the Amazon service.
Customers may also choose to use a direct connection. The customer's equipment is placed in a facility in close proximity to an Amazon data center and a dedicated link is established between the two. This option typically has storage on one side and compute on the other: for example, a data center with storage can connect directly to Amazon compute. The link lets the organization keep using more mainstream storage solutions while still having access to the almost unlimited compute of the Amazon service.
For customers looking to keep most of their data on-site, or at least their production data, Amazon has storage gateways and third-party connectors. The AWS Storage Gateway provides a file interface for organizations looking to copy, back up, or archive certain data sets to the Amazon service. There are also many solutions from Amazon partners that make a similar connection, each with its own unique value-add. These third-party connectors can be built into a purpose-built appliance, or they can be added on to existing backup and archive applications.
Amazon is a massive service, and it supports an equally massive number of use cases, from structuring a data lake to migrating enterprise applications. To support this variety the service has to offer a complete portfolio as well as count on third-party partners to bring in additional capabilities, and Amazon has done all of that exceptionally well. It has high-performance block storage with all the basic capabilities you'd expect, plus partners that can add capabilities to it. It has a file system solution for applications that need to access data via more traditional file system protocols. And it has two tiers of object storage that customers can either write to directly or populate automatically through Amazon partners.
While it is hard to single out any one aspect of AWS as the reason for its unprecedented success in the market, the robustness of the storage portfolio is certainly one of the keys to it.