Using the cloud for the long-term storage of data seems like the perfect plan. An organization can buy petabytes (PB) of capacity on a periodic basis and eliminate the upfront costs of buying a storage system that can support that much capacity. It can also eliminate the power and cooling costs of storing that much data on-premises. Cloud storage isn’t perfect, however. There are concerns that organizations are still trying to rationalize. While most organizations focus on data security, a legitimate concern, they should also pay attention to the cumulative impact of periodic storage costs. A balanced approach, one that leverages both on-premises and off-site storage, may be best for most data centers.
The Need for Something New
IT professionals are well aware of the increasing capacity requirements of unstructured/file data. What may surprise them is the rate at which that growth is occurring. This growth rate is expected to accelerate in the coming years thanks to increased use of sensor data from Internet of Things devices, increased rich media and the increase in user created data. There is also increased demand to have this data more accessible from a variety of locations and retained forever.
The traditional solution for storing this data is Network Attached Storage (NAS) systems or file servers, and they are simply not up to the challenge that unstructured data presents. IT professionals need a solution that is easy to scale, cost effective and can provide long term data durability.
The Value of Cloud Storage
The demands of unstructured data have IT professionals turning to cloud storage solutions as the way out. Most cloud storage providers advertise a solution that can scale almost infinitely, that’s paid for as it’s consumed and can continuously verify data so that the solution can guarantee that data written to it today will be readable in the future.
Recurring Costs – The Cloud Storage Problem
The challenge that organizations will face with cloud storage is the recurring cost. The first payment for that first PB of storage is a great deal; the 50th payment, especially when factoring in growth, can get expensive. The key considerations are how much data the organization will store in the cloud and how long it will stay there. As an organization’s in-cloud data requirement crosses the 100 TB mark and retention times reach decades, it may need to reconsider how it uses cloud storage.
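The compounding effect of growth on a recurring bill can be sketched in a few lines of Python. This is only an illustration: the starting capacity, growth rate, and price per TB below are hypothetical inputs, not figures from any provider, and the model ignores egress fees, request charges, and price changes.

```python
def cumulative_rental_cost(start_tb, monthly_growth_rate, price_per_tb_month, months):
    """Return the running total paid after each month of renting capacity.

    All inputs are illustrative assumptions, not real provider pricing.
    """
    totals = []
    stored_tb = start_tb
    total = 0.0
    for _ in range(months):
        # Pay for whatever is stored this month, then let the data set grow.
        total += stored_tb * price_per_tb_month
        totals.append(total)
        stored_tb *= 1 + monthly_growth_rate
    return totals


# Hypothetical scenario: 1 PB (1,000 TB) to start, 2% monthly growth,
# 20 currency units per TB per month, 50 monthly payments.
totals = cumulative_rental_cost(1000, 0.02, 20.0, 50)
first_payment = totals[0]
fiftieth_payment = totals[49] - totals[48]
print(f"first payment:   {first_payment:,.0f}")
print(f"50th payment:    {fiftieth_payment:,.0f}")
print(f"total after 50:  {totals[-1]:,.0f}")
```

Under these assumed inputs, the 50th payment is a multiple of the first. That gap, rather than the first invoice, is what a long-term budget has to absorb.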
Cloud storage providers are quick to factor so-called soft costs into the overall cost of a comparable on-site solution. Organizations should certainly consider soft costs, but certain costs exist in either scenario. For example, the cost to identify and move data to the secondary storage system is the same whether that system is in the cloud or on-premises. The cost to identify data for restoration is also the same, and recalling data from the cloud may be more expensive than recalling it from an on-premises system. Additionally, on-premises solutions can “borrow” cloud storage architecture concepts, and vendors can deliver them as a package to private data centers.
Security Concerns Continue
While the recurring cost of public cloud storage is a top concern, another concern is security. While most cloud storage providers supply encryption of data both in flight and at rest, there is an ongoing concern about encryption key ownership. If the cloud provider maintains ownership of the key or can recreate the key without the subscriber’s permission, they could obtain access to the data. Cloud providers are under increasing pressure from the federal government to provide access to even encrypted data.
Cloud Latency Impacts Performance
The final issue is the impact of network latency when interacting with the cloud. The most obvious concern arises when uploading or downloading large amounts of data. While bandwidth to and from the cloud is rapidly increasing, it is unlikely that it will ever be as fast as the private network. In addition to bandwidth, there is simply a “speed of light” issue: the farther the cloud provider is from the data center, the longer those transfers will take.
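A rough back-of-the-envelope calculation makes both effects concrete. The sketch below assumes light travels through optical fiber at roughly 200,000 km/s (about two thirds of its speed in a vacuum) and that a link delivers a fixed fraction of its rated speed. The link speeds, distance, and efficiency factor are hypothetical, and real transfers also depend on protocol overhead and congestion.

```python
SPEED_OF_LIGHT_FIBER_KM_S = 200_000  # approximate propagation speed in optical fiber


def round_trip_ms(distance_km):
    """Lower bound on round-trip time from propagation delay alone, in milliseconds."""
    return 2 * distance_km / SPEED_OF_LIGHT_FIBER_KM_S * 1000


def transfer_hours(size_tb, link_gbps, efficiency=0.8):
    """Hours needed to move size_tb (decimal TB) over a link.

    efficiency is an assumed usable fraction of the rated line speed.
    """
    bits = size_tb * 8e12
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


# Hypothetical comparison: 100 TB over a 1 Gbps WAN link versus a 10 Gbps local network.
wan_hours = transfer_hours(100, 1)
lan_hours = transfer_hours(100, 10)
print(f"WAN: {wan_hours:.0f} h, LAN: {lan_hours:.0f} h")
print(f"Minimum round trip to a provider 1,000 km away: {round_trip_ms(1000):.1f} ms")
```

The propagation figure is a floor, not an estimate. Routing, queuing, and protocol handshakes add to it, and no bandwidth upgrade removes it.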
Bringing the Cloud into the Data Center
Success in the cloud storage market is based on driving down the cost per GB and driving up the capacity managed per employee. The large cloud storage providers have achieved their success by creating scale-out storage software that runs on commodity hardware. Data centers should borrow from the success of the large cloud providers and create their own on-premises private cloud.
Most data centers have neither the budget nor the expertise to develop their own scale-out storage software. The good news is they don’t have to. There are a number of off-the-shelf software solutions that can be coupled with commodity server hardware to create a scale-out storage solution. For data centers that don’t have the time to perform even that level of integration, turnkey software/hardware bundles are available that essentially deliver a private cloud in a box. With this approach, a data center can deploy a private cloud storage solution in less than a day.
Like public cloud storage solutions, these private cloud storage solutions are object based, so they can scale to meet the demands of Internet of Things (IoT) data and other forms of machine-generated data. Some of these private cloud storage solutions also support more traditional data center protocols like CIFS, NFS, and iSCSI, which allows data centers to use the storage platform for both traditional storage demands and the demands created by new applications. As a result, private cloud storage solutions can be the consolidation point for all unstructured data from both legacy and next-generation data center use cases.
As mentioned before, most of these solutions are scale-out in nature, which means they expand by adding nodes (servers with internal storage capacity). These nodes are clustered together and managed as a single unit. Each additional node brings additional capacity and performance to the cluster and the ability to scale to meet the demands of various unstructured workloads. Additionally, since the entire storage cluster is managed as if it were one unit, a single IT administrator can manage PBs of storage.
A private cloud storage system should also be “software defined”. Software defined does not necessarily mean that the solution must be software only. Ideally, the solution should be available as software or have the option to be pre-integrated with hardware. Cloud storage software pre-integrated on an appliance enables rapid implementation. As the IT team becomes more comfortable with it, they can leverage the software only capabilities and integrate their hardware components for maximum flexibility and cost savings.
Private cloud storage can overcome most of the disadvantages of public cloud storage. While its upfront costs are higher, it becomes less expensive over time since the organization is an owner of the storage instead of a renter. Private cloud solutions have the same pay-as-you-grow capabilities as the public cloud, and because the private cloud storage system sits behind the organization’s firewall, it brings the potential for greater security. Finally, because it is on-premises, access latency is lower thanks to local network speeds instead of WAN connectivity, and there are virtually no distance issues to deal with.
A Need for Balanced Cloud Storage
Despite the advantages of a private cloud storage solution, most organizations should leverage both the private and public cloud. For organizations with less than 50 TB of data, a public cloud storage solution’s total cost of operation is typically lower than the cost of investing in a private cloud storage solution. For these organizations, the public cloud is a great place to start, assuming that security and latency are not a concern for them.
As the data set at these organizations grows, and certainly for organizations that are already beyond 50 TB of capacity, the hard cost advantages of owning versus renting become obvious. These advantages surpass the operational advantages that the public cloud storage solution may have, especially if the data center uses a private cloud storage solution, which has many of the same operational benefits.
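The owning-versus-renting comparison comes down to a break-even calculation. The following Python sketch shows one way to frame it. Every price in it is a hypothetical placeholder, the `fixed_upfront` parameter is an assumed stand-in for costs that do not scale with capacity (minimum system size, installation), and growth, hardware refresh, and staffing are left out.

```python
def break_even_month(capacity_tb, rent_per_tb_month, purchase_per_tb,
                     own_opex_per_tb_month, fixed_upfront=0.0, max_months=600):
    """Return the first month in which cumulative rent exceeds the cost of owning.

    Returns None if renting never becomes more expensive within max_months.
    All prices are illustrative assumptions.
    """
    upfront = fixed_upfront + capacity_tb * purchase_per_tb
    for month in range(1, max_months + 1):
        rented = capacity_tb * rent_per_tb_month * month
        owned = upfront + capacity_tb * own_opex_per_tb_month * month
        if rented > owned:
            return month
    return None


# Hypothetical prices: rent 20 per TB per month, buy at 300 per TB,
# 5 per TB per month to run it, plus 20,000 in fixed upfront cost.
small = break_even_month(10, 20.0, 300.0, 5.0, fixed_upfront=20_000)
large = break_even_month(500, 20.0, 300.0, 5.0, fixed_upfront=20_000)
print(f"10 TB breaks even in month {small}, 500 TB in month {large}")
```

With a fixed cost in the model, small data sets take far longer to reach break-even than large ones. That is consistent with starting in the public cloud and moving to ownership as capacity grows.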
Even after the private cloud storage system is implemented, there is still a useful place for a public cloud storage solution. One example is to use the public cloud as an extension of your private cloud, also known as hybrid cloud storage. The ability to expand your existing storage resources with public cloud storage allows an organization to treat storage as a just-in-time inventory item. It provides a buffer to support any shortfall in capacity. With hybrid cloud storage, the private cloud could temporarily place data into the public cloud until the implementation of additional on-premises capacity.
Hybrid cloud storage requires that the private cloud solution integrate with public cloud solutions. For example, the private cloud storage solution should have the S3 protocol built in, along with built-in archiving/tiering that moves data to the public cloud seamlessly.
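As an illustration of what a tiering policy might decide, the Python sketch below selects the least recently accessed objects to move off the private cloud once local usage passes a threshold. It is a hypothetical policy, not the behavior of any particular product. The `StoredObject` type, the `high_watermark` parameter, and the sample data are all assumptions, and the actual transfer, which would go through an S3-compatible API, is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class StoredObject:
    key: str
    size_tb: float
    days_since_access: int


def select_for_public_tier(objects, local_capacity_tb, high_watermark=0.9):
    """Pick the coldest objects to move until local usage is at or below the watermark.

    Returns the keys to tier out, coldest first. Hypothetical policy for illustration.
    """
    used = sum(o.size_tb for o in objects)
    target = local_capacity_tb * high_watermark
    to_move = []
    for obj in sorted(objects, key=lambda o: o.days_since_access, reverse=True):
        if used <= target:
            break
        to_move.append(obj.key)
        used -= obj.size_tb
    return to_move


# Made-up inventory on a private cloud with 100 TB of local capacity.
inventory = [
    StoredObject("sensor-archive", 40.0, 400),
    StoredObject("active-projects", 30.0, 10),
    StoredObject("video-masters", 30.0, 200),
]
print(select_for_public_tier(inventory, local_capacity_tb=100))
```

A policy like this lets the public cloud act as the capacity buffer described above. Cold data moves out first, so frequently used data keeps local network performance.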
For most organizations, public cloud storage is a great place to start but not where they should end. While there is an initial operational advantage to the public cloud, the long-term cost of renting TBs of capacity is too high. A private cloud storage solution can leverage concepts similar to those of public providers to keep hard and soft costs under control while gaining the advantages of owning vs. renting, increased security, and localized performance.
Sponsored by Cloudian
Cloudian specializes in developing enterprise-grade smart data storage. Cloudian introduced HyperStore in 2011, a breakthrough object storage platform. Cloudian HyperStore is 100% Amazon S3-compatible and enables service providers and enterprises to build reliable, affordable and scalable hybrid, private and public, cloud storage solutions. With integrated appliances and software, they provide simple building blocks to deploy scale-out, always “on” storage for enterprises and service providers worldwide.