Cloud storage is becoming a ubiquitous term, subject to a wide variety of definitions. In the context of this article, cloud storage refers to a storage service provided by an organization to storage users. That organization can be a commercial venture – a “public cloud” provider with subscribers – or a company running a “private cloud” storage service for its employees. Whether used by subscribers or employees, the storage infrastructure demands are largely identical for these two use cases.
This infrastructure needs to be a flexible, expandable system that can provide the required capacity, with good, consistent performance. On the front end, users need ‘one click’ storage growth in a pay-as-you-go format, but there’s more that this storage infrastructure needs to provide.
The Cloud as a Utility
Users now think of the internet and internet-based services as they do a public utility. As with the power company, capacity is assumed to be endless and uptime is a given. Regardless of the application, users want to see their files and expect access to them day or night. When something goes wrong, they assume they won’t be affected, and if they are, they expect prompt, efficient resolution. And they never expect to lose data; after all, most cloud storage services provide data protection in one form or another, so they represent the last line of defense for users.
For the organization/provider, expectations are even greater. They want simple operation of the storage infrastructure and effective management of the front-end billing to subscribers or charge-back to employees. They expect a robust API and ISV ecosystem to make service as easy as possible. They also want strong management and monitoring capabilities, like user authentication and account metering for clients, subscribers, departments or employees.
The user may treat cloud storage like it was a public utility, with unlimited capacity and always-on service – but the organization behind the cloud has to make that happen. This means maintaining data integrity, providing unfailing data protection and delivering a consistent quality of service. And these assurances must be maintained as the infrastructure expands; the cloud can’t outgrow its service levels.
As a ‘storage utility’, capacity must be available to users on demand, which means administrators must be able to scale storage on the fly, transparently to users. Provisioning more capacity should be instantaneous, so that one administrator can support hundreds of users and petabytes of storage. The infrastructure must also remain viable throughout the long life of today’s data. This means automatic data migration, or data-in-place migration, to accommodate advances in hardware without impacting service to users. And it must support multi-tenant configurations so that each user’s data remains secure on shared physical storage.
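The provisioning model described above can be sketched in a few lines. This is a minimal, hypothetical illustration (the `TenantStore` class and its quota fields are inventions for this sketch, not any vendor's API): growing a tenant's capacity is just a metadata update in an isolated per-tenant namespace, which is why it can be instantaneous and invisible to other tenants.

```python
import uuid

class TenantStore:
    """Toy model of on-demand, multi-tenant capacity provisioning.
    Each tenant gets an isolated namespace and a quota that an
    administrator can grow instantly, without a data migration."""

    def __init__(self):
        self._quotas = {}   # tenant id -> provisioned capacity in GB
        self._used = {}     # tenant id -> GB consumed so far

    def add_tenant(self, quota_gb):
        tenant = str(uuid.uuid4())   # opaque id isolates each tenant
        self._quotas[tenant] = quota_gb
        self._used[tenant] = 0
        return tenant

    def grow_quota(self, tenant, extra_gb):
        # 'One click' growth: only metadata changes, no data moves.
        self._quotas[tenant] += extra_gb

    def consume(self, tenant, gb):
        # Account metering: reject writes that exceed the quota.
        if self._used[tenant] + gb > self._quotas[tenant]:
            raise RuntimeError("quota exceeded")
        self._used[tenant] += gb
```

In a real system the quota table would live in a replicated metadata service, but the shape of the operation is the same: provisioning touches bookkeeping, not stored data.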
In order to meet the cost expectations of public cloud providers and the ROI requirements of companies running private clouds, cost per user and per GB of storage is critical. Here, most legacy storage infrastructures are at a severe disadvantage. Instead of proprietary hardware, scale-out, software-based systems can leverage non-proprietary, commercial off-the-shelf components to keep costs down. As software solutions, they can run on any available server and storage hardware, and storage nodes can be easily upgraded as capabilities increase and prices come down.
User front end
Rather than requiring users to install an agent or a separate piece of client software specific to the cloud storage service, these solutions should offer a web-based user interface (UI). Access through a standard browser can then provide a self-service experience in which users add storage and manage their own data. These cloud storage infrastructures should also support API-based interfaces so that independent software vendors can efficiently integrate applications or utilities, enriching the user experience and even eliminating the need for a web browser altogether.
For the cloud service provider or private cloud administrator these scale-out cloud solutions should provide web-based management of tenants and easy integration to billing and authentication systems with an API-based interface. They should also offer extensive support for third-party software providers and give cloud administrators the ability to script repetitive processes.
On the back end, this cloud infrastructure needs to leverage the advantages of object-based storage. An object-based architecture uses a flat index of data-object identifiers, which enables virtually unlimited scale under a single namespace, simplifying management while making data portable: all that’s needed to access the data, regardless of its location, is the object identifier. In contrast, file system architectures use a hierarchical structure of directories and inodes, which ultimately leads to limits in scalability, fragmentation of the namespace and confined access. The simple “put”, “get” and “delete” commands common in object-based storage systems also make integration into existing applications significantly easier.
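The put/get/delete model above can be made concrete with a small sketch. This is a generic in-memory illustration, not CAStor's actual interface: the point is that the entire "namespace" is one flat map from opaque identifiers to data, with no directories to fragment or traverse.

```python
import uuid

class ObjectStore:
    """Minimal in-memory model of a flat, object-based namespace:
    data is addressed by an opaque identifier rather than a
    directory path, so the index stays flat as it grows."""

    def __init__(self):
        self._index = {}   # object id -> bytes: the single flat index

    def put(self, data: bytes) -> str:
        # The store assigns a location-independent identifier;
        # the caller keeps it as the sole handle to the data.
        oid = uuid.uuid4().hex
        self._index[oid] = data
        return oid

    def get(self, oid: str) -> bytes:
        return self._index[oid]

    def delete(self, oid: str) -> None:
        del self._index[oid]
```

Because the interface is just three verbs keyed by an identifier, an application integrates with it the way it would with a dictionary, rather than negotiating paths, mounts and permissions as it would with a file system.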
For any cloud storage solution, data protection should be the most important service offered. This is often challenging given the wide range of file sizes, exponentially growing capacities and the unpredictable growth that comes with providing a multi-tenant, multi-use-case service. Some storage solutions are built to protect large files and large capacities but are inefficient at storing small files and small capacities; the converse is true as well.
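One common protection scheme in object stores is replication, which can be sketched as follows. This is a hedged, simplified model (the `ReplicatedStore` class and its hash-based placement are assumptions for illustration, not a description of any specific product): each object is written to multiple nodes, so the loss of any one node does not lose data.

```python
class ReplicatedStore:
    """Sketch of replica-based data protection: every object is
    written to `replicas` distinct nodes, so a single node failure
    never makes the object unreadable."""

    def __init__(self, node_count, replicas=2):
        self.nodes = [dict() for _ in range(node_count)]
        self.replicas = replicas

    def put(self, oid, data):
        # Place copies on `replicas` distinct nodes, chosen by
        # hashing the identifier so placement needs no central map.
        start = hash(oid) % len(self.nodes)
        for i in range(self.replicas):
            self.nodes[(start + i) % len(self.nodes)][oid] = data

    def get(self, oid):
        # Any surviving replica satisfies the read.
        for node in self.nodes:
            if oid in node:
                return node[oid]
        raise KeyError(oid)

    def fail_node(self, index):
        self.nodes[index] = {}   # simulate losing a node's contents
```

Replication is simple and efficient for small objects; for very large objects, production systems often switch to erasure coding to cut the capacity overhead, which is exactly the size-dependent trade-off the paragraph above describes.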
The Ideal Solution
In order to meet these requirements, service providers and organizations looking to offer cloud storage need more than a legacy storage system. They need a complete solution – the complete stack – ideally from a single manufacturer. Caringo, for example, provides object storage software (CAStor) as the foundational cloud storage architecture, along with a cloud storage API and admin/user interface (CloudScaler). By controlling the complete stack, they can optimize every part of the solution and keep costs down, since users don’t have to license each piece separately.
Cloud storage is an easy concept to talk about but a difficult thing to implement. Delivering a cloud storage service means providing utility-like scale and uptime in virtually unlimited capacity, while keeping data safe and costs under control. Traditional block storage arrays and NAS systems can’t scale large enough or provide the economics. What’s needed is an infrastructure with scale-out storage hardware, an object-based architecture and a comprehensive web-based management and user interface. Caringo’s CloudScaler with CAStor integrates all these components into a single stack to simplify operation and ensure resource optimization throughout the stack.
Sponsored by Caringo