For organizations with multiple offices where employees need access to the same data at the same time, leveraging the cloud for unstructured data storage makes sense. Not only does it help with data distribution, it also eases data management and provides a powerful business continuity plan for unstructured data. But public cloud storage by itself is not enough. In addition to the obvious concern over latency, there are also the issues of data distribution and overwrite protection.
Solving the Latency Problem
The solution to the latency problem is to cache the active and even near-active data set locally. A relatively small cache (10% of data) can make cloud retrievals a rare event. But this on-premises appliance has to do more than simply cache data and make a connection to the cloud. It needs to provide the latest NFS and SMB protocol support, integrate with Active Directory and take advantage of the latest hardware advances like NVMe-based flash.
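The caching idea can be illustrated with a minimal sketch of a least-recently-used (LRU) read cache sitting in front of a cloud store. LRU is an assumption made for illustration, as are the `EdgeCache` class and the `fetch_from_cloud` callback; real appliances may use different eviction and prefetch policies.

```python
from collections import OrderedDict
from typing import Callable


class EdgeCache:
    """Hypothetical LRU read cache in front of a slower cloud object store."""

    def __init__(self, capacity_bytes: int, fetch_from_cloud: Callable[[str], bytes]):
        self.capacity_bytes = capacity_bytes
        self.fetch_from_cloud = fetch_from_cloud  # assumed callback: path -> file bytes
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()
        self._used = 0
        self.hits = 0
        self.misses = 0

    def read(self, path: str) -> bytes:
        if path in self._entries:
            # Cache hit: mark the file as most recently used and serve it locally.
            self._entries.move_to_end(path)
            self.hits += 1
            return self._entries[path]
        # Cache miss: pay the cloud round trip once, then keep a local copy.
        self.misses += 1
        data = self.fetch_from_cloud(path)
        self._store(path, data)
        return data

    def _store(self, path: str, data: bytes) -> None:
        if len(data) > self.capacity_bytes:
            return  # larger than the whole cache: serve it but do not cache it
        # Evict least recently used files until the new one fits.
        while self._used + len(data) > self.capacity_bytes:
            _, evicted = self._entries.popitem(last=False)
            self._used -= len(evicted)
        self._entries[path] = data
        self._used += len(data)

    def cached_paths(self) -> list:
        return list(self._entries)
```

If users keep returning to the same working set, most reads are served from `_entries` and only the first access to a file reaches the cloud, which is why a cache much smaller than the full data set can still hide most of the latency.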
Distributing Data
The next requirement is that the solution must enable the placement of appliances in multiple locations. But these appliances need to be managed centrally and intelligently. Since they are caches, they may hold some of the same data as other caches throughout the organization, or they may hold the only cached copy of a file.
The software needs to manage the distribution of data. If there is a file that multiple offices need, then it needs to be placed in multiple caches. If a user travels from the New York office to the London office, their most recent files should be cached in the London office the moment they log in.
The software also needs to adhere to data sovereignty laws. If a European Union office, for example, has data the law does not allow the office to store in a U.S. office, the software has to have a policy in place that makes sure data does not end up in a U.S. cache.
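The distribution and sovereignty rules above can be sketched as a small placement policy. The `Site` and `FileRecord` types, the region labels and the rule that an empty `allowed_regions` set means "no restriction" are all assumptions made for illustration, not a description of any particular product.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Site:
    name: str
    region: str  # hypothetical label, e.g. "EU" or "US"


@dataclass(frozen=True)
class FileRecord:
    path: str
    # Regions where the file may be cached; an empty set means unrestricted (assumption).
    allowed_regions: FrozenSet[str] = frozenset()


def may_cache(file: FileRecord, site: Site) -> bool:
    """Return True if policy allows this file in the cache at this site."""
    return not file.allowed_regions or site.region in file.allowed_regions


def placement_targets(file: FileRecord, interested_sites: Iterable[Site]) -> List[Site]:
    """Sites that need the file and are permitted to hold a cached copy."""
    return [site for site in interested_sites if may_cache(file, site)]


def prefetch_on_login(recent_files: Iterable[FileRecord], login_site: Site) -> List[str]:
    """Paths to warm in the cache of the office where a user just logged in."""
    return [f.path for f in recent_files if may_cache(f, login_site)]
```

In this sketch, a user who normally works in New York and logs in at London would have `prefetch_on_login` called with London as `login_site`, while a file restricted to `{"EU"}` would never be returned as a target for a U.S. site.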
From the user's point of view, the system, regardless of the number of offices and caches, needs to appear to be a single unified NAS. Users don’t want to have to think about which location they need to log in to in order to reach their data. They should be able to access anything they are authorized to access. The solution needs to provide a global namespace that spans all of its offices and the cloud so that the user sees what appears to be a single NAS environment.
Content Control
The more collaborative the environment, the more important it is that the solution can control content so that changes at one site don’t overwrite changes at another. The solution needs to provide a distributed, global file lock. If two users in different offices access the same file, the second person to open it gets a notification that another user already has the file open.
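The locking behavior described above can be sketched with a single in-memory lock table standing in for a service that every site consults. The `GlobalLockService` class and its method names are hypothetical. A real distributed lock would also need to handle network failures, lock expiry and crashed clients, which this sketch ignores.

```python
import threading
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class LockHolder:
    user: str
    site: str


class GlobalLockService:
    """In-memory stand-in for a central lock service shared by all sites."""

    def __init__(self) -> None:
        self._locks: Dict[str, LockHolder] = {}
        self._mutex = threading.Lock()  # protects the lock table itself

    def open_for_write(self, path: str, user: str, site: str) -> Optional[LockHolder]:
        """Grant the lock and return None, or return the current holder if taken."""
        requester = LockHolder(user, site)
        with self._mutex:
            holder = self._locks.get(path)
            if holder is None or holder == requester:
                self._locks[path] = requester
                return None
            return holder  # caller can notify the second user who has the file

    def close(self, path: str, user: str, site: str) -> bool:
        """Release the lock; only the current holder may release it."""
        with self._mutex:
            if self._locks.get(path) == LockHolder(user, site):
                del self._locks[path]
                return True
            return False
```

When `open_for_write` returns a `LockHolder`, the appliance at the second site has what it needs to tell the user who has the file open and where, instead of letting both copies diverge.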
Take it Private
The cloud is an ideal way to distribute data, but some organizations may not want to have their data in the public cloud. For others, the cost of “renting” storage capacity becomes too expensive over time. In either case it may make sense for the organization to own its cloud. On-premises object storage systems can provide many of the benefits of public cloud storage.
The solution needs to work with private clouds in addition to a variety of public cloud providers so the organization can choose the environment that makes the most sense for it.
StorageSwiss Take
Cloud storage (public or private) is an excellent way to distribute data. It is ideal for unstructured data that users in multiple locations need to collaborate on. But the actual storage is only half the solution. The other half is the intelligence to manage the distribution of data and to ensure that a user does not accidentally overwrite someone else's changes.
To learn more about how cloud storage can help solve an organization’s unstructured data problem, watch Storage Switzerland’s on demand webinar, “Overcoming the Top 3 Challenges of the Storage Status Quo”.