The cloud is an ideal storage area for the 90 percent of an organization’s data that is not active. Moving this data to the cloud reduces on-premises storage requirements and frees up data center floor space. The problem is that most solutions just dump this data in the cloud, taking no responsibility for how that cloud-based data is retained or leveraged in the future.
Unlike archiving, which hopes that data will never be accessed again, data preservation assumes it will be. When that access occurs, the organization needs confidence that the data is readable and that it can prove a chain of custody. Cloud-based data preservation should also take advantage of the fact that most cloud storage providers are also compute providers. It should be able to move data into the applications these providers offer so that the organization can use cloud compute to analyze, search and convert the preserved data as needed.
Without the ability to use cloud applications to process and manage data preserved in the cloud, organizations are forced to move the data back on-premises to get value out of it. This means they can never really reduce their on-premises storage investment. By using the capabilities the cloud already offers, organizations can apply almost limitless cloud compute to derive answers faster and eliminate the need to move data back on-premises.
Introducing CloudLanes
CloudLanes provides a cloud preservation solution. The company installs a virtual appliance that provides a POSIX-like file system it calls EdgeFS. EdgeFS can receive data from NFS, SMB, VTL and S3 data stores. iSCSI and FC data storage can write data to a virtual tape library. Once stored, data is replicated to a cloud provider like Amazon, Azure, or Google. After the initial replication is complete, EdgeFS uses a transaction log to send transactional updates to the provider. All data can be compressed and encrypted before being stored and transferred.
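To make the transaction-log step concrete, here is a minimal Python sketch of log-based incremental replication. It is not CloudLanes code: the `LogEntry` and `Replica` types, the `replay` function and the in-memory dictionary standing in for a cloud bucket are all hypothetical, and the sketch only assumes that each change carries an increasing sequence number so the replica can skip what it has already applied.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LogEntry:
    seq: int                      # monotonically increasing sequence number
    op: str                       # "put" or "delete"
    path: str
    data: Optional[bytes] = None  # payload for "put", unused for "delete"


@dataclass
class Replica:
    """Hypothetical stand-in for a cloud bucket plus a replication cursor."""
    objects: dict = field(default_factory=dict)
    last_applied: int = 0         # highest sequence number already applied


def replay(log, replica):
    """Apply only the entries the replica has not seen; return how many."""
    applied = 0
    for entry in sorted(log, key=lambda e: e.seq):
        if entry.seq <= replica.last_applied:
            continue  # already replicated, so re-sending the log is harmless
        if entry.op == "put":
            replica.objects[entry.path] = entry.data
        elif entry.op == "delete":
            replica.objects.pop(entry.path, None)
        else:
            raise ValueError(f"unknown operation: {entry.op}")
        replica.last_applied = entry.seq
        applied += 1
    return applied


# Illustrative log: two files written, one rewritten, one deleted.
log = [
    LogEntry(1, "put", "/scans/a.tif", b"v1"),
    LogEntry(2, "put", "/scans/b.tif", b"v1"),
    LogEntry(3, "put", "/scans/a.tif", b"v2"),
    LogEntry(4, "delete", "/scans/b.tif"),
]
replica = Replica()
first_pass = replay(log, replica)   # applies all four entries
second_pass = replay(log, replica)  # nothing new to apply
```

The cursor is what makes the updates incremental: after the initial copy, only entries with a sequence number above `last_applied` need to cross the wire, and replaying the same log twice leaves the replica unchanged.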
CloudLanes has also developed a cloud-based data verification service that checks the integrity of data once it has been transferred. The service covers the cloud “back door” scenario, where malicious or unintended modifications or deletions happen directly in the cloud account(s), outside the knowledge of the application or the CloudLanes file system. It can be run on demand or on a schedule to catch transmission errors, bit rot and malicious actions, both for source data on its way to the cloud and for data already being preserved there.
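One common way to build this kind of check is a digest manifest: record a cryptographic hash of every object at ingest, then periodically re-hash what is actually in the cloud account and compare. The Python sketch below illustrates that general technique only. How the CloudLanes service works internally is not described here, and the function names and the dictionary standing in for a cloud bucket are assumptions made for the example.

```python
import hashlib


def digest(data: bytes) -> str:
    """SHA-256 hex digest of an object's contents."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(objects):
    """Record a digest per object at ingest time, before anything can drift."""
    return {key: digest(data) for key, data in objects.items()}


def verify(manifest, stored):
    """Compare what is in the bucket now against the ingest-time manifest."""
    report = {"modified": [], "missing": [], "unexpected": []}
    for key, expected in manifest.items():
        if key not in stored:
            report["missing"].append(key)      # deleted out of band
        elif digest(stored[key]) != expected:
            report["modified"].append(key)     # corrupted or tampered with
    report["unexpected"] = [k for k in stored if k not in manifest]
    for keys in report.values():
        keys.sort()
    return report


# Ingest three objects and remember their digests.
ingested = {"a": b"alpha", "b": b"beta", "c": b"gamma"}
manifest = build_manifest(ingested)

# Simulate "back door" changes made directly in the cloud account.
cloud = dict(ingested)
cloud["a"] = b"ALPHA"   # silent modification
del cloud["b"]          # deletion
cloud["d"] = b"delta"   # object the file system never wrote

report = verify(manifest, cloud)
```

Because the manifest is kept apart from the bucket it describes, a change made directly in the cloud account shows up as a mismatch on the next scheduled run, even though the file system never saw it happen.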
Data in the cloud is stored in the CloudLanes File System (CLFS), which provides a chain of custody in the cloud. One of the key attributes of CLFS is its ability to transfer data into cloud-hosted applications. For example, data could be pushed into a cloud-based object store and processed by a cloud-hosted Hadoop cluster using CloudLanes’ cloud connectors. If the data is image or video based, a cloud-based image recognition engine can analyze it, again through the connectors. If the data consists of documents, it can be fed into an application like Elasticsearch to provide context-sensitive search.
The platform also gives users control over what data goes to the cloud, and when, via user-specified settings.
No Lock-In
Another challenge with both on-premises and cloud-based archive or preservation systems is vendor lock-in. How does an organization get its data out of a provider’s environment? CloudLanes can both migrate and mirror data between clouds. Its software also has an export function, so the organization isn’t locked into CloudLanes either.
StorageSwiss Take
The vast majority of cloud archive solutions assume that when you need access to cloud-based data, you will de-migrate it from the cloud. The problem with the de-migration strategy is that moving data back on-premises may take a lot of time, especially if the request is large. The organization also needs to keep on-premises storage budgeted for a cloud recall. Cloud preservation solutions like CloudLanes take a different approach: by leveraging cloud-based applications, the organization may never need to recall data from the cloud. It also gets the ability to move data between clouds, eliminating the vendor lock-in concern.
The CloudLanes platform doesn’t force you to migrate all data to the cloud, however; the user has control over what data goes to the cloud and when. The ability to feed this data directly into available cloud-based applications via the connectors may be valuable for enterprises that want insights, trends or other value-added services beyond data preservation in the cloud.