Businesses continue to migrate a growing percentage of their workloads and data to off-premises cloud services, but an off-premises model is not the best fit for every workload. Application requirements in areas such as performance, compliance and elasticity are diverging, leading IT to turn to hybrid cloud infrastructures that mix on- and off-premises resources to meet service level agreements (SLAs) most effectively.
One challenge that emerges during the shift to hybrid cloud is “data gravity.” Whereas compute instances can be started and stopped across infrastructure nearly instantaneously, data cannot be moved as seamlessly because it physically resides on storage media and must be copied across the network. This is not typically an issue in an exclusively on-premises or exclusively off-premises model, because the locations of compute and data are relatively static. In hybrid cloud architectures, however, it becomes a significant pain point as IT works to enable data portability across a multitude of cloud services, largely because of metadata.
Metadata is, essentially, data about data (file location or type, dates accessed and modified, etc.). Because all storage systems create metadata and because it is foundational to data access, metadata, although small in size, accounts for a majority (more than two-thirds) of data transfer I/O requests. At the same time, it is typically deeply embedded in the operating system, requiring the system to dig through many layers of the file to access the metadata, and then to do the reverse to respond to the request. As a result, metadata is highly sensitive to latency.
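To make "data about data" concrete, the short Python sketch below reads a file's metadata without reading its contents. It is illustrative only: the `describe` helper and the temporary sample file are assumptions made for this example, and `os.stat` is simply the standard library's way of asking the operating system for these fields.

```python
import os
import tempfile
import time


def describe(path):
    """Return a few metadata fields for path without reading its contents."""
    info = os.stat(path)  # one metadata request to the operating system
    return {
        "size_bytes": info.st_size,
        "modified": time.ctime(info.st_mtime),
        "accessed": time.ctime(info.st_atime),
    }


# Create a throwaway 4 KiB file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"x" * 4096)
    sample_path = handle.name

sample_metadata = describe(sample_path)
print(sample_metadata)

os.remove(sample_path)  # clean up the sample file
```

Every directory listing, permission check, or "has this file changed?" query is a request of this kind, which is one way to see why such small records can account for so much of the I/O traffic.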
Consequently, metadata becomes a performance bottleneck in the public cloud, which introduces even more latency: each I/O operation must wait for the cloud network and processing resources to respond to the request. In a hybrid cloud environment, this pain point is exacerbated because each infrastructure tier delivers a very different level of performance. Additionally, most solutions require the entire file to be retrieved from the cloud to satisfy a single metadata request, which can drive up data access and egress fees in a cloud model.
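A back-of-the-envelope model shows how per-request latency adds up. The Python sketch below is a rough illustration, not a benchmark: the workload size and both round-trip times are hypothetical values chosen for the example, and the model assumes metadata requests are issued one after another with no caching or parallelism. Only the two-thirds share comes from the figure cited above.

```python
def metadata_wait_seconds(total_requests, metadata_share, round_trip_ms):
    """Time spent waiting on metadata if each request pays one full round trip."""
    metadata_requests = total_requests * metadata_share
    return metadata_requests * round_trip_ms / 1000.0


TOTAL_REQUESTS = 900_000  # hypothetical workload size
METADATA_SHARE = 2 / 3    # lower bound implied by "more than two-thirds"
LOCAL_RTT_MS = 0.5        # assumed on-premises round trip, in milliseconds
CLOUD_RTT_MS = 30.0       # assumed public cloud round trip, in milliseconds

local_wait = metadata_wait_seconds(TOTAL_REQUESTS, METADATA_SHARE, LOCAL_RTT_MS)
cloud_wait = metadata_wait_seconds(TOTAL_REQUESTS, METADATA_SHARE, CLOUD_RTT_MS)

print(f"on-premises: {local_wait:.0f} s, cloud: {cloud_wait:.0f} s")
```

Under these assumed numbers, the same 600,000 metadata requests wait roughly 300 seconds on premises and roughly 18,000 seconds against the cloud tier. Real systems overlap and cache requests, so the absolute figures will differ, but the proportional gap between tiers is the point.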
It is far easier to move operations to the cloud than it is to move data there. Metadata has a substantial impact on performance, yet it is often overlooked by IT and by hybrid cloud storage vendors alike. To learn more about this dynamic, and about how abstracting metadata into the network can solve this pain point, access Storage Switzerland’s recent webinar with Infinite IO, The Hybrid Cloud Data Gravity Problem and How to Fix It, on demand.