Object storage is set to become THE way organizations store unstructured data. As this transition occurs, those same organizations expect more from object storage than just a “cheap and deep” way to store information. They expect the system to deliver data as fast as their analytics applications want it. The problem is that, in terms of performance, most object storage systems are sorely lacking. The transition to high performance object storage will require more than simply throwing flash at the problem. The underlying object storage software needs to change.
More than Flash
Our entry “The Need for Speed – High Performance Object Storage” shows that the decision to use flash for object storage is supported by faster time to results and increased density. The problem is that “just throwing flash at the problem” will lead to less than desirable outcomes. The key to optimizing a flash investment is making sure the rest of the storage infrastructure does not add back the latency that flash removes. This is a particular problem for many object storage systems.
The Object Storage Bottlenecks to Flash Performance
One of the key inhibitors to maximizing flash performance is one of object storage’s biggest advantages: its rich metadata capabilities. The problem is that managing object storage metadata can create significant latency. Organizations looking for flash performance from their object store need to make sure that the system manages metadata efficiently and can store metadata catalogs on flash or even DRAM.
A second key inhibitor is data protection. Most object stores use either replication or erasure coding to protect against a media or node failure. The problem is that these schemes consume CPU power, the same CPU power the object storage system uses to run other aspects of its software, like metadata management. Vendors need to rethink their protection protocols so they are both network efficient and storage CPU efficient. The key is to limit the re-transmission of redundant data as much as possible.
A third key inhibitor is lack of hardware flexibility. In the end, object storage software is at the mercy of the processing power of the hardware it runs on. The problem is that most object storage systems are tied to a specific hardware relationship and are slow to adopt new, more powerful processors. To keep up with technological advancements, like flash storage, it is important to select object storage software that can run on a variety of physical server hardware.
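To make the cost of redundant data concrete, here is a minimal sketch that compares how many raw bytes each protection scheme writes for a given object. The class names, the three-copy replication setting, and the 10+4 erasure code layout are illustrative assumptions, not parameters of any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replication:
    """Hypothetical model of N-way replication."""
    copies: int  # total copies stored, including the first

    def raw_bytes(self, object_bytes: int) -> float:
        # Every copy is a full copy of the object.
        return float(object_bytes * self.copies)

    def failures_tolerated(self) -> int:
        return self.copies - 1


@dataclass(frozen=True)
class ErasureCode:
    """Hypothetical model of a k+m erasure code."""
    data_fragments: int    # k
    parity_fragments: int  # m

    def raw_bytes(self, object_bytes: int) -> float:
        # The object is split into k fragments and m parity fragments are added.
        total = self.data_fragments + self.parity_fragments
        return object_bytes * total / self.data_fragments

    def failures_tolerated(self) -> int:
        return self.parity_fragments


if __name__ == "__main__":
    one_gib = 1024 ** 3
    for scheme in (Replication(copies=3), ErasureCode(10, 4)):
        redundant = scheme.raw_bytes(one_gib) - one_gib
        print(scheme,
              f"redundant bytes per GiB: {redundant:.0f}",
              f"failures tolerated: {scheme.failures_tolerated()}")
```

Under these assumptions, three-way replication writes two full extra copies of every object, while a 10+4 code writes 40 percent extra and tolerates more failures, at the price of encoding work on the CPU.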
StorageSwiss Take
High performance object storage makes analytics more valuable by delivering rapid answers to requests. Object storage, powered by flash and supported by software optimized for it, enables more frequent analysis across a broader spectrum of data, which leads to more insightful and accurate results.
Well, Mr. Crump raises some interesting issues with regard to object storage performance. Let’s start with replication and erasure coding. The parameters for replication and erasure coding can be varied based on how much durability is needed to protect against data loss. Writing and reading data using replication is generally faster than erasure coding. Bucket policies can be established to initially replicate data and then erasure code it after a period of time. It is hard to see any technical improvement in replication, but erasure coding does allow for improvement using approaches like hierarchical erasure codes.
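A bucket policy of the kind described above, replicate first and erasure code later, might be sketched as follows. The function name, the scheme labels, and the 30-day window are hypothetical and are not taken from any vendor's API.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical labels for the two protection schemes.
REPLICATE = "replicate"
ERASURE_CODE = "erasure_code"


def protection_for(created_at: datetime,
                   now: datetime,
                   replicate_for: timedelta = timedelta(days=30)) -> str:
    """Return the protection scheme a bucket policy would apply to an object.

    Objects younger than `replicate_for` stay replicated for faster reads
    and writes; older objects are erasure coded to save capacity.
    The 30-day default is an assumption chosen only for illustration.
    """
    age = now - created_at
    return REPLICATE if age < replicate_for else ERASURE_CODE


if __name__ == "__main__":
    now = datetime(2016, 11, 1, tzinfo=timezone.utc)
    fresh = datetime(2016, 10, 25, tzinfo=timezone.utc)
    old = datetime(2016, 9, 1, tzinfo=timezone.utc)
    print(protection_for(fresh, now))
    print(protection_for(old, now))
```

A background task could apply a check like this to each object and re-encode those whose answer has changed since they were written.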
With regard to network performance, all object storage systems can locate data objects very quickly. Production storage servers generally use dual 10 GbE interfaces, which are recommended by most object storage software vendors. The option to use Ethernet speeds beyond 10 GbE may appear in reference architecture designs for object storage clusters in the future.
Storage servers tend not to be CPU-bound, and most of them can do quite well with a single multi-core CPU. They do need an adequate amount of RAM, which is usually sized according to the storage capacity of the server. Performance in object storage clusters improves as the number of storage servers in the cluster increases.
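As a back-of-the-envelope check on what dual 10 GbE means for a flash-based storage server, the following sketch computes the aggregate line rate and how many drives it takes to fill it. The per-device throughput is a placeholder assumption, and protocol overhead is ignored.

```python
import math


def line_rate_bytes_per_sec(ports: int, gbit_per_port: float) -> float:
    """Aggregate Ethernet line rate in bytes per second, ignoring protocol overhead."""
    return ports * gbit_per_port * 1e9 / 8


def devices_to_saturate(ports: int, gbit_per_port: float,
                        device_mb_per_sec: float) -> int:
    """Number of drives whose combined sequential throughput fills the network links.

    `device_mb_per_sec` is an assumed figure supplied by the caller,
    not a measured or vendor-published value.
    """
    device_bytes_per_sec = device_mb_per_sec * 1e6
    return math.ceil(line_rate_bytes_per_sec(ports, gbit_per_port) / device_bytes_per_sec)


if __name__ == "__main__":
    rate = line_rate_bytes_per_sec(ports=2, gbit_per_port=10)
    print(f"dual 10 GbE line rate: {rate / 1e9:.1f} GB/s")
    # 500 MB/s per device is an assumption for illustration only.
    print("devices to saturate:", devices_to_saturate(2, 10, device_mb_per_sec=500))
```

Plugging in the throughput of the drives actually under consideration shows quickly whether the network or the media would be the limiting factor for a given server.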
Object storage software is almost universally hardware agnostic. Storage clusters can comprise a heterogeneous mix of new and old servers. Being storage hardware agnostic means that new and replaced storage servers can have the latest and greatest in storage server capabilities and performance. Few, if any, object storage software vendors limit or constrain their deployment to a specific vendor’s hardware. Many have partnerships with hardware ODMs but not exclusive relationships.
When new storage server architectures are developed, object storage software vendors will adapt their software to take advantage of them, provided that the new architectures are not proprietary to a particular vendor. It will be interesting to see how object storage software vendors respond to emerging flash memory architectures.