Should Object Storage Systems use a Filesystem? – DDN WOS Briefing Note

Object storage systems store objects and file systems store files. While an object and a file are typically the same data, they are stored differently, and that difference is what separates an object storage system from a file system. So why, then, do many object storage systems use a file system underneath?

Using a file system underneath an object storage interface doesn’t make the product a file system, any more than putting an NFS interface on top of an object storage system does. But why would a storage system written to store objects keep those objects on a file system, when many of the limitations that object storage systems are trying to overcome are artifacts of the file system architecture?

The answer to this question is probably simplicity and time-to-market. A lot of object storage systems are built on commodity hardware and use a commodity Linux distribution, upon which they run their object storage system. That Linux distribution will come with a built-in file system, so it’s easier and quicker to build an object storage system on top of a file system. It could also be said that a file system confined to one node does not have quite the same limitations as a single file system that spans dozens or hundreds of nodes.

That is not to say that there are not efficiencies to be had by stripping out the file system and writing to disks natively. This is why it was interesting to learn that DDN’s Web Object Scaler (WOS) object storage system uses NoFS instead of a file system. They claim that this results in an immediate 15 to 20% reduction in storage overhead, reducing acquisition costs, rack space, and power and cooling costs. They also claim that it requires 4 to 10 times fewer IOPs for each read and write. That could have significant performance advantages. A DDN system can start as small as a two-node cluster, which can then be grown to a cluster of 256 nodes. Up to 32 clusters can be combined into a single namespace that can hold hundreds of billions of objects. These statistics apply to both their hardware versions and their software-only version, which DDN also announced now accounts for 10% of their object storage revenue. This is significant since they only started marketing this product a year ago.
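
To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. The cluster limits and the claimed ranges come from the briefing; everything else is an assumption made for illustration: the helper names, the 150 IOPs per disk, the 10 disk I/Os per object operation with a file system, the 1,000 TB starting point, and the reading of "15 to 20% reduction in storage overhead" as 15 to 20% less raw capacity for the same data. None of it is DDN's own math, and the vendor claims remain unverified.

```python
# Figures quoted in the briefing (vendor claims, not independently verified).
MAX_NODES_PER_CLUSTER = 256
MAX_CLUSTERS_PER_NAMESPACE = 32


def max_nodes_per_namespace():
    """Largest node count a single namespace could span under the quoted limits."""
    return MAX_NODES_PER_CLUSTER * MAX_CLUSTERS_PER_NAMESPACE


def raw_capacity_needed(raw_tb_with_fs, overhead_reduction):
    """Raw capacity needed without a file system.

    Assumption: the claimed overhead reduction translates directly into
    that fraction less raw capacity for the same stored data.
    """
    return raw_tb_with_fs * (1.0 - overhead_reduction)


def object_ops_per_second(disk_iops, ios_per_op_with_fs, reduction_factor):
    """Object operations one disk can sustain per second.

    disk_iops and ios_per_op_with_fs are hypothetical inputs;
    reduction_factor is the claimed "N times fewer IOPs" (4 to 10).
    """
    ios_per_op = ios_per_op_with_fs / reduction_factor
    return disk_iops / ios_per_op


if __name__ == "__main__":
    print("Max nodes per namespace:", max_nodes_per_namespace())  # 256 * 32 = 8192

    # Hypothetical deployment that needs 1,000 TB raw with a file system.
    for reduction in (0.15, 0.20):
        print(f"{reduction:.0%} reduction ->",
              round(raw_capacity_needed(1000, reduction)), "TB raw")

    # Hypothetical disk: 150 IOPs, 10 disk I/Os per object op with a file system.
    for factor in (1, 4, 10):
        print(f"{factor}x fewer IOPs ->",
              object_ops_per_second(150, 10, factor), "object ops/s")
```

Under these assumptions the same disk goes from 15 object operations per second to somewhere between 60 and 150, which is why the IOPs claim matters more than the capacity claim if it holds up.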

StorageSwiss Take

Having a file system underneath an object storage system does result in storage and performance inefficiencies, so it is commendable that DDN’s object storage does not use a file system. Disk is never free and IOPs are never free, so a 15 to 20% reduction in file system overhead and a 4-10X improvement in IOPs are both significant. Other object storage vendors should take note.

W. Curtis Preston (aka Mr. Backup) is an expert in backup & recovery systems, a space he has been working in since 1993. He has written three books on the subject: Backup & Recovery, Using SANs and NAS, and Unix Backup & Recovery. Mr. Preston is a writer and has spoken at hundreds of seminars and conferences around the world. Preston’s mission is to arm today’s IT managers with truly unbiased information about today’s storage industry and its products.

Posted in Briefing Note
2 comments on “Should Object Storage Systems use a Filesystem? – DDN WOS Briefing Note”
  1. Tim Wessels says:

    Well, Mr. Curtis might explain where DDN’s NoFS architecture came from for starters. He mentioned it like everyone should know all about it. A casual Google search doesn’t turn up much but does score some hits for the No Order File System, which is not the same thing as the NoFS disk architecture DDN mentions on its website. That said, Caringo has a similar approach in that they don’t use a file system and do claim, like DDN, to get high utilization percentages out of the HDDs in a Caringo Swarm cluster.

    Many object storage vendors use Linux as the operating system for the nodes in their storage clusters. Caringo uses Debian on their boot devices. Cloudian HyperStore uses CentOS on its cluster storage nodes. Cleversafe has their own Linux variant called ClevOS. SwiftStack supports CentOS, Red Hat, Oracle Linux, and Ubuntu. Scality Ring supports CentOS, Ubuntu, and Debian.

    In the case of Cloudian HyperStore, each disk drive in a cluster node has an ext4 file system. When I asked the VP of Global Engineering at Cloudian why they used a file system, he replied that it is one less thing to have to maintain, which implies that if you don’t use a file system on your storage nodes you have to take more responsibility for the development and testing of your full application stack than you would otherwise.

    Since the DDN disk utilization and IOPs performance gains are not compared or verified in this article, it looks more like a matter of choice from a development perspective whether to use a file system or not. Of course, DDN does sell into the HPC market and maybe squeezing everything you can out of disk space and IOPs makes it worthwhile. For the typical object storage customer, it may not make much of a difference.

  2. wcurtispreston says:

    AFAIK, NoFS is their own branded product. It came from DDN. Yes, Caringo uses a similar approach.

    I would modify your statement in the last paragraph to say that their claim remains unproven. Sure, it’s a choice. But if it’s a choice that results in a 15 to 20% overhead reduction and a 4-10X reduction in IOPs, that’s kind of a big deal, don’t you think? I wouldn’t call the latter “squeezing everything...” Since performance is all about IOPs these days, reducing IOPs by 4X means 4X the performance with the same disks/SSDs.
