The recent SwiftStack announcement on Universal Access explains how users can put and get objects using the S3 API, the Swift API, NFS, and SMB (put one way, get another way). I have read many articles from respected outlets that interpret file access from object storage vendors (native, as in SwiftStack, or through a gateway, as with others) as an attempt to appeal to NAS users and win business from the likes of NetApp and EMC Isilon.

I’ll hold my “nothing can be further from the truth” line for later. But let’s do a quick review:

Managed file systems (NAS) are really good at low-latency random-write workloads. “Random write” is a fancy name for updating a file: you don’t append at the end of the file, but change something that is already there. Random writes incur more latency because of the time needed to spin the disk platters and move the write heads. Sophisticated file layouts such as NetApp’s WAFL go the extra mile by sequentializing random writes for an extra oomph. Look up WAFL and ‘log-structured file systems’ on Wikipedia.
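To make the difference concrete, here is a minimal Python sketch of an append versus an in-place update on an ordinary local file. The helper names (`append`, `update_in_place`, `demo`) are made up for illustration; only the standard library is used.

```python
import os
import tempfile


def append(path: str, data: bytes) -> None:
    """Sequential write: new bytes always land at the end of the file."""
    with open(path, "ab") as f:
        f.write(data)


def update_in_place(path: str, offset: int, data: bytes) -> None:
    """Random write: overwrite bytes that already exist at `offset`."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)


def demo() -> bytes:
    # Create a scratch file, append to it, then change part of it in place.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        append(path, b"hello world")
        update_in_place(path, 6, b"WORLD")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


print(demo())  # b'hello WORLD'
```

The second call is the kind of operation a NAS is built to serve quickly, and the kind an object API does not offer at all.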

Another important trait of file systems (and NAS in particular) is locking. Any shared system needs to account for the possibility of two users trying to update the same file. Locking is a complex feat, especially in distributed systems, as no locking (or badly implemented locking) can lead to data corruption.

A related but slightly different aspect is consistency. File systems use a strong-consistency model, in which the storage will not acknowledge a write until it has been written to all needed areas. If a node in a distributed file system is halfway across the world, the writer will need to wait until the write travels there and is acknowledged by the remote node.

NAS, being an alternative to a local file system (on the attached disk), needs to provide applications using NFS/SMB with the response time they are used to from working with a local file system. That’s why distributed NAS is exponentially harder, and I salute all who managed to make it viable. But because of that, most distributed NAS solutions consist of machines standing side by side, no more than a hop away on a 10GbE network.

Object storage is not good for that. Object storage, from the start, did not aim at those workloads. Object storage, for example, threw locking out by making objects immutable. You can write (PUT) them, you can read (GET) and delete them, but you cannot update them. If you must, you can read the entire object, make your changes, and write it again. Also, if two users do that at the same time, well, the last version that was PUT will be the latest.
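The last-PUT-wins behavior can be sketched with a toy in-memory store. This is not the SwiftStack, Swift, or S3 API; the `ObjectStore` class and its method names are hypothetical and only mimic whole-object PUT/GET/DELETE semantics.

```python
from typing import Dict


class ObjectStore:
    """Toy in-memory store: whole-object PUT/GET/DELETE, no partial update."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        # A PUT always replaces the entire object under this key.
        self._objects[key] = bytes(data)

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def delete(self, key: str) -> None:
        del self._objects[key]


store = ObjectStore()
store.put("notes.txt", b"draft")

# Two users read the same version, with no lock taken by either.
alice_copy = store.get("notes.txt")
bob_copy = store.get("notes.txt")

# Each changes a private copy and writes the whole object back.
store.put("notes.txt", alice_copy + b" +alice")
store.put("notes.txt", bob_copy + b" +bob")

# The last PUT is the latest version; Alice's change is gone.
print(store.get("notes.txt"))  # b'draft +bob'
```

Nothing gets corrupted here, because every version is a complete object. One update is simply replaced by the other, which is fine for write-once data and a bad fit for files that are edited in place.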

Applications using API calls are very good at mitigating high latency, as HTTP was designed from the get-go for these types of systems (hello, internet).

Most object storage systems use an eventual-consistency model, in which the storage will acknowledge a write after it has been written to enough places to be considered secure. With SwiftStack, for example, the storage will acknowledge after the data has been written to a majority. If you have a three-copies policy, SwiftStack will acknowledge after it has committed two copies, with the third completing asynchronously afterward.
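A little arithmetic shows why this matters across distance. The sketch below is a simplified model, not how SwiftStack is implemented: it assumes all copies are written in parallel, and the latency numbers are invented for illustration (two nearby nodes and one far away). Only the majority rule comes from the description above.

```python
from typing import List


def quorum(replicas: int) -> int:
    """Smallest majority of `replicas` copies (2 of 3, 3 of 5, ...)."""
    return replicas // 2 + 1


def ack_latency_ms(replica_latencies_ms: List[float], needed: int) -> float:
    """Time until `needed` copies are committed, assuming parallel writes."""
    return sorted(replica_latencies_ms)[needed - 1]


# Hypothetical commit times: two local nodes and one across the world.
latencies = [2.0, 3.0, 150.0]

# Wait for every copy: the writer waits for the slowest node.
wait_for_all = ack_latency_ms(latencies, len(latencies))

# Wait for a majority: the remote copy completes asynchronously.
wait_for_majority = ack_latency_ms(latencies, quorum(len(latencies)))

print(quorum(3), wait_for_all, wait_for_majority)  # 2 150.0 3.0
```

Under these assumed numbers, the writer that waits for all copies pays the full trip to the remote node, while the majority writer is acknowledged as soon as the two local copies are committed.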

What makes object storage a good backend for a managed file system? Absolutely nothing. The appeal is obvious, though. We have a distributed storage system that scales infinitely and can span globally. If we just add NFS to the end-point, we’ll have a NetApp-killer, right? Wrong! Object storage is possible BECAUSE it threw away the basic assumptions of NFS (see above).

But why else, y’all are now thinking, would you add file access to object storage? Thank you for asking.

Object storage is very good at high-throughput applications pushing immutable units of data, for later or for someone else to read. Think about a photo sharing site like Snapfish or Flickr. When you upload your photos, the web server takes them and passes them on to the storage. Then, Snapfish might decide they want to render a few more sizes of the image, so they will not need to do it on the fly. So, some other process reads the images, transcodes them (or whatever you call rendering an image in different sizes), and puts the results back on the storage. They might also want to run some face-recognition algorithm on it, or maybe GPS analytics, and so forth. None of those will change the data in place. They read it, process it, and write new pieces of data in. Those new pieces of data can later be carried by the system to those nodes across the world for better geo-latency (for the users).
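The read, process, write-new pattern in that workflow can be sketched in a few lines. Everything here is illustrative: a plain dictionary stands in for the storage, `render` is a hypothetical placeholder that only tags the bytes instead of resizing a real image, and the key naming scheme is made up.

```python
from typing import Dict, Iterable


def render(image: bytes, width: int) -> bytes:
    """Hypothetical stand-in for real image resizing; it only tags the bytes."""
    return b"w%d:" % width + image


def make_renditions(store: Dict[str, bytes], key: str, widths: Iterable[int]) -> None:
    original = store[key]  # GET the uploaded object
    for width in widths:
        # PUT each derived size under a new key; the original is never touched.
        store["%s.w%d" % (key, width)] = render(original, width)


store = {"photos/cat.jpg": b"<jpeg bytes>"}
make_renditions(store, "photos/cat.jpg", [128, 1024])

print(len(store))               # 3
print(store["photos/cat.jpg"])  # b'<jpeg bytes>'
```

Each step only adds objects, so there is nothing to lock and nothing to update in place.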

But notice that in the description above there is no mention of S3 or any API. A lot of applications use NFS or SMB because they were created for those protocols, but actually push data in an object fashion: many, many files that will never change in place and will serve as the raw material for the next step in the chain.

And that is the use case. Applications that use NFS have suffered from using the wrong type of storage for their needs, or the NAS storage suffered from misuse, until now. Our file access is meant to make those applications first-class citizens in your workflow. No bridges, no gateways, no translators. Once an object is put in, either through file access or the object API, it will show up in storage just like its brother that entered by a different protocol. No separate format for objects and files, as that would defeat the purpose (see above).

With that said, all you analysts saying object with file is a grab for the NAS market – you can keep doing so, and we will back you on that (and here I mean SwiftStack). Other object vendors might feel they can give a viable NAS based on object storage, and I wish them good luck with that.

To our customers and interested people: this is not what we do. In fact, we ask you not to replace your NAS with SwiftStack until you have discussed it with us (remember those applications that use NFS but write objects?).

So, saying that object storage is trying to take over the NAS space with file access cannot be further from the truth. At least not here at SwiftStack. We want you to use your NAS for what it excels at, and we invite you to try us for what we excel at.

About Author

Ehud Kaldor

Ehud Kaldor started his path into storage in 2004 by joining a startup named Kashya, which was acquired by EMC in 2006 and became EMC RecoverPoint. At EMC, Ehud served as a CSE (corporate systems engineer) for RecoverPoint, and later led the CSE team for RecoverPoint. After 8 years at EMC, Ehud joined NetApp as an SE, and was part of the teams for some of NetApp's largest accounts, working with cutting-edge customers. Ehud is a technology buff, and learns as much as possible about anything and everything. He dips his toes into container technology, programming, and system administration, and likes to break stuff and put it back together. He is also an introverted extravert (or an extraverted introvert), which accounts for his ability to present to audiences along with his personally awkward social behavior. Ehud is currently an SE at SwiftStack.