Demystifying: Open vs. Proprietary Storage Software

BLOG: By WARP Mechanics’ CTO.

This is a bit long-winded. I’m trying to cover a wide range of industry FUD and misconceptions. If you’re looking for a specific FUD response, feel free to word-search down. 🙂

That said, if you’re a customer or storage architect looking at go-forward storage strategies, the full article is probably worth a read, whether you’re shopping for a WARP solution or not.

Introduction

It may be hard to believe in the era of commodity hardware, open software systems, “XaaS”, and “software defined ___”, but many legacy storage vendors are still encumbered with proprietary operating systems and hardware platforms.

Designed in the 1990s or even earlier, these systems have not been able to adapt to changing market requirements or implement new features.

In particular, legacy storage systems relying on proprietary hardware are notoriously slow to adapt: it may be literally impossible to apply feature updates or even basic bug fixes when these require re-designing custom silicon. Giving your storage controller a “chip-ectomy” isn’t generally an option, so customers are left with a “rock or hard place” choice between forklift upgrades and living with known-defective gear.

Because this puts the legacy vendors at a competitive disadvantage, some of them have resorted to campaigns of “Fear, Uncertainty, and Doubt” (FUD) about open storage software. In short, when they can’t solve a problem with the right technology, they opt to solve it with massive marketing campaigns instead.

Oddly, many of them are simultaneously spending literally billions of dollars acquiring or developing their own “Software as a Service” or “Software Defined Storage” products. For example, EMC spent billions acquiring Isilon, ScaleIO, and XtremIO, all of which are mostly freeware on commodity hardware, and all of which are now being heavily “pushed” by the same sales teams that previously said that only proprietary hardware and software could be trusted with your data.

This isn’t limited to 1990s era companies like NetApp and EMC. Even a few storage startups like Pure made the same mistake. Pure is an odd company in other respects: e.g., they’re hemorrhaging almost $200 million a year in losses, and have a tiny product portfolio, but want to go ahead with an IPO anyway. Again: If you can’t fix your product, the attitude appears to be, just spend money on marketing and it will all turn out OK.

If it were just about radically overpriced IPOs, I’d say “buyer beware” and leave it at that. The problem is, with storage systems, “good enough” really isn’t good enough.

When companies like Pure inevitably spend themselves into bankruptcy, if they are running entirely proprietary systems, how will their customers get data off of the drives? When legacy storage OEMs encounter defects in aging proprietary hardware, how will customers get bug fixes? If your company’s “crown jewels” data is sitting on a proprietary system, be it from an OEM or a startup, you are “joined at the hip” to that company and that specific hardware, and have no recourse if the technology or the company goes wrong.

The above analysis isn’t specific to WARP. The trends away from closed systems and towards software defined storage, open systems, and X as a service, are broad and deep. At this point, any company ignoring this fact is in deep denial.

The good news of open storage approaches goes beyond lower cost. You actually get a better, safer solution.

Consider Linux vs. UNIX. Back in the day, the OEMs said you could never run a business on Linux, because only the big OEM proprietary operating systems could be fast enough and reliable enough. But today, the old school UNIX systems have a vanishingly small market share, because the Linux development community is bigger and better funded than any given OEM. Yes, IBM and HP have a lot of money and engineers, but neither AIX nor HP-UX has more funding or development than the entire rest of the world combined, which develops Linux.

The same thing is happening in storage. Yes, EMC and NetApp spend a lot of money on R&D. (Though these days they mostly acquire companies when they want innovation.) But if you add both of their budgets together, it comes to a tiny fraction of the R&D going into open storage solutions.

From this, it’s clear why open storage solutions are dramatically out-pacing Nimble, Nimbus, Scality, Ceph, Pure, EMC, NetApp, etc., in terms of features. But open storage solutions are also safer.

Consider this. If you buy a WARP Mechanics ZFS-based NAS system, then decide you don’t like WARP support or software, you can literally install a competitor’s ZFS-based solution on our controllers and directly import the storage pools. You don’t need to throw away any hardware, or migrate any data, because the on-disk protection format is identical.
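As a rough sketch of what “directly import the storage pools” looks like in practice, the snippet below prints the standard `zpool export` / `zpool import` sequence. It only builds the command lines and does not run them. The pool name `tank` and the helper names are illustrative assumptions, not WARP tooling, and the sketch assumes the replacement OS ships a ZFS version that supports the pool’s feature flags.

```python
import shlex
from typing import List


def pool_move_commands(pool: str) -> List[List[str]]:
    """Return argv lists for moving a ZFS pool from one OS install to another.

    Hypothetical helper for illustration; it builds commands but never runs them.
    """
    return [
        ["zpool", "export", pool],  # on the old OS: cleanly release the pool
        ["zpool", "import"],        # on the new OS: list pools available for import
        ["zpool", "import", pool],  # on the new OS: import the pool by name
    ]


def render(commands: List[List[str]]) -> List[str]:
    """Turn argv lists into shell-safe command strings for review."""
    return [shlex.join(argv) for argv in commands]


if __name__ == "__main__":
    # "tank" is an assumed pool name.
    for line in render(pool_move_commands("tank")):
        print(line)
```

The data stays where it is throughout: the export and import change which OS owns the pool, not what is written on the disks.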

In short, there are technical and financial advantages to going with open storage solutions. Yet NetApp, Pure, Nimble, Nimbus, EMC, and others continue to churn out FUD based on 1990s technology approaches.

The bulk of this post (below) is intended to clear up some of that FUD specific to WARP Mechanics’ operating system, which relies on a combination of open software, and WARP-developed enhancements.

 

FUD: Proprietary software is needed for high scale

WARP does support proprietary parallel storage solutions. For example, you can run IBM’s GPFS software, or Quantum StorNext, natively on WARP hardware.

However, one size does not fit all. And in particular, where size is concerned, nothing beats open source solutions.

Other software, such as Scality, GPFS, StorNext, and the open source Ceph, has its place, but at the highest scale, WARP uses Lustre over ZFS. In our installed base, and in the broader HPC community, the largest filesystems all use this combination.

 

FUD: Only OEMs can be trusted with data because of their proprietary intellectual property

WARP has intellectual property. We just don’t feel it’s necessary to reinvent the wheel.

In fact, all storage offerings on the market today are combinations of in-house, open-source, and contracted-out intellectual property. Do OEMs like EMC actually make hard drives? No, they buy them from companies like Hitachi, Western Digital, and others. So do we.

The reality check is that almost no OEM actually builds any of their own storage hardware — that’s always contracted out to an outsource manufacturer these days. WARP uses the same enclosure manufacturers, disk drives, and SSDs, that are used by the established OEMs. So there is literally no difference in hardware quality.

Similarly, on the software side, most storage vendors are using Linux as a base OS these days. Then they lock their customers out of the OS to prevent customers from seeing that they have done this.

WARP creates our own Linux distribution, just like other storage vendors do. And we have management tools that we wrote in house, just as others do. But we based our OS on CentOS… and we let customers know this, and maintain compatibility with the parent distribution.

The advantage? Aside from “honesty is the best policy”? 🙂

We can integrate pretty much any RHEL/CentOS/Scientific Linux compatible software directly onto our controllers, as we did with the ORNL/CERN system recently.

Long story short: WARP creates and maintains intellectual property, but only where it adds real value. You can trust your data to the ZFS layer, since ZFS has been storing mission-critical data for a decade now. Hence, there was no benefit to WARP to change ZFS itself: doing so would only have created risk for users.

 

FUD: “You need active/active controllers” (or) “you need active/passive controllers”

When a company has only one solution “on the truck”, they try to sell that as being the only solution you could ever want.

WARP supports a wide array of options. Our products are like Lego bricks in a way: we can put them together to create a solution optimized for the particular customer problem, so we don’t need to put a stake in the ground and say “always do it like this.”

The reality is, sometimes active/standby controllers work better. Sometimes active/active controllers work better. We can do both. In fact, sometimes you want active/nothing controllers: not having HA is actually appropriate for archive systems, or for controllers used in an erasure code system. One size does not fit all in HA, and you should inherently mistrust any vendor saying otherwise.

 

FUD: You need XYZ proprietary management or analytics tool

As with other “open vs. proprietary” trade-offs, you get more choice and power on the open side.

Look at it like this.

WARP’s OS is derived from RHEL/CentOS, and we give customers the “keys” to the underlying OS. This means you can manage a WARP with any tool that works with RHEL/CentOS.

Yes, we include a bunch of tools in the OS for you, which we developed and support. So we have management, monitoring, and analysis capabilities built in.

But if you don’t like our tools, or if you already have a large orchestrated Linux environment, then we can help you integrate our system with other tools.

This simply gives you the widest range of options of any storage vendor. You can take our turnkey solution, or you can adapt our system to use other tools: It’s up to you.
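To make the “any tool that works with RHEL/CentOS” point concrete, here is a minimal sketch of the kind of generic check an existing Linux monitoring setup might run on a controller. It uses only the Python standard library. The mount point `/` and the 80% threshold are assumptions for illustration, and this is not one of WARP’s bundled tools.

```python
import shutil


def usage_percent(path: str) -> float:
    """Return the percentage of space used on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report zero size; treat them as empty.
        return 0.0
    return 100.0 * usage.used / usage.total


def over_threshold(path: str, threshold: float = 80.0) -> bool:
    """True when usage on `path` is at or above `threshold` percent."""
    return usage_percent(path) >= threshold


if __name__ == "__main__":
    mount_point = "/"  # assumed mount point; substitute the path of interest
    print(f"{mount_point}: {usage_percent(mount_point):.1f}% used")
    if over_threshold(mount_point):
        print("warning: capacity threshold reached")
```

Nothing in it is storage-vendor specific, which is the point: a script like this runs unchanged on any standard Linux host.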

 

FUD: You need feature X (compression, snapshots, etc.) so you need proprietary system Y

This is an easy one: We have that feature. 🙂

I don’t really care which feature you’re talking about. If it’s useful, the open source community has it already.

In fact, the proprietary systems all fall behind in this respect.

For example, the Nimble answer on deduplication is pretty much “sour grapes”: They haven’t got it, and can’t build it, therefore they say you don’t need it. QED.

Our approach is to enable the industry’s widest feature set, then have our experts work with you to figure out which combination is best for your environment.

 

FUD: You need scale out storage

As with active/active vs. active/standby controllers, the reality is… sometimes you want scale out, sometimes you want scale up, and sometimes you want both. WARP offers all three options. We just don’t push one specific approach on customers until we figure out what they actually need.

But let’s say you really need scale out.

WARP’s default scale-out system is based on Lustre. That same software has been used by the largest supercomputing systems in the USA to deliver tens of petabytes per namespace. In fact, Lustre is the only storage solution proven to scale to that size while maintaining performance.

In contrast, Nimble can scale to (ahem) four controllers, and Isilon can (theoretically) scale to a bit over two petabytes… although performance drops like a rock far before that level.

OK, you don’t want Lustre for some reason? We support GPFS, GFS, Swift, Ceph, StorNext, and more. Any of these can scale to petabytes if implemented by a knowledgeable team such as WARP Professional Services.

 

FUD: You need XYZ proprietary object storage

This one is puzzling for two reasons.

First, very few storage applications need, or even benefit from, any kind of object storage solution at all. The vast majority of storage solutions need to be file or block, and using object storage simply because it’s trendy is a severe case of “square peg, round hole”.

Second, proprietary object systems have the worst of all worlds. Products like Scality have (ironically) proven to be totally un-scalable, they have no installed base with which to have debugged their products, and they haven’t even worked out their baseline software architecture. The companies putting out this software don’t have unlimited capital, and are “one trick ponies” — if their solution doesn’t sell on a massive scale, they will fold, or be acquired and relegated to obscurity as happened with Ceph.

Buying one from an established OEM is no safer: OEMs cancel failing business units all the time, and they don’t have any better answer to scale or reliability than the startups.

So if you implement a proprietary object storage system, you’re actually taking massive risk, and likely are doing so for no technical gain.

WARP’s approach has been to use ZFS to create a horizontal scale-out erasure code layer. The “WARP-Z” product doesn’t have any of the scale or reliability limits of object platforms, and is entirely based on well-debugged open software technology. It’s fast, safe, and presents as POSIX-compliant block and file storage instead of difficult-to-use objects.

Basically, the only reason to implement proprietary object storage solutions is if you own stock in the company selling them.

 

FUD: You can only trust your data to publicly traded companies

This is a bizarre one, but it keeps cropping up.

The notion is that, if a company lists its stock on NASDAQ or NYSE, this somehow impacts the quality of the technology being offered.

In reality, most businesses worldwide are not publicly traded, and seem to work just fine.

Do you get poisoned food from the local corner grocery store because they aren’t on the NYSE? Does your local dentist botch your fillings because his practice isn’t on NASDAQ?

Frankly, there is no connection between having public stock and having a reliable product. The local grocer buys food from the same farms that supply the major chains… or even better ones, if they offer premium local produce. If anything, you get a better result from a company that relies on reputation and quality for its business, rather than relying on a massive marketing campaign whenever they mess up.

WARP puts its systems together using the same HDDs, SSDs, CPUs, etc., that OEMs use. Our software stack only contains elements vetted by the much larger open software community, or which we developed in house to simply improve the manageability and performance of said open software.

So, like the local grocer, if you buy from a privately-held specialist company, you actually get a better result than if you go with a factory-farm approach to storage.