
According to analysis from Forrester, 68% of business buyers prefer to research on their own, online. But what if you’re trying to determine what an enterprise product will cost? Even if you ask a technology provider about the cost of their offering, you’ll likely hear silence followed by “well…it really depends”. Technology pricing is mostly mysterious and you won’t get a complete answer until you spend lots of time with and offer lots of details to a given vendor. And, if your budget for a project is $100K and the quote is for $500K, then a lot of time has been wasted on all sides.

The goal of this article is to provide some transparency about the pricing for hybrid cloud storage (also known as object storage). Even more specifically, we will analyze what it costs to build a software-defined, scale-out storage system to your specifications.

Hybrid Cloud (Object) Storage Building Block Costs

The building blocks of a software-defined, hybrid cloud (object) storage system are fairly simple:

  1. Software that drives the system.
  2. Servers that the software runs on to store and manage the data.
  3. Networking components needed for a scale-out storage system.
  4. Public cloud capacity (optional) if you desire a hybrid design, where data can be placed across your private data centers AND a public cloud.

Object Storage Cost Building Blocks

Once you understand the cost of these four primary components, you can roughly estimate the cost of a complete software-driven storage system. In each of these sections, we explain the common options and will use the SwiftStack platform as an example.

Software

Storage software is often priced based on how much physical capacity (raw or usable) will be deployed, and the license is either perpetual or subscription-based. Most modern infrastructure software is now sold through a subscription model, while a high percentage of legacy solutions are only available via perpetual licenses. The software subscription provides the right-to-use license, all maintenance updates, and 24×7 technical support. Two factors determine the price of the subscription:

  1. Capacity = the amount of data that will be managed by the storage system. It is important to understand if the solution you are evaluating charges based on unique data stored or all data stored. For example, with SwiftStack, there’s no extra charge for data protection and/or number of sites. So, if there is 500TB of unique data and that data is being replicated across 3 sites, the subscription will be calculated based on the 500TB figure.

  2. Term = the length of the subscription, with 12 months as the minimum for most products. If you are able to purchase a multi-year subscription up front, there are often price discounts available depending on the length of the contract. Some organizations prefer a “grow as you go” approach, so they’ll start with a 12-month term and renew the subscription based on how much data they need to manage for the subsequent 12 months.
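The difference between being charged on unique data and on all data stored can be illustrated with a small sketch. The function name and the `charged_on_unique_data` flag are hypothetical and used only for illustration; the 500TB and 3-site figures come from the example above, and the "all data" case assumes one whole copy per site.

```python
def billable_capacity_tb(unique_tb: float, sites: int, charged_on_unique_data: bool) -> float:
    """Return the capacity (in TB) a subscription would be priced on.

    Hypothetical helper: if the vendor charges on unique data, replication
    across sites does not increase the billable figure. Otherwise this sketch
    assumes one full copy per site is counted.
    """
    if sites < 1:
        raise ValueError("sites must be at least 1")
    if charged_on_unique_data:
        return unique_tb
    return unique_tb * sites


# Example from the text: 500TB of unique data replicated across 3 sites.
print(billable_capacity_tb(500, 3, charged_on_unique_data=True))
print(billable_capacity_tb(500, 3, charged_on_unique_data=False))
```

Under these assumptions, the same deployment is priced on a third of the capacity when the vendor charges on unique data, which is why this question is worth asking early.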

For example, with SwiftStack, the base subscription allows you to manage up to 150TB of data on-premises and 10TB of data in a public cloud. So if you initially need to manage less than 150TB, the base license is sufficient. Most users of object storage products are managing more than 150TB, and as you license more usable capacity the price per TB begins to drop.

At petabyte scale, the annual software cost is commonly about one-third of the cost of the server hardware it runs on. So if you need to manage around a petabyte of data, determine your rough server hardware cost, then multiply it by 0.33 to get a rough annual software cost. This does vary based on how much hardware is required (which is discussed in the next section), but the goal of this article is to get you in the right ballpark for cost.
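As a quick illustration, the rule of thumb above can be written as a one-line calculation. This is only a rough sketch: the function name is hypothetical, and the $300,000 hardware figure is a made-up placeholder, not a quoted price.

```python
def estimate_annual_software_cost(server_hardware_cost: float, ratio: float = 0.33) -> float:
    """Rough annual software cost at petabyte scale.

    Applies the article's rule of thumb: software per year is roughly
    one-third (0.33) of the server hardware cost.
    """
    if server_hardware_cost < 0:
        raise ValueError("server_hardware_cost must not be negative")
    return server_hardware_cost * ratio


# Placeholder hardware budget, for illustration only.
hardware_cost = 300_000
software_cost = estimate_annual_software_cost(hardware_cost)
print(f"Rough annual software cost: ${software_cost:,.0f}")
```

Treat the result as a ballpark only; the actual ratio shifts with how much hardware the design requires.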

Hardware

The primary hardware component needed to build a software-driven, scale-out storage system is a standard Intel-based server containing high-capacity hard disk drives. Determining how many servers you need to purchase is based on these three design considerations:

  1. Raw data capacity: You’ll need a sufficient number of disk drives to store your data. The good news is that almost all use cases for object storage leverage high-capacity drives, so you’ll benefit from the compelling economics of the disk drive market. For example, if a Dell PowerEdge R740xd server contains twelve 8TB disk drives, one node in the scale-out storage cluster will provide about 96TB of raw capacity.

  2. Data protection scheme: To durably store the data in the cluster, a protection policy needs to be applied, which increases the amount of raw capacity consumed. There are two ways to protect data in an object storage cluster:

    1. The first and most widely used is called erasure coding, which breaks data into fragments and places those fragments across available resources in the cluster. In the event of a hardware failure, data can be recreated from these fragments. A typical erasure coding configuration requires a ratio of 1.5x raw-to-usable capacity—providing highly efficient data protection. For SwiftStack, it is important to note that a cluster needs to be built with at least 6 server nodes to be able to use erasure coding in production.

    2. If only 3, 4, or 5 servers are required to get going, then replicas can be used to protect the data. This is where whole copies of the data are replicated and stored across server nodes in the cluster. Replicas require more raw disk space, with 3x raw-to-usable being the most common factor.

  3. Number of sites: One of object storage’s main claims to fame is its ability to stitch together a geographically-dispersed pool of data under a common namespace. This provides fast access to data for applications and users across the enterprise, along with built-in disaster recovery (DR) even when there is a site outage. No matter how many sites the storage cluster is distributed across, the user-defined policies determine where the data is placed. It is common that a whole copy of the data lives in each site, so if the policy states to place the data in two sites, 2x the amount of hardware is needed.

The calculation for how much hardware is required looks like this:

How to Calculate Object Storage Hardware Costs

Note: This is a representative 1PB example. Servers with additional drive bays or the use of larger capacity drives can reduce the required number of servers.
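The three design considerations above can be combined into a short sizing sketch. This is an approximation under stated assumptions, not the exact figures from the original illustration: 1PB is treated as 1,000TB, a whole copy of the data is assumed to live in each site, and the function and parameter names are hypothetical. The 1.5x and 3x protection factors and the twelve 8TB drives per server come from the text.

```python
import math


def servers_required(
    usable_tb: float,
    protection_factor: float,
    sites: int,
    drives_per_server: int,
    drive_tb: float,
    min_servers: int = 1,
) -> int:
    """Estimate how many storage servers are needed.

    raw capacity  = usable capacity x protection factor x number of sites
    server count  = raw capacity / raw capacity per server, rounded up
    min_servers lets you enforce a floor, such as the 6-node minimum the
    article mentions for erasure coding with SwiftStack.
    """
    raw_tb_needed = usable_tb * protection_factor * sites
    raw_tb_per_server = drives_per_server * drive_tb
    return max(math.ceil(raw_tb_needed / raw_tb_per_server), min_servers)


# 1PB usable, erasure coding (1.5x), one site, twelve 8TB drives per server.
print(servers_required(1000, 1.5, 1, 12, 8, min_servers=6))

# Same data placed in two sites, one whole copy per site.
print(servers_required(1000, 1.5, 2, 12, 8, min_servers=6))

# Smaller 100TB cluster protected with 3x replicas in one site.
print(servers_required(100, 3.0, 1, 12, 8, min_servers=3))
```

Multiplying the resulting server count by a per-server price from a vendor website gives the rough hardware cost that the software estimate builds on.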

Server pricing is easy to calculate on vendor websites and, at volume (like petabyte scale), discounts off list price are common.

Networking

For any scale-out distributed system, networking is a key component of the solution since data fluidly moves between nodes in the cluster. There are three components of the network that can affect the cost of object storage:

  1. Client-facing network: This is the network that applications and users use to write data to and read data from the storage system. Often, object storage uses the existing datacenter network and, if free ports are available, no significant additional cost is realized.

  2. Cluster-facing network: This is the network that the storage system uses to move data between nodes in the cluster and should be a dedicated network. Often this is achieved using a layer 2 network built of top-of-rack 10GbE switches.

  3. Load balancing: One of the key advantages of a scale-out system is that there is not a single point of access, so as the number of nodes in the cluster grows, performance also increases. To achieve this, API requests are load balanced across available nodes. SwiftStack has a built-in software load balancer to achieve this concurrency, but most other object storage solutions require an external load balancer, which often increases cost.

If distributing the cluster across multiple geographically-dispersed sites, the network between the sites is important, but often that is already in place to serve existing applications.

Basic Network Design for Scale-Out Storage

Public Cloud Capacity

Object storage with a hybrid cloud design is becoming more and more common as modern software allows workloads to be optimized across resources in the data center and in public clouds. This category of products is often defined as Hybrid Cloud Storage, because “hybrid” means both private and public infrastructure are used. At a minimum, it involves 1 private data center and 1 public cloud service like Amazon Web Services or Google Cloud Platform. To give you an idea of hybrid cloud traction, about 34% of SwiftStack customers are currently using the 1space capabilities of the platform to move data to and from public cloud storage for uses such as cloud bursting and offsite data protection.

Hybrid Cloud Storage: Managing Data Across Data Centers and Public Clouds

If you are going to place data across a private object storage cluster and a public cloud service, you will need to pay for that public cloud capacity. You will also need to license the software, like SwiftStack 1space, to manage the data across these public and private environments. For example, if a SwiftStack license allows you to place 100TB in a public cloud, you will need 100TB of public cloud capacity. Amazon S3 is the most popular public cloud service, and pricing can be determined using their simple monthly calculator. For hybrid cloud storage use cases, there are two primary costs that you should be aware of:

  1. Public Cloud Storage: This is simply the amount of data that will be stored in the public cloud each month. As of the writing of this article, storing 100TB in Amazon S3 costs about $2,500 per month.

  2. Data returned: This is often referred to as egress fees, as you are charged for reading your data out of the public cloud. How much data you access from the public cloud is very challenging to calculate, as it depends not just on usage, but application behavior. For example, if your application reads the same data 20 times in a month, you are charged 20 times to access that data.
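The two costs above can be combined into a rough monthly estimate. The storage rate is derived from the article's figure of about $2,500 per month for 100TB; the egress rate of $50 per TB is a made-up placeholder, since the article does not quote one, and should be replaced with your provider's current pricing. Function and parameter names are hypothetical.

```python
# About $2,500 per month for 100TB, per the article's example.
STORAGE_USD_PER_TB_MONTH = 2500 / 100


def monthly_public_cloud_cost(
    stored_tb: float,
    read_tb: float,
    reads_per_month: int,
    egress_usd_per_tb: float,
    storage_usd_per_tb: float = STORAGE_USD_PER_TB_MONTH,
) -> tuple[float, float, float]:
    """Return (storage cost, egress cost, total) for one month.

    Egress is charged every time data is read out, so a data set that is
    read N times in a month is billed N times.
    """
    storage_cost = stored_tb * storage_usd_per_tb
    egress_cost = read_tb * reads_per_month * egress_usd_per_tb
    return storage_cost, egress_cost, storage_cost + egress_cost


# 100TB stored; a 5TB working set read 20 times in the month.
# The $50/TB egress rate is a placeholder, not a quoted price.
storage, egress, total = monthly_public_cloud_cost(
    stored_tb=100, read_tb=5, reads_per_month=20, egress_usd_per_tb=50
)
print(f"storage=${storage:,.0f} egress=${egress:,.0f} total=${total:,.0f}")
```

Even with placeholder rates, the sketch shows how repeated reads of a small working set can rival or exceed the cost of storing the full data set.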

How Storage Appliances or Software-Defined Storage Impact Cost

The section above walked you through the building blocks of a software-driven storage solution, but there is another option, which is a fully-integrated storage appliance. This is where hardware and software are integrated from the vendor, where you buy hardware plus support/maintenance instead of software and hardware separately. While this traditional approach to procuring data center infrastructure may be simpler and more familiar to you, there are two things that you should keep in mind when deciding on a scale-out storage solution:

  1. Is it tuned for your environment? Vendors who sell their products in appliance form typically have 1 to 3 options to choose from. This means that you have to select the model that meets or exceeds the requirements of your application. Often this results in oversized hardware, both for the initial deployment and when you have to expand, which can add unexpected cost to the solution. With a software-driven solution, you are not locked in to any specific hardware design. This means that the optimal hardware can be chosen for the initial deployment and different hardware can be used when you need to further scale out.

  2. Are you paying for more than you need? When software and hardware are blended into a single price, it gives vendors a way to sell hardware at a premium. This can result in you paying significantly more for hardware than if you purchased software and hardware separately. For example, in an appliance, disk drives could cost 1.5x more than if you purchased them in a Dell or Supermicro server. Doing some basic calculations and getting a few software-only quotes can identify significant hardware costs that can be cut out.
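The kind of basic calculation mentioned above might look like the following sketch. The 1.5x multiplier comes from the example in the text; the drive count and the $200 per-drive price are made-up placeholders, and the function name is hypothetical.

```python
def appliance_drive_premium(
    drive_count: int,
    drive_price_in_server: float,
    appliance_multiplier: float = 1.5,
) -> float:
    """Extra amount paid for drives when bought inside an appliance.

    Compares the cost of the same number of drives bought in a standard
    server with the cost at the appliance's marked-up price.
    """
    cost_separate = drive_count * drive_price_in_server
    cost_in_appliance = cost_separate * appliance_multiplier
    return cost_in_appliance - cost_separate


# Placeholder inputs: 16 servers with 12 drives each at $200 per drive.
premium = appliance_drive_premium(drive_count=16 * 12, drive_price_in_server=200)
print(f"Drive premium in appliance form: ${premium:,.0f}")
```

Plugging in real per-drive prices from a server vendor and an appliance quote gives a first estimate of how much of the appliance price is hardware markup.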

Hidden Costs Associated with Open Source?

Another option in the modern, software-defined world is using an open source solution. This is where you download free software and run it on standard hardware. In this approach, you are responsible for final productization of the solution, as well as full support and maintenance over the life of the environment. At SwiftStack, we are huge proponents of open source and continuously contribute upstream to community projects, but when it comes to putting enterprise storage into production, we believe that a commercial solution is best for most businesses and organizations. This is mostly because operating budget in an enterprise is often best spent higher up the stack and not down in the infrastructure layer developing, productizing, deploying, and managing storage functionality.

SwiftStack and Red Hat Ceph are good examples of commercial vs open source options, and both products are heavily based on open source functionality. The core object storage runtime in SwiftStack is OpenStack Swift, and Red Hat Ceph is based on the open Ceph project. With the commercial offerings, you get a fully productized version of the technology, plus additional commercial functionality, plus support and maintenance. Often, the overall cost of a commercial solution is significantly less than an open solution deployed in production, even for the largest of organizations.

There are other things to think of such as estimating operating cost or leveraging professional services, and we will cover those in future articles.

Conclusion

We hope that the next time you inquire about storage software pricing, you don’t get the run-around from a vendor. Just tell them you’ve seen a different, more open approach, and answering the question shouldn’t be that hard. If you have any questions or would like to continue the conversation, please feel free to reach out or request a custom quote for SwiftStack software.

Learn about a customized storage solution for your environment from a SwiftStack solution expert.


About Author

Erik Pounds

Erik is an avid technology geek, attacks opportunities by building things, and currently leads the marketing function at SwiftStack. Prior to SwiftStack, he led the Sync team at BitTorrent, ran product management at Drobo, and held various product and marketing roles at Brocade and EMC. He proudly graduated from the University of San Francisco, where he captained their Division 1 Golf Team.