Splunk’s recent blog describing Splunk Enterprise 7.2 with SmartStore states that the new data management model “makes it incredibly cost efficient to achieve longer data retention at scale”. As a provider of storage that is compatible with Splunk SmartStore, SwiftStack is being asked by Splunk users to help quantify these cost savings. What does “incredibly cost efficient” mean in practice? Fortunately, the simple yet powerful design of SmartStore makes answering this question a relatively straightforward exercise.
Splunk SmartStore and how SwiftStack fits
First, let’s review the fundamentals of the solution design. SmartStore separates the compute and storage tiers in a Splunk Enterprise deployment, allowing each tier to be tailored to the specific environment and workload. Under the hood, SmartStore’s caching algorithms keep the data most likely to be searched (i.e., hot) on the fastest storage media while placing all other data (i.e., warm) on denser, lower-cost drives, with data moved back and forth based on operational patterns. If searches typically cover the last 30 days, then the SmartStore hot tier would be sized to hold 30 days of data. Once data is rolled into the warm tier, the bucket metadata is left in the hot tier so that Splunk Enterprise can determine which buckets a search needs. Any warm buckets the search requires are then copied back into the hot tier, and the search results are returned. This right-sizing of compute, combined with the optimized placement of data on storage, forms the basis of SmartStore’s cost savings equation.
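To make the hot/warm interplay concrete, here is a minimal Python sketch of the idea described above: bucket metadata always stays on the indexer, bucket data can be evicted from the local cache, and a search pulls back only the buckets its time range needs. This is a toy model, not Splunk's implementation. The class name `BucketCache`, its methods, the dictionary standing in for the object store, and the least-recently-used eviction rule are all assumptions made for illustration.

```python
from collections import OrderedDict


class BucketCache:
    """Toy indexer cache (hypothetical): metadata stays local, bucket data is evicted LRU."""

    def __init__(self, capacity, remote):
        self.capacity = capacity    # max number of bucket payloads kept locally
        self.remote = remote        # dict standing in for the S3-compatible store
        self.metadata = {}          # bucket_id -> (earliest, latest), never evicted
        self.local = OrderedDict()  # bucket_id -> payload, oldest use first

    def add_bucket(self, bucket_id, time_range, payload):
        self.metadata[bucket_id] = time_range
        self.remote[bucket_id] = payload  # the remote store holds every bucket
        self._cache(bucket_id, payload)

    def _cache(self, bucket_id, payload):
        self.local[bucket_id] = payload
        self.local.move_to_end(bucket_id)
        while len(self.local) > self.capacity:
            self.local.popitem(last=False)  # evict locally; the remote copy remains

    def search(self, start, end):
        """Return (buckets searched, buckets fetched from the remote store)."""
        searched, fetched = [], []
        for bucket_id, (earliest, latest) in self.metadata.items():
            if latest < start or earliest > end:
                continue  # metadata alone rules this bucket out
            if bucket_id in self.local:
                self.local.move_to_end(bucket_id)  # mark as recently used
            else:
                fetched.append(bucket_id)
                self._cache(bucket_id, self.remote[bucket_id])
            searched.append(bucket_id)
        return searched, fetched


store = {}
cache = BucketCache(capacity=2, remote=store)
for day in range(4):
    cache.add_bucket(f"b{day}", (day, day), f"events-{day}")

print(cache.search(3, 3))  # recent bucket is served from the local cache
print(cache.search(0, 0))  # old bucket is located via metadata, then fetched
```

The point of the sketch is that a search over old data costs one fetch from the object store rather than requiring every bucket to live on indexer disks.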
SwiftStack’s storage software manages the Splunk warm data volumes, which commonly comprise more than 90% of the total data. SwiftStack attaches to Splunk SmartStore through the S3 API, provides the failure resiliency mechanisms to keep Splunk data safe, and scales non-disruptively as Splunk data sets expand. And, to keep the infrastructure consistent and easy to manage, SwiftStack software can run on the same server platform that’s used for the Splunk indexers.
Splunk SmartStore and SwiftStack change the cost equation
Now, let’s look at the economics of a representative Splunk Enterprise shop with a 1 TB per day ingest rate and a 365-day retention period (other inputs: replication factor = 2, and compression ratio = 50%). We’ll use a before-and-after graphic to illustrate the impact SmartStore has on the underlying hardware assets. Remember, both configurations address the same set of requirements and will deliver comparable search performance.
- In a “classic” Splunk Enterprise deployment (i.e., pre-SmartStore), the infrastructure on the left side of the graphic would be required. That’s 31 indexer servers chock full of SSD drives. The compute and storage functions are tightly coupled, limiting flexibility in how scaling occurs.
- The right side of the graphic shows the infrastructure for Splunk Enterprise 7.2 with SmartStore. The indexer count is reduced to 8, and only a thin layer of SSD drives is required to create a cache sized for 1 week’s worth of ingest (hot). SwiftStack software running on mid-range servers with dense 10 TB or 12 TB hard drives stores the warm data buckets, which will exceed 90% of all Splunk data in this example. Now, with SmartStore, as Splunk usage grows, the compute and storage tiers (built with SwiftStack) can be scaled independently.
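The capacity arithmetic behind this comparison can be sketched in a few lines of Python. The inputs come from the example above; everything else is an assumption made for illustration: the `Workload` class and function names are hypothetical, "compression ratio = 50%" is read as on-disk size being half of raw ingest, the cache is assumed to hold a single copy of one week of data, and the object store's own data protection overhead is left out. The sketch estimates storage capacity only and does not attempt to reproduce the 31 and 8 server counts, which also depend on drive sizes and compute requirements.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    ingest_tb_per_day: float = 1.0  # raw daily ingest (from the example)
    retention_days: int = 365       # retention period (from the example)
    replication_factor: int = 2     # from the example
    compression_ratio: float = 0.5  # assumed meaning: on-disk size / raw size
    cache_days: int = 7             # cache sized for 1 week of ingest


def classic_indexer_storage_tb(w: Workload) -> float:
    """Classic model: every replicated copy of every bucket sits on indexer disks."""
    daily_on_disk = w.ingest_tb_per_day * w.compression_ratio
    return daily_on_disk * w.retention_days * w.replication_factor


def smartstore_cache_tb(w: Workload, copies: int = 1) -> float:
    """Indexer-local cache; `copies` is an assumed number of cached copies."""
    return w.ingest_tb_per_day * w.compression_ratio * w.cache_days * copies


def remote_store_tb(w: Workload) -> float:
    """One logical copy in the object store, before its own protection overhead."""
    return w.ingest_tb_per_day * w.compression_ratio * w.retention_days


w = Workload()
classic = classic_indexer_storage_tb(w)
cache = smartstore_cache_tb(w)
remote = remote_store_tb(w)
warm_share = remote / (remote + cache)

print(f"classic indexer storage: {classic:.1f} TB")  # 365.0 TB
print(f"SmartStore cache:        {cache:.1f} TB")    # 3.5 TB
print(f"remote object store:     {remote:.1f} TB")   # 182.5 TB
print(f"warm share of data:      {warm_share:.0%}")
```

Under these assumptions, the remote store holds well over 90% of the data, which is consistent with the figure quoted in the example.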
What’s in it for IT?
In addition to compelling hard-dollar cost savings, the separation of compute and storage provides operational advantages. Now that the warm data resides on SwiftStack, indexers can be added to the Splunk Enterprise cluster in minutes instead of hours. A new indexer only needs to fetch the metadata for each warm bucket rather than the full bucket; once its metadata is synchronized, it can begin to participate in the cluster. Removing an indexer is also faster, because it only has to roll its small hot buckets to warm before it can be taken out of the cluster.
The usage of metadata allows for a quick and efficient way to deploy, expand, rebuild and manage Splunk Enterprise architectures. The net effect of the release of SmartStore is a significant reduction in hardware and administration costs for almost any Splunk Enterprise customer.
It’s all about the data
Every organization wants to do more with its data, to gain business insights and create competitive advantages. Splunk is the analytics engine that makes these outcomes possible. And, as this analysis demonstrates, the cost and complexity of the infrastructure are no longer inhibitors to realizing all the value Splunk can deliver.
Eric Rife is the Director of Solutions Architects at SwiftStack.