The Challenges of Large Splunk Datasets

Posted on April 24, 2019 by George Crump

Splunk enables organizations to analyze the data that IT operations, security and the various lines of business create. The more of this data that is available for analysis, the better the decisions IT can make. Ideally, organizations want their Splunk environment to retain all information forever. While Splunk can maintain and quickly analyze data no matter how large the dataset, very large Splunk datasets create significant challenges that often force an organization to scale down its environment and leave much of its potential untapped.

The Cost of Large Splunk Datasets

To deliver acceptable search performance, most organizations attempt to keep all their data on flash or at least on high-speed hard disk drives (HDD). Both of these storage media are expensive compared to alternatives like high-capacity disk. Another, less obvious cost factor is the way the Splunk infrastructure is designed. Most of these architectures are built from a group of servers clustered together, with each server acting as a node. Each node has its own internal storage, compute and networking.

Every node added to the cluster brings additional processing power, storage capacity and network bandwidth, whether or not all three are needed. The problem is that these resources are not consumed at the same pace. Most Splunk expansions are caused by a lack of capacity, not a lack of compute. As a result, when the Splunk cluster scales, its resource utilization falls out of balance and one of its most expensive resources, CPU, is massively underutilized. Also, the more nodes that are added, the more expensive the supporting network infrastructure becomes. The cluster consumes more and more ports and requires more and more switches, which further adds to the expense of the environment.
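The imbalance described above can be illustrated with a small back-of-the-envelope calculation. This is only a sketch: the function name `nodes_needed` and every figure passed to it (terabytes and cores per node, required capacity and compute) are hypothetical values chosen for illustration, not measurements from a real Splunk deployment.

```python
import math


def nodes_needed(required_tb, required_cores, tb_per_node, cores_per_node):
    """Size a cluster in which every node bundles fixed storage and compute.

    Returns (node_count, cpu_utilization, capacity_utilization).
    """
    by_capacity = math.ceil(required_tb / tb_per_node)
    by_compute = math.ceil(required_cores / cores_per_node)
    # The scarcer resource dictates how many nodes must be purchased.
    nodes = max(by_capacity, by_compute)
    cpu_util = required_cores / (nodes * cores_per_node)
    cap_util = required_tb / (nodes * tb_per_node)
    return nodes, cpu_util, cap_util


# Hypothetical figures, for illustration only.
nodes, cpu_util, cap_util = nodes_needed(
    required_tb=400, required_cores=96, tb_per_node=10, cores_per_node=32
)
print(f"{nodes} nodes, CPU {cpu_util:.1%} used, capacity {cap_util:.1%} used")
```

With these assumed inputs the cluster needs 40 nodes to satisfy capacity but only 3 to satisfy compute, so roughly 7.5% of the purchased CPU is actually used.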

The Complexity of Large Splunk Datasets

The rapid expansion of nodes to meet capacity demands creates a more complex environment. Administrators need to make sure that data protection is running properly across all the nodes in the cluster. Data protection also increases the cost considerably, because the data protection algorithm stores its protected copies on the same node and storage class as the primary copy of data. IT planners need to take note: since every additional period of retained data is also stored as protected copies, the demand to keep data for longer and longer periods multiplies the cost of the environment.
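To make the effect of retention and protected copies concrete, here is a minimal sketch. The function `protected_capacity_tb`, the daily ingest of 500 GB and the three copies are assumptions made for illustration; the calculation also ignores compression and index overhead, which a real sizing exercise would need to include.

```python
def protected_capacity_tb(daily_ingest_gb, retention_days, copies):
    """Raw capacity in TB (1 TB = 1000 GB) needed to hold every copy."""
    return daily_ingest_gb * retention_days * copies / 1000


# Hypothetical workload: 500 GB/day, three copies of all data.
for days in (90, 365, 1825):
    tb = protected_capacity_tb(daily_ingest_gb=500, retention_days=days, copies=3)
    print(f"{days:>5} days retained -> {tb:,.1f} TB raw")
```

Because every copy lands on the same storage class as the primary data, each extension of the retention window is paid for several times over at the premium price.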

The expansion of nodes also means additional network complexity. While the Splunk compute environment does expand by “just adding a node,” adding that node means fitting it into an already crowded data center. Sometimes adding a node means moving other equipment or even forcing the retirement of some. Identifying and configuring network cabling and connections also becomes increasingly complicated as nodes are added.

The Impact of Large Splunk Environments

Users of the Splunk environment want to expand usage to include more datasets, and they want to increase the retention of those datasets, potentially for years if not decades. The cost and complexity of delivering on these expectations are a significant roadblock for IT, but one that must be overcome for the true value of Splunk to be realized.

What IT Needs

IT needs a way to strategically distribute Splunk datasets across the platforms best suited to where that data is in its lifecycle. Ideally, the primary Splunk tier requires only a small capacity relative to the secondary tier. A small hot tier also means that organizations can afford to make that tier all-flash to speed the time to answers. Fortunately, Splunk has delivered an important first step with its SmartStore architecture, which was introduced in Splunk Enterprise 7.2. Companies like SwiftStack are supporting it so that the Splunk environment can scale efficiently. Indexer nodes now scale independently from capacity nodes.
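A rough sketch of why the hot tier can stay small: if only the most recent data must live on the primary tier, its size depends on the hot window rather than on total retention. The function `split_tiers`, the 30-day hot window and the ingest rate are hypothetical, and the model deliberately says nothing about how SmartStore itself places or caches data.

```python
def split_tiers(daily_ingest_gb, retention_days, hot_days):
    """Split retained data (in TB, 1 TB = 1000 GB) between a hot and a secondary tier."""
    hot_days = min(hot_days, retention_days)
    hot_tb = daily_ingest_gb * hot_days / 1000
    secondary_tb = daily_ingest_gb * (retention_days - hot_days) / 1000
    return hot_tb, secondary_tb


# Hypothetical workload: 500 GB/day kept for one year, 30 days kept hot.
hot_tb, secondary_tb = split_tiers(daily_ingest_gb=500, retention_days=365, hot_days=30)
hot_share = hot_tb / (hot_tb + secondary_tb)
print(f"hot: {hot_tb:.1f} TB, secondary: {secondary_tb:.1f} TB, hot share: {hot_share:.1%}")
```

Under these assumptions, extending retention grows only the secondary tier, so the all-flash portion stays the same size no matter how long data is kept.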

Our next blog takes a deeper dive into Splunk SmartStore and how it enables a more disaggregated scale-out architecture where performance and capacity can scale independently of each other. In the meantime, register for our on-demand webinar with SwiftStack, “Rearchitecting Storage for the Next Wave of Splunk Data Growth,” and receive our latest eBook, “Doing More With Splunk Data…For Less”.

About George Crump

George Crump is the Chief Marketing Officer at VergeIO, the leader in Ultraconverged Infrastructure. Prior to VergeIO he was Chief Product Strategist at StorONE. Before assuming roles with innovative technology vendors, George spent almost 14 years as the founder and lead analyst at Storage Switzerland. In his spare time, he continues to write blogs on Storage Switzerland to educate IT professionals on all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 30 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS, SAN, Virtualization, Cloud, and Enterprise Flash. Before founding Storage Switzerland, he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration, and product selection.
