Scaling Agentless SRM

Storage resource monitoring (SRM) technology has evolved from networked systems that communicated with dedicated software installed on each host element to architectures that largely eliminated these ‘agents’. This “agentless” technology consolidated the dedicated software into a ‘collector’ that managed the communication between each host and the SRM application itself. As storage infrastructures grow in size and complexity, some agentless technologies may be proving too static to function efficiently in today’s dynamic storage environments; in fact, there has even been a resurgence in “agented” architectures. Either the enterprise is going to need to deal with agents again, or something beyond the currently available architectures will be needed to scale these agentless SRM solutions and maintain operability without overwhelming administrators.

Agents and Agentless Architectures

Storage resource monitoring was historically done using a specific piece of code, called an “agent”, which was installed with root privileges on each host or element in the environment. Agents would run as a daemon or Windows service and execute a set of privileged commands to generate the pertinent system information and display it, or transport it to an SRM application server. As a piece of software, the agent usually needed to be updated when the host was updated, and often when other interfacing elements in the environment were changed or updated as well. Potentially the biggest drawbacks of the agent-based architecture were deployment and maintenance: the time required and the disruption caused by installing and updating these agents were often significant, and in larger environments could seem like an almost continuous process.
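That basic flow can be sketched in a few lines of code. This is a hypothetical illustration, not any vendor’s actual agent; the command names and report format are assumptions:

```python
import subprocess

# Hypothetical sketch of what a host-resident SRM agent does: run a set
# of privileged commands locally, capture their output, and package the
# results for transport to the SRM application server. The command names
# here are illustrative, not any vendor's actual agent command set.
DEFAULT_COMMANDS = {
    "filesystems": ["df", "-k"],   # capacity per mounted file system
    "kernel": ["uname", "-a"],     # host OS and kernel version
}

def collect(commands=DEFAULT_COMMANDS):
    """Run each command on the local host and return a report dict that a
    real agent would ship to the SRM server on a schedule."""
    report = {}
    for name, argv in commands.items():
        try:
            proc = subprocess.run(argv, capture_output=True, text=True, timeout=30)
            report[name] = {"rc": proc.returncode, "stdout": proc.stdout}
        except (OSError, subprocess.TimeoutExpired) as exc:
            report[name] = {"rc": -1, "error": str(exc)}
    return report
```

Because this code lives on every host, any change to the commands or their output formats means redeploying it everywhere, which is exactly the maintenance burden described above.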

In an “agentless” architecture, the agents are removed from each individual host and their functions run on a central device. This “collector”, either resident on the SRM server or run as a separate server, essentially consolidates and ‘virtualizes’ the agents that were formerly on each host in the environment. In place of a proprietary piece of code (the agent), it uses standard protocols like SSH or WMI to gather information. The collector executes commands over a network connection instead of inside the host, removing some of the agent’s overhead, then parses the data it gathers and transports it to the SRM server.
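The parsing step the collector performs can be illustrated with a small example that turns raw `df -k`-style output, as gathered over an SSH session, into structured records. The field layout assumed here is the common Unix format, simplified for illustration:

```python
def parse_df_output(text):
    """Turn raw `df -k`-style output, as a collector would capture it
    after running the command over SSH, into structured records ready to
    ship to the SRM server. Simplified: assumes no spaces in mount paths."""
    records = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) < 6:
            continue  # skip wrapped or malformed lines
        records.append({
            "filesystem": parts[0],
            "kb_total": int(parts[1]),
            "kb_used": int(parts[2]),
            "kb_avail": int(parts[3]),
            "mount": parts[5],
        })
    return records
```

Note that this parse logic is still tied to one specific output format; the sections below look at what happens when the environment’s formats keep changing.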

Permutations and the Technology Stack

Independent monitoring tools have to maintain compatibility with each component or element in the environment, including the changes and updates that regularly occur within each device. But the interaction between elements can also affect how well a monitoring system functions. The technology stack in the modern enterprise is pretty deep and includes a range of hardware elements: servers, network cards, switches, HBAs and storage controllers. It also includes software elements like applications, operating systems, file systems, volume management software, multi-pathing software, etc. When these hardware and software elements are added up, a formidable number of combinations, or permutations, can result.
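A quick back-of-the-envelope calculation shows how fast these permutations multiply. The version counts below are purely illustrative:

```python
from math import prod

# Illustrative version counts only: each layer of the stack supports
# some number of versions the monitoring tool must recognize.
stack_versions = {
    "server OS": 8,
    "host HBA firmware": 5,
    "multi-pathing software": 4,
    "volume manager": 3,
    "file system": 4,
    "switch firmware": 6,
    "storage controller firmware": 7,
}

# Worst case, every combination of versions is a distinct permutation
# the tool may encounter somewhere in the environment.
combinations = prod(stack_versions.values())
print(combinations)  # → 80640
```

Even with modest version counts per layer, the total runs into the tens of thousands, which is why hard-coding support for each permutation does not scale.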

The collector carries out several tasks including authentication, network connection, execution of the commands themselves and parsing the different data streams that come from each element. Together, all the components in the technology stack that the collector needs to interface with drive a specific set of commands and rules required by the collector software. When changes occur within these components the collector must be updated as well. For example, mapping storage from an array to the host requires an awareness of a number of components, from the storage controller, through the switches, to the host HBA, and then a half dozen software components on the host, including the application. Each firmware change anywhere along that path can drive another update of the collector.
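The dependency can be sketched as a lookup keyed to exact element and firmware versions, a hypothetical simplification of the rules a traditional collector compiles in. The element names, versions and commands below are illustrative only:

```python
# Hypothetical hard-coded rules of the kind compiled into a traditional
# collector: the command set is keyed to an exact element type and
# firmware version, so a firmware change anywhere along the mapping
# path forces a collector software update.
COMMAND_RULES = {
    ("fc_switch", "fw7.4"): ["switchshow", "zoneshow"],
    ("fc_switch", "fw8.0"): ["switchshow", "zoneshow", "portcfgshow"],
    ("host_hba", "fw9.1"): ["hba_info -all"],
}

def commands_for(element, version):
    """Return the commands the collector must run for this element.
    An unknown (element, version) pair has no answer until a new
    collector release ships with updated rules."""
    try:
        return COMMAND_RULES[(element, version)]
    except KeyError:
        raise LookupError(f"no rules for {element} at {version}")
```

The moment any element in the environment moves to a version the table does not cover, monitoring for that element breaks until the collector itself is updated.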

Updating these versions on the collector can generate a new change-control event for the environment. So, while the agentless architecture has simplified some aspects of the update process, it still requires a lot of IT administrators’ time to keep the system current. As mentioned above, the permutations of versions and updates among the elements in this technology stack create a velocity of change that can overwhelm even an agentless architecture. What’s needed is an architecture that can keep up with the constant changes in the environment and do so without disrupting day-to-day operations.

Beyond Agentless Architectures

The collector architecture itself provides a potential improvement over software agents, but needs to be modified in order to move beyond the limitations of the agentless model in use by most vendors today. New technologies from companies like APTARE have removed the data parsing and interpretation logic from the collector software and stored this dynamic component as a rules definitions file which can be updated without triggering an update to the collector itself. These definitions files are stored in a web-hosted database and accessed by a portal which can automatically update the collector as needed, when changes are made to elements in the environment. Similar to the way antivirus software updates are delivered, rules definitions are pushed out to each collector through its associated web portal, automatically. No collector software revisions are installed, no change-control events are triggered and updates can take place as often as needed, without administrative intervention.
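A minimal sketch of the concept, with the rules file format and field names as assumptions rather than APTARE’s actual implementation: the parse logic lives in an externally updatable definitions file, so pushing a new file, antivirus-signature style, changes the collector’s behavior without a software revision:

```python
import json
import re

# Assumed rules file format (illustrative, not any vendor's actual
# schema): the parse logic for an element's output is data, not code.
RULES_FILE = """
{
  "lun_capacity": {
    "pattern": "^(?P<lun>\\\\S+)\\\\s+(?P<size_gb>\\\\d+)$"
  }
}
"""

def load_rules(text):
    """Compile the patterns in a rules definitions file. A pushed update
    replaces this file; the collector code itself never changes."""
    rules = json.loads(text)
    for rule in rules.values():
        rule["regex"] = re.compile(rule["pattern"], re.MULTILINE)
    return rules

def parse_with(rules, rule_name, raw_output):
    """Apply one named rule to raw command output from an element."""
    return [m.groupdict() for m in rules[rule_name]["regex"].finditer(raw_output)]
```

Because the collector only interprets whatever definitions it is handed, a change in an element’s output format is handled by shipping a new definitions file, not by reinstalling the collector.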

This centralized store of definitions files is shared by all client organizations, enabling the system to compile a very large number of element combinations – the permutations of the technology stack mentioned earlier. The result is a set of pertinent configurations that’s made available to each client organization, enabling it to benefit from the collective ‘knowledge’ of multiple environments. To exploit this shared knowledge base, the collector software has the ability to adapt and try other possible definitions files when changes in the environment warrant it. By leveraging some attributes of expert systems design, these new internet-enabled agentless architectures can ‘learn’ which available definitions best fit the data each element returns, instead of simply applying a value for each device from a static table.
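That fallback behavior might look something like the following sketch, where the patterns and scoring are assumptions: when output stops matching the current definition, the collector scores candidate definitions from the shared store and keeps whichever matches best:

```python
import re

# Assumed behavior, sketched: after an environment change (say, a
# firmware upgrade alters an array's output format), the collector
# tries other candidate definitions from the shared store. Patterns
# below are illustrative only.
CANDIDATE_DEFINITIONS = {
    "array_fw_5": re.compile(r"LUN (?P<lun>\d+): (?P<gb>\d+) GB"),
    "array_fw_6": re.compile(r"lun=(?P<lun>\d+)\s+size_gb=(?P<gb>\d+)"),
}

def best_definition(raw_output, candidates=CANDIDATE_DEFINITIONS):
    """Score each candidate by how many records it extracts from the
    raw output and return the winner, or None if nothing matches."""
    scores = {name: len(rx.findall(raw_output)) for name, rx in candidates.items()}
    winner = max(scores, key=scores.get)
    return (winner, scores[winner]) if scores[winner] else (None, 0)
```

A production system would of course weigh more than a match count, but the principle is the same: the collector adapts to what the element actually returns rather than failing on a static table miss.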

The impact for IT administrators is considerable. Instead of playing catch-up trying to keep a rigid SRM architecture current, administrators can let this new agentless architecture update itself, completely independently of the underlying environment that it is monitoring. And leveraging component information system-wide enhances its ability to operate autonomously.

In addition to reduced administrative overhead, another benefit of this architecture is potentially more accurate data from monitoring applications. Given the workload common in short-staffed IT organizations, the manual collector updates described above won’t always get implemented as often as they should be. So, like a car that’s out of tune, the monitoring information that’s presented may be incomplete or less than 100% accurate, and getting worse as each day passes. By essentially maintaining its own updates, an SRM platform running this new architecture can help assure IT managers that the information about their infrastructure is accurate and complete.

Storage environments are becoming increasingly complex, and traditional infrastructure monitoring applications are having a hard time keeping up. From agent-based to agentless architectures, SRM vendors are continually innovating to keep their systems easy to use and accurate. Each time an infrastructure changes, it drives a change in the command and interface information used by the monitoring application, which can trigger a disruptive change-control event. A new internet-enabled, agentless architecture like APTARE’s StorageConsole is based on technologies similar to those used by the anti-virus software industry, and may provide a way for SRM platforms to stay abreast of the dynamic environments they’re relied on to monitor.

Twelve years ago George Crump founded Storage Switzerland with one simple goal: to educate IT professionals about all aspects of data center storage. He is the primary contributor to Storage Switzerland and a heavily sought-after public speaker. With over 25 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS, SAN, virtualization, cloud and enterprise flash. Prior to founding Storage Switzerland he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration and product selection.
