When private cloud object storage vendors first appeared on the market, many of their products lacked core features. Some had no concept of enterprise access control, encryption or versioning, while others had no multi-region support. In a world where customers are increasingly interested in self-protecting storage, releasing new storage products without such features seems odd.
Yet vendors continued to release, and customers continued to buy, object storage products that were missing key features. Besides data protection, another glaring gap in many products was any concept of Key Performance Indicators (KPIs) and capacity monitoring and planning. Any mature storage product should provide these capabilities, as many non-object storage products already do. Yet until now those features were unavailable in major object storage offerings.
Scality Cloud Monitor
Scality recently announced its new Scality Cloud Monitor, which it describes as a turnkey monitoring solution. It has two levels of support: Standard and Dedicated Care Services (DCS). The standard features are available to anyone with a support contract, and the DCS version is available to those with an existing DCS contract. It is not a separately licensed product; it is simply included in your service contract.
The Standard version monitors 15 metrics that look at overall system health, covering capacity planning and the detection of significant issues that would cause a major outage. These metrics include disk storage capacity, global RING health (RING is what Scality calls its object storage cluster), global RING configuration and any problems with collecting the metrics themselves.
The DCS version monitors over 100 metrics and associated KPIs, and looks at all aspects of the health of a Scality RING. In addition to the metrics included in the Standard version, Scality adds alarms on the capabilities of individual disks, nodes, and connectors. It raises alarms and alerts for problems discovered during a health check and for anything that would make services unavailable. It also provides a Service Monitor, which emulates an end user to test the end-to-end availability of the system.
If a customer opts for the DCS version, which is included as part of their DCS contract, Scality offers a 100 percent availability guarantee. The Scality RING design allows it to provide storage services without any downtime for planned maintenance, and it dynamically rebalances resources after a server, disk or data center failure. The new Scality Cloud Monitor product acts as an insurance layer against unplanned downtime by continuously monitoring the environment to make sure it is running optimally. Scality said that a very large telecommunications vendor has reported zero downtime since it installed Scality in 2010, while growing to 25 times its original capacity.
Scality described the use of artificial intelligence and machine learning to notice when a system's behavior differs from what it used to be. For example, the company said the product noticed that one customer’s Physdeletes (the physical deletion of unused objects, also known as garbage collection) had dropped from 4000 deletes per second to zero. While this didn’t create an immediate problem, it did indicate that something had changed. In this case, the customer discovered upon investigation that they had deactivated the feature for planned maintenance and had forgotten to re-enable it. The Scality Cloud Monitor product detected this change and reported it before it did any harm. (Had it gone unresolved, it would have led to an out-of-space condition, which may have caused downtime.)
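To make the Physdeletes example concrete, here is a minimal sketch of how a monitor might flag a rate that suddenly departs from its recent baseline. Scality has not described how its machine learning works, so this is a simple statistical stand-in and not the product's method; the function name, window size, threshold and sample values are all hypothetical.

```python
from statistics import mean, pstdev


def find_rate_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate sharply from the trailing window.

    Hypothetical helper for illustration only; not Scality's algorithm.
    """
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = pstdev(baseline)
        # Floor the spread at 5% of the baseline mean so a very steady
        # metric does not trigger on tiny fluctuations or divide by zero.
        spread = max(sigma, 0.05 * abs(mu), 1e-9)
        if abs(samples[i] - mu) / spread > threshold:
            anomalies.append(i)
    return anomalies


# Made-up deletes-per-second readings: steady near 4000, then zero.
deletes_per_second = [4010, 3985, 4002, 3990, 4015, 4000, 0, 0]
print(find_rate_anomalies(deletes_per_second))
```

With these made-up readings the sketch flags only the first zero reading, because the following one is compared against a window that already contains the drop. That is one reason a real monitor would hold an alert open until the metric recovers rather than rely on a single point-in-time check.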
It is very encouraging that companies like Scality are moving up the maturity model and beginning to offer real-time performance and capacity monitoring to their customers. Customers who want to use their object storage for something more than a bit bucket will want the system to have availability similar to that of traditional storage. Continuous monitoring with a human feedback loop is the only way to ensure that.