It seems like backup vendors forgot about protecting NoSQL environments. Environments like Cassandra, MongoDB, Hortonworks, Couchbase, and Hadoop all need point-in-time protection. One reason for the lack of data protection solutions for these environments is the widespread assumption that these databases protect themselves. Another is that legacy data protection solutions lack an architecture suited to the massive data sets and eventually consistent data model that many NoSQL environments have.
NoSQL Can’t Back Up Itself
Most NoSQL environments provide a replication model, where each data set is copied two or three times to other nodes in the NoSQL cluster. This type of protection provides excellent recovery from situations like loss of media within a node, loss of an entire node and even loss of an entire data center. Replication, however, does not protect against data corruption caused by ransomware, user error or malicious user activity, because every change, good or bad, is copied to all replicas. Replication also does not provide any copy data management functionality for test/dev, cloud migrations or data archiving.
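The difference between replication and a point-in-time copy can be shown with a toy model. The sketch below is illustrative only: `ReplicatedStore` is a hypothetical in-memory class, not the API of any real NoSQL database, and it assumes writes are applied to every replica.

```python
import copy


class ReplicatedStore:
    """Hypothetical toy key-value store that applies each write to all replicas."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]

    def write(self, key, value):
        # Replication copies every write to every replica, good or bad.
        for replica in self.replicas:
            replica[key] = value

    def snapshot(self):
        # A point-in-time copy that lives outside the cluster.
        return copy.deepcopy(self.replicas[0])


store = ReplicatedStore()
store.write("order:1", "2 widgets")
backup = store.snapshot()            # taken before the bad write

store.write("order:1", "ENCRYPTED")  # stands in for ransomware or user error

print([r["order:1"] for r in store.replicas])  # all replicas hold the bad value
print(backup["order:1"])                       # the backup holds the earlier value
```

Losing one replica would not matter here, but no replica can undo the bad write. Only the separate snapshot can.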
The NoSQL Data Management Challenge
NoSQL data sets are massive, as is the compute required to answer queries against them. As a result, most NoSQL environments are scale-out. Once an organization understands the importance of keeping a separate, standalone backup of these environments, it not only has to find a solution specifically designed to protect them, it also has to figure out the best time to run the backup process. Many of these environments run 24×7, so finding “quiet time” for backup is very difficult. The “quiet time” can also shift with usage, which makes it even harder to find.
The NoSQL Ransomware Challenge
Because much of their data tends to be unstructured, NoSQL databases are prime targets for a ransomware attack. Point-in-time backups provide an ideal recovery point, but identifying that an attack is underway and stopping it is also critical. Additionally, since many ransomware programs now attack slowly to avoid detection, realizing that an attack is occurring may be difficult.
Ransomware attacks are more difficult to identify in NoSQL because of the number of files a NoSQL environment typically manages compared to a legacy database, which has only a few files. When ransomware attacks a traditional database, the impact is noticeable almost immediately. In a NoSQL environment, an attack may run for weeks before it is detected, which means the malware and the corruption it causes may have reached multiple, if not all, backup copies.
Imanis 4.0 with SmartPolicies and ThreatSense
Imanis Data is a scale-out data protection solution purpose-built for NoSQL. Currently, it protects Cassandra, MongoDB, Hortonworks, Couchbase, Vertica and many others. The solution, however, goes beyond just protection; it also provides data orchestration to help with copy data management, and automation to help with ransomware protection and RPO/RTO (Recovery Point Objective/Recovery Time Objective) optimization. The 4.0 release focuses on these last two categories.
Imanis Data’s 4.0 release provides organizations with SmartPolicies. These policies enable autonomous, RPO-based backup powered by the software’s own machine learning capabilities. The software monitors NoSQL cluster workloads and utilization to determine backup frequency and how to prioritize resources, which helps organizations meet their RPOs more reliably. It also continuously monitors the environment to see whether schedules need to adjust to maintain RPO adherence, and it provides guidance on the additional resources required to keep meeting desired RPOs in the future. With Imanis Data 4.0, IT sets the desired RPO, and the software determines the best time to perform the backup and how often to protect the environment.
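To make the idea of RPO-driven scheduling concrete, here is a minimal sketch. It is not Imanis Data’s algorithm, which the vendor describes only as machine-learning based. The function name `plan_backups`, the hourly load forecast and the simple greedy rule are all assumptions made for illustration: given a forecast of cluster load per hour and an RPO in hours, it picks the quietest hour inside each allowed window so that no gap between backups exceeds the RPO.

```python
def plan_backups(hourly_load, rpo_hours):
    """Return backup start hours such that no gap exceeds rpo_hours.

    hourly_load: hypothetical forecast of cluster utilization, one value per hour.
    rpo_hours:   the desired Recovery Point Objective, in whole hours.
    """
    if rpo_hours < 1:
        raise ValueError("rpo_hours must be at least 1")

    schedule = []
    last = -1  # assumption: a backup completed just before hour 0
    horizon = len(hourly_load)

    # Keep scheduling while the next backup is due inside the forecast.
    while last + rpo_hours < horizon:
        # Any hour in this window keeps the gap within the RPO.
        window = range(last + 1, last + rpo_hours + 1)
        # Choose the least-loaded hour in the window.
        best = min(window, key=lambda hour: hourly_load[hour])
        schedule.append(best)
        last = best
    return schedule


# Hypothetical utilization forecast (percent) for the next 12 hours.
load = [80, 75, 20, 60, 90, 85, 30, 55, 70, 95, 40, 65]
print(plan_backups(load, rpo_hours=4))
```

A greedy choice like this favors quiet hours over the smallest number of backups. A real product would also need to account for backup duration and re-plan as observed load drifts from the forecast.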
Imanis Data’s ThreatSense analyzes protected data for anomalies that might indicate a ransomware attack or abnormal user behavior. It can detect an abnormal number of file changes over a specified period, which may indicate a ransomware attack. It can also detect a large deletion of data, which may indicate a rogue user.
In 4.0, the feature is improved so that users can give feedback on the anomalies ThreatSense reports. The feature then “learns” from the human input and updates the anomaly model.
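One simple way to flag “an abnormal number of file changes” is a z-score test against recent history. The sketch below is a generic statistical technique, not ThreatSense’s actual model. The function name `is_anomalous`, the threshold of three standard deviations and the sample counts are all assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history, current, threshold=3.0):
    """Flag `current` if it is more than `threshold` standard deviations
    above the mean of `history` (changed-file counts per backup interval)."""
    if len(history) < 2:
        return False  # not enough data to estimate the spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # History is perfectly flat, so any increase stands out.
        return current > mu
    return (current - mu) / sigma > threshold


# Hypothetical counts of changed files seen in the last seven backups.
changes_per_backup = [120, 135, 110, 128, 140, 117, 131]

print(is_anomalous(changes_per_backup, 133))   # an ordinary interval
print(is_anomalous(changes_per_backup, 2400))  # a sudden burst of changes
```

The same test applied to deleted-file counts would cover the mass-deletion case. A slow attack that raises the count only slightly each interval would slip under a fixed threshold, which is one reason detection in practice is harder than this sketch suggests.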
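As a rough illustration of a feedback loop of this kind, the sketch below folds an event the user marks as benign back into the baseline, so that similar events are no longer flagged. How ThreatSense actually updates its model is not described in the source. The class `FeedbackDetector`, its methods and the numbers are hypothetical.

```python
from statistics import mean, stdev


class FeedbackDetector:
    """Hypothetical z-score detector whose baseline is adjusted by user feedback."""

    def __init__(self, baseline, threshold=3.0):
        self.baseline = list(baseline)  # needs at least two observations
        self.threshold = threshold

    def score(self, value):
        # Distance from the baseline mean, in standard deviations.
        mu = mean(self.baseline)
        sigma = stdev(self.baseline)
        if sigma == 0:
            return 0.0 if value == mu else float("inf")
        return abs(value - mu) / sigma

    def is_anomaly(self, value):
        return self.score(value) > self.threshold

    def feedback(self, value, is_real_threat):
        # A confirmed threat stays out of the baseline so it keeps being flagged.
        # A false positive is absorbed, which widens what counts as normal.
        if not is_real_threat:
            self.baseline.append(value)


detector = FeedbackDetector([100, 105, 95, 102, 98])

before = detector.is_anomaly(400)             # e.g. a legitimate monthly bulk load
detector.feedback(400, is_real_threat=False)  # the user marks it as benign
after = detector.is_anomaly(400)

print(before, after)
```

Absorbing a single outlier widens the baseline considerably, so a production system would more likely weight feedback or track labeled event types separately.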
The 4.0 release also adds “any point-in-time recovery” to all of the Imanis Data protected environments. Some environments don’t have a point-in-time recovery capability at all, and in others, the capability is difficult to use. Imanis Data makes it GUI-driven. With this capability, IT can recover a NoSQL environment to any specified point in time without having to learn how to perform the function for each specific environment.
As NoSQL environments become mainstays in production and more critical to the organization, protecting them must become an operational function rather than a NoSQL administrative function. Imanis Data’s 4.0 release provides the organization with that specific capability. IT operations can protect these environments as part of their standard data protection workflows and only have to learn one application to protect the various NoSQL environments the organization may have.
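Conceptually, recovering to an arbitrary point in time usually means starting from the latest full copy taken at or before the target and replaying later changes up to it. The sketch below shows that general technique under stated assumptions: `restore_to`, the snapshot list and the change-log format are hypothetical and do not describe Imanis Data’s implementation.

```python
from bisect import bisect_right


def restore_to(snapshots, change_log, target_time):
    """Rebuild the data set as it was at target_time.

    snapshots:  list of (time, dict) full copies, sorted by time.
    change_log: list of (time, key, value) changes, sorted by time;
                a value of None records a deletion.
    """
    times = [t for t, _ in snapshots]
    # Index of the latest snapshot taken at or before target_time.
    idx = bisect_right(times, target_time) - 1
    if idx < 0:
        raise ValueError("no snapshot at or before target_time")

    snap_time, data = snapshots[idx]
    state = dict(data)  # work on a copy so the snapshot stays intact

    for t, key, value in change_log:
        if t <= snap_time:
            continue  # already reflected in the snapshot
        if t > target_time:
            break     # past the requested point in time
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


# Hypothetical backups: full copies at t=0 and t=10, plus a change log.
snapshots = [(0, {"a": 1}), (10, {"a": 1, "b": 2})]
change_log = [(4, "b", 2), (12, "a", 5), (15, "b", None), (18, "c", 9)]

print(restore_to(snapshots, change_log, 16))
print(restore_to(snapshots, change_log, 5))
```

The first call starts from the t=10 copy and replays two changes. The second starts from the t=0 copy and replays one.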