As organizations come to grips with new regulations like the European Union’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), their need for a more robust data management strategy is obvious. Organizations can no longer keep expanding primary storage systems on the assumption that they will keep all data forever. In response, many create a data management strategy that attempts to “manage it all,” an approach that proves to be both complex and expensive.
The “manage it all” approach is a direct offshoot of the “keep it all” storage architecture. The organization has limited insight into the data it is storing, its value to the organization, and its relevance to the various regulations. Not all data needs managing, since a large percentage of data has no value to the organization and doesn’t pertain to the various rules. The reason most organizations don’t manage only the data they are required to is that they don’t know which data that is. Finding the specific data that does need managing is like finding a needle in a haystack. Another challenge is that most organizations have their data distributed across many storage systems and several sites. Most data management solutions can’t manage data across multiple sites and across the various types of storage.
Index Engines – The Power to Know What You Have
To classify data it must be indexed, and the indexing process must span storage systems across the organization’s locations and into the cloud. The indexing process must also be fast, so it doesn’t become part of the problem or take so long to update that the index is almost always out of date.
Index Engines, as the name implies, is built from the ground up to rapidly scan data and create an index based on file attributes and the information inside each file. The company claims its indexers can scan and process up to 1TB of file data per hour and produce indexes that are less than 5% of the size of the data set. The organization can install multiple indexers to scale scanning throughput to meet business needs. Most importantly, the Index Engines solution can also scan across various storage systems, into backup data, and across cloud resources, making it possible to create a universal index and repository for all of the organization’s data.
Index Engines started as a solution for backup tape and catalog management. Despite it not being a best practice, many organizations use their backup applications for long-term data retention. Index Engines delivers direct access to backup data as well as catalog management, essentially serving as the archive overlay for backup data. It is also ideal for organizations migrating between backup applications, eliminating the need to keep a server running the older application just to restore legacy backups. Index Engines can even perform ad-hoc restoration of data to support legal and compliance requests.
Its beginnings as an indexer of backup data laid the groundwork for future use cases like indexing primary data stores and cloud stores. The solution now provides end-to-end indexing of the organization’s data footprint. The organization, thanks to Index Engines, finally understands what it is storing.
Index Engines leverages its petabyte-class indexing technology to organize data based on value. It uses comprehensive reporting and analysis to identify data that is redundant, obsolete, or trivial (ROT analysis). This analysis identifies a large percentage of an organization’s data set that is out of scope for both organizational retention and legal compliance. This ROT data can either be deleted or moved to very inexpensive media like tape.
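The idea behind ROT classification can be sketched in a few lines. The thresholds, file extensions, and bucket names below are illustrative assumptions for this sketch, not Index Engines’ actual policy language; a duplicate content hash marks a file redundant, a scratch extension marks it trivial, and a long-untouched access time marks it obsolete:

```python
import hashlib
import os
import time

# Hypothetical thresholds and extensions, chosen for illustration only.
OBSOLETE_AFTER_SECONDS = 3 * 365 * 24 * 3600  # untouched for ~3 years
TRIVIAL_EXTENSIONS = {".tmp", ".log", ".bak"}

def classify_rot(root):
    """Walk a directory tree and bucket each file as redundant,
    obsolete, trivial, or in-scope."""
    seen_hashes = set()
    buckets = {"redundant": [], "obsolete": [], "trivial": [], "in_scope": []}
    now = time.time()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            if digest in seen_hashes:
                buckets["redundant"].append(path)   # duplicate content
                continue
            seen_hashes.add(digest)
            ext = os.path.splitext(name)[1].lower()
            if ext in TRIVIAL_EXTENSIONS:
                buckets["trivial"].append(path)     # scratch/temp data
            elif now - os.path.getatime(path) > OBSOLETE_AFTER_SECONDS:
                buckets["obsolete"].append(path)    # long untouched
            else:
                buckets["in_scope"].append(path)    # needs real management
    return buckets
```

A production indexer would of course classify on far richer metadata and content signals, and at petabyte scale, but the output is the same: a small in-scope set and a large ROT set that can be deleted or demoted to cheap media.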
After classification, Index Engines isolates in-scope data. It indexes within files to find data that is sensitive or contains personal information, using keyword, pattern, and conceptual search to isolate a highly sensitive data set.
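The pattern-search part of that pipeline can be illustrated with simple regular expressions. The patterns below are deliberately minimal assumptions for this sketch; a real solution combines keyword, pattern, and conceptual search with far more robust detectors:

```python
import re

# Illustrative PII patterns only, not a production-grade detector.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}

def scan_for_pii(text):
    """Return {pattern_name: [matches]} for every pattern that fires."""
    hits = {}
    for name, pattern in PII_PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[name] = found
    return hits
```

Running every in-scope file through scanners like this yields the sensitive subset that retention and privacy policies must govern.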
Another critical capability of the Index Engines solution is to take action on the data it manages. The solution can migrate, archive, or delete content based on organizational policies. The software can also proactively audit access to sensitive material, ensuring that the employees accessing the data actually have the appropriate permissions.
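Policy-driven action can be pictured as an ordered rule table evaluated top down. The policy names, predicates, and metadata fields here are hypothetical, invented for this sketch rather than drawn from Index Engines’ actual policy engine:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Policy:
    name: str
    applies: Callable[[dict], bool]  # tests a file's metadata record
    action: str                      # "migrate", "archive", or "delete"

# Hypothetical policy table: first matching rule wins.
POLICIES = [
    Policy("purge-rot", lambda rec: rec["category"] == "rot", "delete"),
    Policy("cold-archive", lambda rec: rec["age_days"] > 365, "archive"),
]

def decide_action(record: dict) -> Optional[str]:
    """Return the action of the first matching policy, or None to
    leave the file in place."""
    for policy in POLICIES:
        if policy.applies(record):
            return policy.action
    return None
```

Ordering the rules matters: the ROT purge rule fires before the age-based archive rule, so obsolete junk is deleted rather than paid for on archive media.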
In the past, IT typically viewed data management as a complicated way to save money. Evaluators often determined that the cost savings wasn’t worth the complexity. As a result, they continued to expand primary storage. Today, although a robust data management solution will still save the organization money, it is also critical to maintain compliance with regulations like GDPR and CCPA. Index Engines is expanding its capabilities with Machine Learning and advanced analytics to detect cyber attacks like ransomware or unauthorized access to files.
Index Engines’ advantage is the speed with which it can index combined with its powerful analytics. While other solutions can move data and save organizations money, Index Engines brings in an intelligence layer that enables the organization to remain in compliance, protect itself from ransomware and extract more value than ever from its digital assets.