Data monetization is the art of extracting value from an organization’s existing data assets. For example, the monetary value of data can come from creating a new video product from an old one, or from analyzing years’ worth of sensor data to find better ways to build products or deliver services. The problem is that, in many cases, the process used to derive value from a data asset is not the process that created it. Also, these analyses sometimes need to span multiple data sets from unrelated sources. Data collection therefore becomes critical to extracting value.
Variety is the Spice of Data Monetization
An organization can capture data from a variety of processes and devices. Data can feed in from sensors that connect to the data center through the Internet. This use of sensors, also known as the Internet of Things (IoT), is becoming increasingly commonplace for organizations of all types and sizes. In addition, data can come from video feeds, or it can simply come from existing databases or users.
The point is that there is a wide variety of sources, and each of these sources uses different protocols to transfer and store the data it captures and creates. But the organization can’t afford to stand up a new storage system for every type of device, or even one storage system per protocol. Not only would a multi-system approach be expensive, it would also make it much more difficult for analysis processes to retrieve data based on certain queries.
A Data Repository
What the organization needs instead is a multi-protocol data repository that can present its storage volumes through the protocols these devices expect. But supporting the protocols alone is not enough. The data still needs to be stored in, ideally, a single volume that is accessible through other protocols for analysis. A single multi-protocol system that stores all data in a single volume and makes that data accessible from any other protocol is an ideal point of access for analysis by applications like Hadoop, Splunk and Spark.
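To make the single-volume idea concrete, here is a minimal sketch in Python. It is purely illustrative: the class names (SingleVolume, FileStyleFrontEnd, ObjectStyleFrontEnd) and the sample path are hypothetical and do not describe any vendor's product or API. The assumption is simply that two protocol front ends map their own addressing style onto one shared namespace.

```python
class SingleVolume:
    """Hypothetical stand-in for the repository's one shared volume."""

    def __init__(self):
        self._data = {}  # normalized key -> payload bytes

    def write(self, key, payload):
        self._data[key] = payload

    def read(self, key):
        return self._data[key]


class FileStyleFrontEnd:
    """Path-based access, the way a device writing to a file share might see it."""

    def __init__(self, volume):
        self.volume = volume

    def save(self, path, payload):
        # Drop leading/trailing slashes so the path becomes a volume key.
        self.volume.write(path.strip("/"), payload)


class ObjectStyleFrontEnd:
    """Bucket/key access, the way an analysis application might see it."""

    def __init__(self, volume):
        self.volume = volume

    def get(self, bucket, key):
        return self.volume.read(f"{bucket}/{key}")


volume = SingleVolume()
sensor_share = FileStyleFrontEnd(volume)
analytics_api = ObjectStyleFrontEnd(volume)

# A sensor gateway writes through the file-style front end...
sensor_share.save("/sensors/line3/readings.csv", b"t,temp\n0,21.5\n")

# ...and an analysis job reads the same bytes through the object-style front end.
payload = analytics_api.get("sensors", "line3/readings.csv")
print(payload == b"t,temp\n0,21.5\n")
```

The point of the sketch is that no copy or export step sits between the writer and the reader: both front ends resolve to the same stored data.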
More Than A Dumping Ground
At the same time, this repository needs to be more than just a data dumping ground. Storing all the data an organization and its IoT devices create is useless if IT can’t find that data in the future. The repository needs to provide intelligent tagging and search so that users can find the data quickly when they need it. For example, as each IoT device sends data to the repository, the repository can tag that data with the name and type of the device. If analysis of that device type becomes necessary later, the analysis application does not need to scan the entire repository to find the right devices. With tagging, the matching data can go directly to the application. In fact, some repositories can create a smart folder of sorts that automatically presents data matching certain tagged criteria.
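As a rough illustration of tagging and smart folders, the Python sketch below keeps an index from tag values to object IDs, so a query touches only the index rather than every stored object. Everything here is an assumption made for the example: the TaggedRepository class, its method names, and the device_name and device_type tags are hypothetical and not taken from any specific repository product.

```python
from collections import defaultdict


class TaggedRepository:
    """Toy in-memory repository that indexes objects by metadata tags."""

    def __init__(self):
        self._objects = {}              # object id -> payload bytes
        self._index = defaultdict(set)  # (tag, value) -> set of object ids

    def put(self, object_id, payload, **tags):
        """Store a payload and record each tag in the index."""
        self._objects[object_id] = payload
        for tag, value in tags.items():
            self._index[(tag, value)].add(object_id)

    def find(self, **criteria):
        """Return sorted ids of objects that match every given tag."""
        if not criteria:
            return sorted(self._objects)
        matches = [self._index.get(item, set()) for item in criteria.items()]
        return sorted(set.intersection(*matches))

    def smart_folder(self, **criteria):
        """Return a callable that re-runs the saved query whenever it is opened."""
        return lambda: self.find(**criteria)


repo = TaggedRepository()
repo.put("r1", b"21.5", device_name="thermo-01", device_type="thermostat")
repo.put("r2", b"frame", device_name="cam-07", device_type="camera")

# Define the smart folder first, then ingest more data.
thermostats = repo.smart_folder(device_type="thermostat")
repo.put("r3", b"19.0", device_name="thermo-02", device_type="thermostat")

print(thermostats())  # ['r1', 'r3']
```

Because the smart folder stores only the criteria, data that arrives after the folder is defined shows up the next time it is opened, which is the behavior the paragraph above describes.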
A centralized data repository is vital to the success of a data monetization project. Making sure all device, database and user data can be centrally stored and later examined enables the organization to work with a complete picture as it plans future products and services or creates entirely new data assets. Vendors have used terms like Data Lake and Data Ocean to describe these data repositories. But IT professionals need to be careful. Not all lakes (or oceans) are created equal. Some can’t provide cross-protocol volume presentation, and others don’t have built-in metadata search capability. Both of these attributes are required to ensure maximum return on the data collection investment.
Sponsored by Caringo