Three keys to a good data monetization system are collecting the right data, identifying and executing monetization against that data, and storing the data in a cost-effective way. A data monetization system obviously needs data, so the first step is identifying the right data to monetize and storing it. The second step, and perhaps the most challenging, is identifying how to monetize that data. Finally, for the system to actually turn a profit, the cost of storing this data must be kept to a minimum.
The difficulty with cost management is that data monetization systems require a significant amount of data. Big data – as the term implies – requires big storage, and big storage typically costs a lot to buy and a lot to operate. Let’s look at each of these aspects.
In previous posts, we’ve discussed why an object storage system is the most appropriate system for long-term storage of data – especially if you plan to monetize the data. An object storage system is much less expensive to purchase than the alternatives, especially once you factor in the incremental backup cost that comes with purchasing traditional storage. Object storage can protect itself both on-site and off-site, without requiring any additions to your backup system.
Ongoing costs for a typical disk-based storage system include power and cooling, operational management, and backup and recovery. In our entry “Protecting High File Count Systems” we discuss how an object storage system does not need to be backed up with a traditional backup and recovery system, so an object storage system will be less expensive in that regard. Such systems also tend to be self-managing and do not require administrators to manually grow and shrink the volumes that store data. Administrators typically just add nodes to a cluster, and the system starts using them. As a result, the ongoing operational costs of an object storage system should be lower than those of a traditional storage system.
The real ongoing cost for a disk-based storage system is power and cooling. Assuming both systems use compression and deduplication, an object storage system and a traditional storage system would seem to use the same amount of power because they need the same amount of disk. However, some object storage systems support powering down disks or nodes that hold data that is not currently being accessed. A tape system also does not power data stored on its media, but it would be much harder to monetize data stored on tape, because tape does not support random access.
Powering down disks that are not in use brings two significant benefits, and both relate to cost. The first is relatively obvious: disks that are powered down don’t incur any power and cooling costs. Over time, powering down disks that store unused data can yield significant power and cooling savings.
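To make the first benefit concrete, here is a rough back-of-the-envelope sketch of the power and cooling arithmetic. Every figure in it (disk count, watts per disk, cooling overhead, electricity price, and the fraction of disks kept spinning) is a hypothetical placeholder rather than a measurement from any real system, and the function name is ours.

```python
HOURS_PER_YEAR = 24 * 365


def annual_power_cost(disk_count, watts_per_disk, cooling_overhead,
                      price_per_kwh, powered_fraction=1.0):
    """Estimate the yearly power-plus-cooling cost for a set of disks.

    cooling_overhead is the extra cooling energy per unit of disk energy
    (0.5 means 50% on top). powered_fraction is the share of disks that
    are spinning on average; powered-down disks are assumed to draw nothing.
    """
    if not 0.0 <= powered_fraction <= 1.0:
        raise ValueError("powered_fraction must be between 0 and 1")
    disk_kw = disk_count * watts_per_disk * powered_fraction / 1000.0
    total_kw = disk_kw * (1.0 + cooling_overhead)
    return total_kw * HOURS_PER_YEAR * price_per_kwh


# Hypothetical inputs: 1,000 disks at 8 W each, 50% cooling overhead,
# $0.12 per kWh, and only 30% of disks powered in the second case.
always_on = annual_power_cost(1000, 8.0, 0.5, 0.12)
mostly_idle = annual_power_cost(1000, 8.0, 0.5, 0.12, powered_fraction=0.3)

print(f"All disks spinning:  ${always_on:,.2f} per year")
print(f"30% of disks powered: ${mostly_idle:,.2f} per year")
print(f"Estimated savings:    ${always_on - mostly_idle:,.2f} per year")
```

In this simple model the savings scale linearly with the share of disks that can stay powered down, so the more of your data that sits unused, the bigger the benefit.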
The less obvious benefit of powering down unused disks comes from the coercivity formula (KuV/kT), which governs how much bits on magnetic disk will degrade over long periods of time (unofficially referred to as bit rot). In this ratio, Ku is the magnetic anisotropy of the media, V is the volume of a magnetic grain, k is Boltzmann’s constant, and T is the temperature. The lower the ratio, the faster the media degrades, so the higher the temperature and the longer the time spent at that temperature, the more degradation you get. A powered-down disk runs cooler than a spinning one, which means that turning off magnetic media increases the life of the data stored on it. Object storage systems also have other mechanisms for detecting and repairing bit rot, but powering down the disks decreases the amount of bit rot that must be dealt with.
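As a rough illustration of how sensitive this is to temperature, the sketch below reads KuV/kT as a thermal stability ratio, with k as Boltzmann's constant and T as absolute temperature, and plugs it into the Néel-Arrhenius relation, in which the expected time before a magnetic grain flips on its own grows as exp(KuV/kT). That model, the assumption that KuV stays fixed as temperature changes, and the example numbers (a ratio of 60 at 300 K) are our simplifications for illustration, not figures from any drive vendor.

```python
import math


def stability_ratio_at(ratio_ref, temp_ref_k, temp_k):
    """Rescale KuV/kT from a reference temperature to another temperature.

    Assumes KuV itself does not change with temperature (a simplification),
    so the ratio is inversely proportional to T. Temperatures are in kelvin.
    """
    if temp_ref_k <= 0 or temp_k <= 0:
        raise ValueError("temperatures must be positive kelvin values")
    return ratio_ref * temp_ref_k / temp_k


def relative_retention(ratio_ref, temp_ref_k, temp_k):
    """Return tau(temp_k) / tau(temp_ref_k), assuming tau ~ exp(KuV/kT)."""
    new_ratio = stability_ratio_at(ratio_ref, temp_ref_k, temp_k)
    return math.exp(new_ratio - ratio_ref)


# Hypothetical example: ratio of 60 at 300 K (powered down, room temperature)
# compared with the same media running at 330 K inside a busy enclosure.
hot_vs_cool = relative_retention(60.0, 300.0, 330.0)
print(f"Stability ratio at 330 K: {stability_ratio_at(60.0, 300.0, 330.0):.2f}")
print(f"Relative retention time at 330 K vs 300 K: {hot_vs_cool:.4f}")
```

Because the ratio sits inside an exponential, even a modest drop in temperature lengthens the modeled retention time considerably, which is why keeping idle media cool matters.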
Object storage systems appear to be the perfect place to store data for a data monetization system. They are less expensive to acquire and less expensive to operate than traditional storage systems. In addition, some object storage systems have power-down features that both save power and cooling costs and increase the life of the data and media. Many of the same things can be said about a tape-based system, but it would be more difficult to monetize data stored on tape.
Sponsored by Caringo