HDS just announced it has completed its acquisition of analytics platform maker Pentaho, a move that has been in the works for several months. As expected from an established player in the storage space, HDS has been focused on big data analytics, leveraging the breadth of expertise, access and resources of parent Hitachi Ltd. By combining the Pentaho software platform with its hardware and corporate infrastructure, HDS hopes to make complex analytics appealing to a wider range of companies.
According to HDS, Pentaho will remain intact, with the same executive team and engineering organization, and will operate as a separate company within the Hitachi organization, reporting to Kevin Eggleston, SVP of Social Innovation. Social Innovation was the theme of the last two HDS analyst events, including HDS Connect in Las Vegas in April.
Who is Pentaho?
Big data analytics involves combining and comparing disparate data sets to create new insights for business, research and other uses. Data science figures out what those components should be and how they should be compared; Pentaho provides the platform that makes the analysis work. It simplifies the preparation and blending of data, with tools to visualize, explore, report on and predict the results. According to the company, it helps “translate data into value”. Pentaho’s solution includes data integration as well as analysis and visualization, with tools to access, transform and manage the data sources that feed business and big data analytics.
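To make “blending” disparate data sets concrete, here is a minimal, purely illustrative sketch in plain Python. This is not Pentaho’s API; the data sets and field names are invented for the example. It shows the basic idea: two sources that describe the same entities (here, sales records and web-analytics counts) are joined on a shared key so the combined rows can feed reporting or prediction downstream.

```python
# Illustrative sketch only (not Pentaho's actual tooling): "blending"
# two disparate data sets by joining them on a shared customer key.
# All names and figures here are hypothetical.

sales = [
    {"customer": "acme", "revenue": 1200},
    {"customer": "globex", "revenue": 800},
]
web_visits = [
    {"customer": "acme", "visits": 34},
    {"customer": "globex", "visits": 5},
]

# Index one source by the join key for fast lookup.
visits_by_customer = {row["customer"]: row["visits"] for row in web_visits}

# The blended set pairs each sale with its web activity; a customer
# with no web data simply gets a count of zero.
blended = [
    {**row, "visits": visits_by_customer.get(row["customer"], 0)}
    for row in sales
]

for row in blended:
    print(row["customer"], row["revenue"], row["visits"])
```

In a real platform this join is one step in a larger pipeline that also handles cleansing, type conversion and scheduling, which is the part a tool like Pentaho's data integration layer is meant to take off the analyst's hands.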
The Big Data Science Project
For many companies a big data initiative means a science project. Most of these projects involve unstructured data, often enormous amounts of it. Beyond simply storing that much data, part of the challenge is filtering and processing the component data sets so they can be used in the analysis. Then there are the data-handling realities of working the results back into the IT infrastructure.
Setting up Hadoop or another analytics platform is different enough from typical IT systems that these infrastructures are often run separately, with their own file systems and tooling. This reduces efficiency and increases cost, and it also makes it harder to use the results of those advanced analytics, because they’re typically stored in physically separate infrastructures that aren’t integrated with other systems. This is one of the problems HDS is trying to solve with its acquisition of Pentaho: integrating complex analytics systems into mainstream IT environments.
A Combined Solution
Pentaho provides the platform on which to run analytics projects, and HDS supplies the hardware stack, plus the support systems, the channel, the resources and anything else needed to keep a smaller company or a Fortune 100 enterprise up and running. The Pentaho platform will be integrated into HDS’s advanced analytics foundation software, but will still be available independently as well.
One of the things Pentaho provides is what it calls ‘blueprints’: reference architectures that help customers learn how to deploy an analytics solution and keep it from becoming a science project. These aren’t just configuration guides, but best-practice solutions the company has developed from real-world use cases and is applying in other use cases and industries. This ‘cross-pollination’, the reuse of successful methods or processes, is similar to the concept driving Hitachi’s Social Innovation, in which one division of Hitachi Ltd leverages its success for another.
Big data projects do often turn into science projects, as IT organizations struggle to operate systems that seem more at home in a university graduate lab than an enterprise data center. One part of the challenge is unstructured data storage and handling; the other is setting up and running the analytics platform itself, whether that’s Hadoop, OpenStack, an in-memory database application, NoSQL or something else.
HDS has been addressing the unstructured data part of the problem for the past several years. Now, with Pentaho’s technology, it can offer a complete solution designed to handle the challenges of big data analytics, from hardware to software. This doesn’t mean Hadoop is now a plug-and-play application that can be run by a junior admin. It’s still a complex project, but one that HDS hopes to move out of the back room and into the data center.