This blog post was published on Hortonworks.com before the merger with Cloudera. Some links, resources, or references may no longer be accurate.
When I ask about the benefits of optimizing an Enterprise Data Warehouse (EDW) with Apache Hadoop, in my experience 9 out of 10 responses have to do with either data archiving or preserving high-performance EDW processing capacity. Ultimately the conversation leads to cost-saving measures: lower storage, computational, and software licensing costs, the “tangible” benefits that are near and dear to IT organizations. However, many people overlook the “intangible” impact of EDW optimization, which I explore in this article: the cultural change that transforms how businesses operate today.
A traditional EDW is mainly used for generating reports and answering predefined queries, where workloads and service-level requirements are static. As a result, schemas are typically predetermined and purpose-built. If a certain type of data doesn’t fit the schema, such as unstructured or semi-structured data, users can’t effectively store it or use it for analysis in the EDW. This inflexibility results in rigid data models that can’t support ad hoc or exploratory analysis. Constrained in this way, users cannot freely explore their data and ask questions of it, so they miss the timely insights that businesses need to stay competitive today.
Supplementing the EDW with Hadoop breaks the shackles of the predetermined schema by introducing schema on read. This alternative approach lets users store data first and organize it later, defining the schema at the time of access. It reinforces Hadoop’s core role as a landing zone that accommodates raw data from any source, whether structured, unstructured, or semi-structured, and gives users flexibility in how they consume and reuse that data. Moreover, with data enrichment in Hadoop, new types of data can be processed, transformed, and fed into the EDW to unlock new analytic value. Hadoop thus supports a data architecture that can ingest, store, and refine virtually any type of data for analysis in Hadoop, the EDW, or any other analytical system.
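To make schema on read concrete, here is a minimal Python sketch, not tied to any Hadoop API. The sample records, the `read_with_schema` helper, and both schemas are hypothetical and exist only for illustration. Raw records are stored untouched, and each consumer applies its own schema when it reads them.

```python
import json

# Raw events land as-is: no schema is enforced at write time.
# (Hypothetical sample data for illustration only.)
RAW_EVENTS = [
    '{"meter_id": "m-1", "kwh": "1.5", "ts": "2017-01-01T00:00:00"}',
    '{"meter_id": "m-2", "kwh": "0.7", "ts": "2017-01-01T00:30:00", "note": "estimated"}',
    '{"customer": "c-9", "complaint": "late bill"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time.

    Keeps only records that contain every field named in the schema,
    and casts each kept field with its converter. Extra fields in the
    raw record are ignored rather than rejected.
    """
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        if all(field in record for field in schema):
            rows.append({field: cast(record[field]) for field, cast in schema.items()})
    return rows


# Two consumers read the same raw data through different schemas.
usage_schema = {"meter_id": str, "kwh": float}
complaint_schema = {"customer": str, "complaint": str}

usage_rows = read_with_schema(RAW_EVENTS, usage_schema)
complaint_rows = read_with_schema(RAW_EVENTS, complaint_schema)
```

Because the raw records are never reshaped on the way in, a new consumer can add its own schema later without reloading the data.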
In a business context, building the EDW on Hadoop provides tremendous benefits to organizations, significant enough to cause a cultural shift in how businesses operate. Multiple lines of business (LOB) can access the data landing zone for their own copy of the data, eliminating the grueling work of data modeling in response to different LOB requests under a predetermined, one-size-fits-all approach to serving up data. Rather than extracting insight “reactively” by preparing data for a specific task or objective, companies can now work “proactively,” gaining timely, comprehensive, and reliable insights through unrestricted data exploration, ad hoc queries, and analytics. This change is well exemplified at Centrica, a leading energy and services company and home of brands such as British Gas, Direct Energy and Bord Gáis Energy, which implemented Hortonworks Data Platform (HDP) for data-at-rest and Hortonworks DataFlow (HDF) for data-in-motion to power its data analytics and simplify its IT estate.
Beyond the “tangible” benefit of saving millions of dollars annually on its EDW, Centrica has renovated its old IT systems to energize a new data-driven business model that is transforming the following lines of business:
Business Operations & Finance – By aggregating and analyzing the data, the team has been able to provide accurate smart energy reports. As a result, customers have gained a better understanding of their energy usage by looking at consumption peaks and time of day, with an accurate picture of how their money is spent. The energy industry has always relied on estimates to operate, so smart metering based on data analysis has reshaped how Centrica monitors energy usage and issues bills. Data can be collected, sorted, and analyzed every 30 minutes for the most reliable and accurate reporting.
Customer Service – The data lake has enabled British Gas to draw correlations across millions of customer records, including complaints, enquiries, billing, equipment installed, and even the number of times the equipment was flagged as faulty. It has also reshaped the way customer service is handled across millions of homes throughout the country by building a profile for every single customer. The Centrica team has been able to better monitor customer satisfaction and the way complaints are handled. By using Hortonworks’ analytics capability, the team has been able to identify and rectify issues before customers were aware of them.
Field Engineering – Centrica has created additional applications based on data analytics to improve the way engineers carry out their day-to-day work. For example, one of the apps allows engineers to interact with the database and flag potential inaccuracies or issues, such as by capturing feedback on wrong equipment. This has completely changed the way British Gas engineers interact with customers, while enabling a two-way communication stream that tracks general satisfaction and aids data accuracy. Advancements have also been achieved in health and safety, insurance, and quality control.
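The 30-minute reporting cadence mentioned under Business Operations & Finance can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Centrica’s pipeline: the reading format, the function names, and the sample values are all hypothetical. Each reading is assigned to the half-hour window it falls in, and consumption is summed per meter per window.

```python
from collections import defaultdict
from datetime import datetime


def half_hour_bucket(ts):
    """Floor a datetime to the start of its 30-minute window."""
    return ts.replace(minute=(ts.minute // 30) * 30, second=0, microsecond=0)


def usage_per_half_hour(readings):
    """Sum kWh per meter per 30-minute window.

    readings: iterable of (meter_id, ISO-8601 timestamp string, kwh).
    Returns a dict keyed by (meter_id, window_start).
    """
    totals = defaultdict(float)
    for meter_id, ts_text, kwh in readings:
        bucket = half_hour_bucket(datetime.fromisoformat(ts_text))
        totals[(meter_id, bucket)] += kwh
    return dict(totals)


# Hypothetical sample readings for illustration only.
readings = [
    ("m-1", "2017-01-01T00:05:00", 0.25),
    ("m-1", "2017-01-01T00:20:00", 0.5),
    ("m-1", "2017-01-01T00:40:00", 0.75),
    ("m-2", "2017-01-01T00:10:00", 1.0),
]
totals = usage_per_half_hour(readings)
```

Billing on measured totals per window, rather than on an estimate for the whole period, is what allows the consumption peaks and time-of-day patterns described above to be reported.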
As the line between technology and business blurs, the new generation of business advancements comes from a mastery of data. In today’s world where data and information are at the core of enterprises, learn more about how Hortonworks solutions can help transform your business:
- Solution Brief: Hortonworks Enterprise Data Warehouse Optimization
- Complete Case Study: Centrica
- White Paper: Hortonworks Data Platform (HDP)
- Solution Sheet: Hortonworks DataFlow (HDF)