Bala Venkatrao is the director of product management at Cloudera.
I had the pleasure of attending the Enzee Universe 2011 User Conference this week (June 20-22) in Boston. The conference was very well organized and drew well over 1,000 attendees, many of whom lead the Data Warehouse/Data Management functions for their companies. This was Netezza’s largest conference so far in seven years. Netezza is known for enterprise data warehousing, and in fact, they pioneered the concept of the data warehouse appliance. Netezza is a success story: since its founding in 2000, Netezza has seen steady growth in customers and revenues, and last year (2010), IBM acquired Netezza for a whopping $1.7B.
Cloudera announced a partnership with Netezza last year, and since then the two companies have been working closely to build a high-speed bi-directional connector between Netezza DW appliances and Apache Hadoop. We launched the general availability of this connector this week at Enzee Universe 2011. You can download the connector here: https://ccp.cloudera.com/display/SUPPORT/Downloads. Thank you to the teams at Netezza and Cloudera for making this happen. It’s been a great collaboration!
Before heading to Netezza’s conference, I had a lingering question in my mind: As we enter the era of the petabyte, do EDWs and Hadoop co-exist or compete? In this post I’ll share with you the conclusions at which I arrived by the end of the event.
The event kicked off with opening remarks by Jim Baum, CEO of Netezza. The talk mostly focused on how Netezza continues to thrive as a division of IBM and how the synergies are good for customers. Jim also invited Arvind Krishna, GM of IBM’s Information Management Software Division, who emphasized that one size does not fit all. This was a recurring theme throughout the conference. He also talked about how EDWs and NoSQL technologies like Hadoop are each best applied to the problems they are most suited to solve.
My personal highlight of Jim’s keynote was when Cloudera was invited on stage as one of the “Analytic Innovators” who are helping customers realize more value from their Netezza investments. We were among the 40 innovators selected for this honor from a pool of 300+ partners.
After the keynote, we spent the rest of the evening at the Enzee Galaxy Partner Pavilion, where we had an opportunity to speak to several end users about Cloudera and the power of Hadoop. It was interesting to see how the conversations ranged from “What is Hadoop?” to “I know I need Hadoop. I just need to identify the problem!” The audience could not have been more relevant. Most of the folks were responsible for driving or making decisions around data architecture / data management and had significant investments in EDW technologies, especially Netezza. Many of them clearly saw Hadoop and Cloudera as highly complementary to Netezza’s technologies and requested introductions to our sales folks to explore joint opportunities.
Day 2 kicked off with a keynote by Steve Mills, IBM’s SVP and Group Executive. Steve talked about how 80% of the world’s data is unstructured and about the emergence of NoSQL technologies like Hadoop. An interesting insight from Steve was how he would like IBM to provide more choices to customers. So while integration remains important for IBM, equally important are modularity and the ability to interoperate across different technologies from different vendors. He also talked about the Smarter Planet theme and how Big Data and analytics will make it a reality. Several projects are underway at IBM to bring this concept (Smarter Cities, Smarter Water and Smarter Energy) to life, and all of them depend on harnessing and extracting insights from Big Data.
Throughout the day, there were several tracks focused on Big Data and analytics. Cloudera’s work on the Cloudera-Netezza Connector was mentioned on several occasions. One of the most interesting talks, titled “Hadoop and IBM Netezza: Co-existence or Competition,” was given by Krishna Parasuraman, Chief Architect and CTO of Digital Media at Netezza. Krishna did an excellent job introducing Hadoop to the audience and, perhaps more importantly, comparing and contrasting it with EDW technologies like Netezza. Krishna clearly felt that these technologies are complementary and illustrated several use cases for how customers could leverage Netezza, Hadoop and Cloudera to derive even more insights than was possible before. Clearly the big takeaway was that Hadoop and EDWs should co-exist, not compete.
IBM Netezza had worked with Cloudera to put together a compelling demo to highlight the value of our combined solution of CDH/Hadoop and Netezza. Through an interesting use case, the demo showed how businesses could keep their hot data (the most recent data) in Netezza and their warm data (data covering a longer time range) in HDFS, while leveraging the Cloudera Connector for Netezza and Oozie (the workflow engine that is part of CDH) to provide deeper insights to business executives. The demo was well received at the conference and resulted in significant booth traffic at the IBM Netezza Analytics demo stations.
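To make the hot/warm split concrete, here is a minimal Python sketch of the routing decision behind such a setup. It is only an illustration, not the demo's actual logic: the 30-day cutoff, the tier labels, and the function names are all assumptions of mine, and the sketch does not call the Cloudera Connector for Netezza or Oozie at all.

```python
from datetime import date, timedelta

# Hypothetical cutoff: the demo's actual definition of "most recent" is not stated.
HOT_WINDOW_DAYS = 30


def storage_tier(record_date, today, hot_window_days=HOT_WINDOW_DAYS):
    """Return 'netezza' for hot (recent) data and 'hdfs' for warm (older) data."""
    age = today - record_date
    return "netezza" if age <= timedelta(days=hot_window_days) else "hdfs"


def partition_by_tier(record_dates, today):
    """Group record dates by the tier they would be stored in."""
    tiers = {"netezza": [], "hdfs": []}
    for record_date in record_dates:
        tiers[storage_tier(record_date, today)].append(record_date)
    return tiers
```

In a real deployment, a scheduled workflow would presumably move data between the two systems as it ages past the cutoff; the sketch only shows the classification step.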
The day concluded with an event at the House of Blues Boston with live music and dinner. The Boston weather was at its very best!
An excellent talk by Krishna on Day 2 about demystifying Hadoop drove more traffic to our booth the next day, and we got to meet some really neat prospects.
Highlights of the morning included a presentation on the Netezza Roadmap, which showed Netezza embracing the logical data warehouse vision promoted by the likes of Donald Feinberg at Gartner. This was followed by a presentation by IDC Analyst Dan Vesset. Dan defined Big Data as having the following characteristics:
- Volume: TBs to PBs
- Variety: Multi-structured
- Velocity: Speed of capture, analysis and access
- Value: ROI
Dan emphasized at least five times in his presentation that one size does not fit all and that technologies like Hadoop will solve a different class of analytics problems that could not be solved before. One example he cited was how companies could use sentiment analysis of Twitter feeds to understand customer behavior.
After spending three days at Enzee Universe, I came to these conclusions:
- One size does not fit all: EDWs and Hadoop will co-exist in enterprise environments
- The combination of analytics enabled through EDWs (and appliances) and those enabled through Hadoop lets customers gather richer and newer insights that were previously not possible to gain
- There was strong interest in Hadoop among attendees, but many are still looking for tangible use cases that will drive adoption
Overall, it was a great conference by Netezza. Kudos to the team! I’m looking forward to continuing to work with Netezza to realize the synergies between EDWs and Cloudera’s Distribution Including Apache Hadoop (CDH).