Cloudera Developer Blog · HBase Posts
This is the week of Apache HBase, with HBaseCon 2013 taking place Thursday, followed by WibiData’s KijiCon on Friday. In the many conversations I’ve had with Cloudera customers over the past 18 months, I’ve noticed a trend: Those that run HBase stand out. They tend to represent a group of very sophisticated Hadoop users that are accomplishing impressive things with Big Data. They deploy HBase because they require random, real-time read/write access to the data in Hadoop. Hadoop is a core component of their data management infrastructures, and these users rely on the latest and greatest components of the Hadoop stack to satisfy their mission-critical data needs.
Today I’d like to shine a spotlight on one innovative company that is putting top engineering talent (and HBase) to work, helping to save the planet — literally.
That company is Opower. Opower partners with 80+ utilities providers to offer an integratedcustomer engagement platform using a software-as-a-service (SaaS) model. Its goal: to help people save energy and reduce utilities bills by applying intensive, Big Data analytics to deliver informative dashboards, alerts, incentives, similar household comparisons, and other communications to customers across communication channels and via in-home devices. Opower combines utility data — such as that from smart meters — with weather information, geographic details, demographic data and more, over decades of history, so it can offer valuable insights. (Hint: this is where the value of Hadoop and HBase come in.)
HBaseCon 2013 is this Thursday (June 13 in San Francisco), and we can hardly wait!
To complete the “preview” cycle, today we bring you a snapshot of the Case Studies track, which offers a cross-section of the many real-world use cases for Apache HBase. You will learn about how a range of companies across diverse industries use it at the heart of their IT infrastructure to run their business.
Michael Stack is the chair of the Apache HBase PMC and has been a committer and project “caretaker” since 2007. Stack is a Software Engineer at Cloudera.
Apache Hadoop and HBase have quickly become industry standards for storage and analysis of Big Data in the enterprise, yet as adoption spreads, new challenges and opportunities have emerged. Today, there is a large gap — a chasm, a gorge — between the nice application model your Big Data Application builder designed and the raw, byte-based APIs provided by HBase and Hadoop. Many Big Data players have invested a lot of time and energy in bridging this gap. Cloudera, where I work, is developing the Cloudera Development Kit (CDK). Kiji, an open source framework for building Big Data Applications, is another such thriving option. A lot of thought has gone into its design. More importantly, long experience building Big Data Applications on top of Hadoop and HBase has been baked into how it all works.
Kiji provides a model and set of libraries that help you get up and running quickly.
Kiji provides a model and a set of libraries that allow developers to get up and running quickly. Intuitive Java APIs and Kiji’s rich data model allow developers to build business logic and machine learning algorithms without having to worry about bytes, serialization, schema evolution, and lower-level aspects of the system. The Kiji framework is modularized into separate components to support a wide range of usage and encourage clean separation of functionality. Kiji’s main components include KijiSchema, KijiMR, KijiHive, KijiExpress, KijiREST, and KijiScoring. KijiSchema, for example, helps team members collaborate on long-lived Big Data management projects, and does away with common incompatibility issues, and helps developers build more integrated systems across the board. All of these components are available in a single download called a BentoBox.
As you may know, Apache HBase has a vibrant community and gets a lot of contributions from developers worldwide. The collaborative development effort is so active, in fact, that a new point-release comes out about every six weeks (with the current stable branch being 0.94).
At Cloudera, we’re committed to ensuring that CDH, our open source distribution of Apache Hadoop and related projects (including HBase), ships with the results of this steady progress. Thus, CDH 4.2 was rebased on 0.94.2, as compared to its predecessor CDH 4.1, which was based on 0.92.1. CDH 4.3 has moved one step further and is rebased on 0.94.6.1.
Apart from the rebase, CDH 4.3 also has some important bug fixes backported from later versions of 0.94 and trunk. Following are some of the important features and improvements that now ship in CDH 4.2/CDH 4.3:
Unbelievably, HBaseCon 2013 is only one week away (June 13 in San Francisco)!
Today we bring you a preview of the Ecosystems track, a grand tour (in 20-minute increments) of the fascinating current work being done across the community to extend or build on top of Apache HBase.
As we march toward HBaseCon 2013 (June 13 in San Francisco), it’s time to bring you a preview of the Internals track (see the Operations track preview here) — the track guaranteed to be of most interest to Apache HBase developers and other people tracking the progress of the code base.
This track, hosted by Salesforce.com’s Lars Hofhansl (also an HBase PMC Member and HBaseCon keynote speaker), focuses on the architecture, features, and development of HBase. You will learn about interesting features, best practices for using them in production/business-critical environments, and how development is done by the community.
As you have probably learned by now, HBaseCon 2013 sessions are organized into four tracks: Operations, Internals, Ecosystem, and Case Studies. In combination, they offer a 360-degree view of Apache HBase that is invaluable for experts and aspiring experts alike. In the next few posts leading up to the conference (June 13 in San Francisco – register now while there’s still room), we’ll offer sneak previews of what each track has to offer.
First up is the Operations track, which will be hosted by Facebook’s Liyin Tang (HBase PMC Member and HBaseCon keynote speaker):
The schedule/agenda grid for HBaseCon 2013 (rapidly approaching: June 13 in San Francisco) is a thing of beauty.
If you lacked motivation to register up until this point, we think that this session line-up will convince you otherwise. We repeat: whether you’re an HBase committer or just getting started (or at any level in between), HBaseCon is simply an event that you can’t afford to miss – and with an entry fee of just $350, it’s also one you can easily afford.
HBaseCon 2013 is approaching fast – June 13 in San Francisco. If you’re on the fence about attending – or perhaps your manager is on the fence about approving your participation – here are a few things that you/they need to know (in no particular order):
- HBaseCon is the annual rallying point for the HBase community. If you’ve ever had a desire to learn how to get involved in the community as a contributor, or just want to ask a committer or PMC member why things are done (or not done) a certain way, this is your opportunity – because this is where those people are. Participating in a mailing list thread is never quite the same once you’ve met the people behind it.
- HBaseCon is a one-stop shop for learning about the HBase roadmap, as well as other projects across the ecosystem. Current HBase users should be particularly interested in learning about which JIRAs will have the most impact on the user experience – and once again, most of the committers working on those JIRAs will either be leading sessions or otherwise present. Plus, you can learn about how new complementary projects like Impala, Kiji, Phoenix, and Honeycomb are transforming the use cases for HBase and helping to expand its footprint across the enterprise.
- HBaseCon is a feast of real-world experiences and use cases. Sure, maybe you’ve read about the HBase-backed applications used by companies like Facebook, Salesforce.com, eBay, Pinterest, and Yahoo!. But wouldn’t it be helpful to hear technical details and best practices directly from the people who built and run them? I’ll bet it would. And you really can’t do that anywhere else — in the whole world. (Plus, you can take advantage of formal training right before the conference, at a discount.)
- HBaseCon is a pageant of engineer rock-stars. If your company is an HBase user and hungry for talent, there’s no better place to find it: HBaseCon is literally the world’s biggest gathering of HBase experts under one roof.
- HBaseCon is a heck of a blast. Come for the deep-dives and advice, stay for the after-event party. The libations will be extensive!
If you have any interest in HBase whatsoever, whether as a user or prospective user, missing HBaseCon is almost unthinkable.