Last week, several Cloudera employees attended the Bay Area HBase User Group #9, kindly hosted by Mozilla at their headquarters in Mountain View. About 80 people attended, and it was a great opportunity to get together with the whole HBase community. I chatted with some community members who have been running HBase in their organizations for quite some time, and with several others who are just beginning to investigate it for new and exciting projects within their businesses.
The user group organizers were kind enough to invite me to present, and I took the opportunity to discuss the integration between HBase and HDFS (the Hadoop Distributed File System). HBase uses HDFS for all of its underlying storage, so understanding the performance and reliability characteristics of HDFS is key to a deep understanding of HBase.
In my presentation, I talked about some of the original, batch-oriented design goals of HDFS, and enumerated some of the exciting new developments currently under way that will make it better suited to online use cases like HBase. I also announced that Cloudera will include these important patches in CDH3 once development and testing are finished; we’re committed to being the very best Hadoop and HBase distribution out there.
Check out the slides above, and if you’re considering using HBase in your business, please feel free to get in touch with me at firstname.lastname@example.org. And if you’re in the area, be sure to register now for HBase User Group #10.