With HBaseCon 2013 (Early Bird registration now open!) preparations in full swing, you may be interested in learning a bit about the personalities behind the Program Committee, who are tasked with formulating a compelling, community-focused agenda.
Recently I had a chance to ask committee members Gary Helmling (Twitter), Lars Hofhansl (Salesforce.com), Jon Hsieh (Cloudera), Doug Meil (Explorys), Andrew Purtell (Intel), Enis Söztutar (Hortonworks), Michael Stack (Cloudera), and Liyin Tang (Facebook) a few questions:
How did you get involved in the HBase community?
Hsieh: While working at Cloudera and being involved in the Hadoop ecosystem on other projects in 2009, I met a bunch of the folks in the Apache HBase community. When the opportunity came up to work on HBase and help take it to the next level, I jumped at it!
Meil: I also got involved with HBase in 2009. Explorys was looking for a back-end datastore that would scale for our aggressive data storage and processing needs cost-effectively. By integrating into the Hadoop stack, we could leverage HDFS as well as MapReduce.
Söztutar: I was working on a social aggregator, also in 2009, and having been involved in the Hadoop community made it a no-brainer to go with HBase as the data storage layer. I was contributing to it by 2011.
Purtell: At a former employer in 2008, we were looking at an exponential trend in the volume of data we would need to process daily for both production and research tasks. For the latter especially, flexibility in schema management was a big plus. Excellent Hadoop integration meant we could set up a full analytics pipeline. My involvement increased over time as we came to depend on HBase more. The great thing about basing your infrastructure on an open source foundation is that you can both grow by working together.
Stack: I was working at the Internet Archive on their crawler (Heritrix) and on search using early versions of Hadoop when the Google Bigtable paper came out (in 2006). I soon learned that Powerset, a San Francisco natural-language search engine startup, wanted to build an open source Bigtable clone. I went to work for them to help get what eventually became HBase off the ground.
Helmling: In 2009, I was working on a project at Meetup and looking for a storage system that would provide transparent scalability with a flexible schema. While doing some prototyping, I came across HBase, which seemed to fit the bill nicely. I contributed a couple bug fixes back to the HBase community, and I’ve been working on HBase ever since.
Hofhansl: In 2010 it became clear that Salesforce could not grow the amount of data we can store on our traditional, relational storage. So we set out to find alternative, scalable stores. After vetting most existing NoSQL/KeyValue stores we settled on HBase, and it is being rolled out to production today.
What unique perspectives do you think you bring to the PC?
Hsieh: As I mentioned previously, I’ve worked on several other early Hadoop-related projects. Furthermore, since I’m at Cloudera, I get to see a wide variety of customers’ HBase deployments and apps in production, and encounter a wide variety of challenges from supporting them.
Söztutar: Like Jon, I have been involved in Hadoop and other ecosystem projects for a long time, and am a part of a larger group working on HBase and Hadoop development full time.
Tang: I have worked on HBase development for more than two years at Facebook, and as you might expect, have some experience with building a reliable service on top of HBase.
Meil: Similar to Liyin, I have real-life experience with HBase and I think like a user (smile). I’ve done a lot of work on technical documentation for HBase and on the dist-lists, and I understand the questions that people have when getting started with HBase.
Helmling: I think I still retain some end user perspective on how changes we make in HBase can be applied in applications, and I have a strong interest in real world usage. I have also been around long enough to have some perspective on how HBase has evolved to where it is.
Purtell: Some of my background has been in the enterprise. I have brought in some concerns from that space and thought about what might make sense for HBase: replication, security, RESTful integration. Some of my background has also been in research, so I see the value in HBase being a flexible platform (via coprocessors) for solving challenges creatively.
Hofhansl: I have been involved in the database (relational, object oriented) and non-database (industrial software such as oil-refinery automation) world for over 24 years. So I think I also bring a broad perspective to this project.
Why do you think HBaseCon is a positive thing for the community?
Hofhansl: HBaseCon brings folks together, and gives them a chance to learn what is out there and to collaborate and make connections.
Hsieh: As Lars said, there are folks from around the world working on this project, so it’s a great opportunity to meet them and our users in person. It’s also an opportunity for folks to tell their “war stories” and share their hard-earned wisdom with others.
Helmling: It’s great to bring together so many parts of the community under one roof, from people successfully running applications, to those just starting to look at HBase, to those creating new features in HBase itself. There is always a tremendous exchange of information, both in the sessions and in the hallways.
Purtell: I echo what the others have said. HBaseCon brings developers and users of HBase together in a large-scale way that isn’t otherwise possible. I’ve been around HBase for a long time, but I was still surprised by the interesting and unexpected details of many talks at HBaseCon 2012.
Söztutar: As others point out, meeting people face-to-face and discussing how they use HBase in their organization is really valuable.
What were some of your favorite things about HBaseCon 2012?
Meil: I was at the first Hadoop World in 2009 and there was exactly one HBase presentation that day. Last year it was great to see how interest had grown to support an entire conference! There are so many different use cases now.
Stack: In 2012 there was a really good vibe that came of having all the HBase brothers and sisters together under the one roof, as well as the girth, range, and quality of the talks. There was everything from Karthik on the nosebleed scale at Facebook to a personal favorite: OCLC moving WorldCat, a 40-year-old library services project that has 25k libraries from 170 countries participating, to HBase.
Hofhansl: For me, HBaseCon was great for meeting people I had previously only known via email. I also learned a lot about what other people do with HBase!
Helmling: I really enjoyed talking to people building applications on HBase about the ways they are applying it, as well as talking to other committers about the problems they are trying to solve. Even as someone who actively follows the community discussions, I came away with a new understanding of where HBase is going. There is a depth to the face-to-face discussions that you just can’t get on the mailing lists.
Tang: I agree; I particularly enjoyed the talk about HBase schema last year, and I also had a chance to share some ideas, lessons, and experience of my own.
Hsieh: The HBase community is a technical bunch and a great example of a good open source community. I really like the technical dev-centric and ops-centric focus of the talks, war stories, and conversations. There were relatively few pure market-ecture or sales pitches, and I hope it stays that way.
Why should people attend HBaseCon 2013?
Tang: For users, HBaseCon is a very good opportunity to learn about use cases of HBase and also get familiar with HBase internals and features. For HBase developers, it is a good time to share insights into feature development and future roadmaps.
Hofhansl: Yes, to be even more emphatic about it, HBaseCon is the best opportunity you will have to get to know the committers and other users, and learn about what is possible with HBase.
Hsieh: Furthermore, the number and variety of contributors and users of HBase has grown significantly in the past year. There are new stories, new applications, and a whole set of new systems being built on top of HBase. Lots of what Stack calls “good stuff.”
Purtell: My view is that if you are curious about HBase, come and see what it is all about. If you are a user of HBase, come to learn from the experience of others and perhaps share your own. If you are an IT professional, entrepreneur, researcher, or student, come see what HBase can do for you today and what it will be capable of in the near future.
Stack: If HBaseCon 2013 is even a quarter as good as the 2012 version, it will be a great day.
Helmling: What Stack said. This will be the biggest and best HBaseCon yet!
Early Bird registration closes on April 23, so take advantage of discounted pricing now!