Cloudera Blog · HBase Posts
The schedule/agenda grid for HBaseCon 2013 (rapidly approaching: June 13 in San Francisco) is a thing of beauty.
If you've lacked the motivation to register up to this point, we think this session line-up will change your mind. We repeat: whether you're an HBase committer or just getting started (or at any level in between), HBaseCon is simply an event that you can't afford to miss – and with an entry fee of just $350, it's also one you can easily afford.
HBaseCon 2013 is approaching fast – June 13 in San Francisco. If you’re on the fence about attending – or perhaps your manager is on the fence about approving your participation – here are a few things that you/they need to know (in no particular order):
- HBaseCon is the annual rallying point for the HBase community. If you’ve ever had a desire to learn how to get involved in the community as a contributor, or just want to ask a committer or PMC member why things are done (or not done) a certain way, this is your opportunity – because this is where those people are. Participating in a mailing list thread is never quite the same once you’ve met the people behind it.
- HBaseCon is a one-stop shop for learning about the HBase roadmap, as well as other projects across the ecosystem. Current HBase users should be particularly interested in learning about which JIRAs will have the most impact on the user experience – and once again, most of the committers working on those JIRAs will either be leading sessions or otherwise present. Plus, you can learn about how new complementary projects like Impala, Kiji, Phoenix, and Honeycomb are transforming the use cases for HBase and helping to expand its footprint across the enterprise.
- HBaseCon is a feast of real-world experiences and use cases. Sure, maybe you’ve read about the HBase-backed applications used by companies like Facebook, Salesforce.com, eBay, Pinterest, and Yahoo!. But wouldn’t it be helpful to hear technical details and best practices directly from the people who built and run them? I’ll bet it would. And you really can’t do that anywhere else in the world. (Plus, you can take advantage of formal training right before the conference, at a discount.)
- HBaseCon is a pageant of engineer rock-stars. If your company is an HBase user and hungry for talent, there’s no better place to find it: HBaseCon is literally the world’s biggest gathering of HBase experts under one roof.
- HBaseCon is a heck of a blast. Come for the deep-dives and advice, stay for the after-event party. The libations will be extensive!
If you have any interest in HBase whatsoever, whether as a user or prospective user, missing HBaseCon is almost unthinkable.
The post below was originally published at blogs.apache.org/hbase. We re-publish it here for your convenience.
Apache HBase is a distributed big data store modeled after Google’s Bigtable paper. As with all distributed systems, knowing what’s happening at a given time can help you spot problems before they arise, debug ongoing issues, evaluate new usage patterns, and gain insight for capacity planning.
Since October 2008 (version 0.19.0, HBASE-625), HBase has used Apache Hadoop’s metrics system to export metrics to JMX, Ganglia, and other metrics sinks. As the code base grew, more and more metrics were added by different developers. New features got metrics. When users needed more data on issues, they added more metrics. These new metrics were not always consistently named, and some were not well documented.
This post was originally published via blogs.apache.org; we republish it here in a slightly modified form for your convenience:
At first glance, the Apache HBase architecture appears to follow a master/slave model where the master receives all the requests but the real work is done by the slaves. This is not actually the case, and in this article I will describe what tasks are in fact handled by the master and the slaves.
Regions and Region Servers
HBase is the Hadoop storage manager that provides low-latency random reads and writes on top of HDFS, and it can handle petabytes of data. One of the interesting capabilities in HBase is auto-sharding, which simply means that tables are dynamically distributed by the system when they become too large.
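To make the auto-sharding idea concrete, here is a small toy sketch (not actual HBase code – the class and method names are hypothetical) of how a table can be carved into regions, each owning a contiguous, sorted range of row keys, and how a region that grows too large is split at a midpoint key:

```python
import bisect

class ShardedTable:
    """Toy model of HBase-style auto-sharding (illustrative only).

    Split keys mark where one region ends and the next begins, so
    n split keys define n + 1 regions covering the whole key space.
    """

    def __init__(self, split_keys):
        # e.g. ["g", "p"] -> region 0: [-inf, "g"),
        #                    region 1: ["g", "p"),
        #                    region 2: ["p", +inf)
        self.split_keys = sorted(split_keys)

    def region_for(self, row_key):
        # bisect_right counts how many split points the key has
        # passed, which is exactly the index of its region.
        return bisect.bisect_right(self.split_keys, row_key)

    def split(self, region_index, mid_key):
        # When a region gets too large, the system splits it in two
        # at a midpoint key -- modeled here as adding a split key.
        self.split_keys.insert(region_index, mid_key)

table = ShardedTable(["g", "p"])
print(table.region_for("apple"))  # region 0
print(table.region_for("kiwi"))   # region 1
table.split(1, "k")               # region 1 grew too big; split it
print(table.region_for("kiwi"))   # now region 2
```

In real HBase the splits happen automatically as regions exceed a size threshold, and region servers (not the client) own the resulting regions; the sketch only illustrates the key-range bookkeeping.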
HBaseCon (hosted by Cloudera), now in its second year, is THE community event for Apache HBase contributors, developers, admins, and users. There is no better place to dive head-first into use cases, best practices, internals, and futures as well as to meet the rest of the community.
This how-to is the second in a series that explores the use of the Apache HBase REST interface. Part 1 covered HBase REST fundamentals, some Python caveats, and table administration. Part 2 below will show you how to insert multiple rows at once using XML and JSON. The full code samples can be found on GitHub.
Adding Rows With XML
The REST interface would be useless without the ability to add and update row values. It gives us that ability via the POST verb: by POSTing, we can add new rows or update existing rows using the same row key.
First, let’s step through how to do this using the XML and JSON data formats. Let’s start with XML.
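As a quick sketch of what the XML payload looks like: the HBase REST interface expects row keys, column names (`family:qualifier`), and cell values to be base64-encoded inside a `<CellSet>` document. The helper below builds such a document for a batch insert (the function name and the example host/port are assumptions for illustration):

```python
import base64
from xml.etree import ElementTree as ET

def b64(s):
    # The HBase REST interface expects keys, columns, and values
    # base64-encoded inside the XML body.
    return base64.b64encode(s.encode("utf-8")).decode("ascii")

def build_cellset_xml(rows):
    """Build a <CellSet> document for a batch insert.

    rows: {row_key: {column: value}}, where column is
    "family:qualifier", e.g. "cf:a".
    """
    cellset = ET.Element("CellSet")
    for row_key, cells in rows.items():
        row = ET.SubElement(cellset, "Row", key=b64(row_key))
        for column, value in cells.items():
            cell = ET.SubElement(row, "Cell", column=b64(column))
            cell.text = b64(value)
    return ET.tostring(cellset, encoding="unicode")

xml_body = build_cellset_xml({"row1": {"cf:a": "value1"},
                              "row2": {"cf:a": "value2"}})
# POST this body, with Content-Type: text/xml, to a URL like
# http://localhost:8080/<table>/fakerow -- the row segment in the
# URL is a placeholder; the keys inside the body determine which
# rows are actually written.
```

Because the keys live in the body, a single POST can insert or update many rows at once, which is exactly the batch behavior this post walks through.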
It’s time for me to give you a quarterly update (here’s the one for Q1) about where to find tech talks by Cloudera employees in 2013. Committers, contributors, and other engineers will travel to meetups and conferences near and far to do their part in the community to make Apache Hadoop a household word!
(Remember, we’re always ready to assist your meetup by providing speakers, sponsorships, and schwag.)
A couple highlights:
With HBaseCon 2013 (Early Bird registration now open!) preparations in full swing, you may be interested in learning a bit about the personalities behind the Program Committee, who are tasked with formulating a compelling, community-focused agenda.
Recently I had a chance to ask committee members Gary Helmling (Twitter), Lars Hofhansl (Salesforce.com), Jon Hsieh (Cloudera), Doug Meil (Explorys), Andrew Purtell (Intel), Enis Söztutar (Hortonworks), Michael Stack (Cloudera), and Liyin Tang (Facebook) a few questions:
How did you get involved in the HBase community?
The following FAQ is provided by James Taylor of Salesforce, which recently open-sourced its Phoenix client-embedded JDBC driver for low-latency queries over HBase. Thanks, James!
What is this new Phoenix thing I’ve been hearing about?
Phoenix is an open source SQL skin for HBase. You use the standard JDBC APIs instead of the regular HBase client APIs to create tables, insert data, and query your HBase data.
Doesn’t putting an extra layer between my application and HBase just slow things down?
Actually, no. Phoenix achieves performance as good as, or likely better than, what you would get by hand-coding it yourself (not to mention with a heck of a lot less code) by:
Hadoop Summit Europe is coming up in Amsterdam next week, so this is an appropriate time to make you aware of the Cloudera speaker program there (all three talks on Thursday, March 21):