Cloudera Developer Blog · General Posts
Today is a big day: Cloudera is not only urging our customers to “Unaccept the Status Quo” (the continued and accelerating spending on data warehousing, expensive data storage, and associated software licenses), but we also announced that Cloudera Search has entered public beta. Now anyone who knows how to do a Google search can query data stored in Cloudera’s Platform for Big Data.
In this post, however, I’d like to explain the new, simpler product naming/packaging structure that will make adopting and deploying Cloudera more straightforward.
Introducing Cloudera Standard
From now on, in addition to CDH, our 100% open source distribution of Apache Hadoop and related projects that is always available to whoever wants to try it, we will offer customers two options that also include Cloudera Manager, our management automation software:
One of the unexpected pleasures of open source development is the way that technologies adapt and evolve for uses you never originally anticipated.
Seven years ago, Apache Hadoop sprang from a project based on Apache Lucene, aiming to solve a search problem: how to scalably store and index the internet. Today, it’s my pleasure to announce Cloudera Search, which uses Lucene (among other things) to make search solve a Hadoop problem: how to let non-technical users interactively explore and analyze data in Hadoop.
As of today, Cloudera Search is in public beta. (See a demo here; get installation instructions here.) Powered by Apache Solr 4.3, Cloudera Search allows hundreds of users to search petabytes of Hadoop data interactively.
As we march toward HBaseCon 2013 (June 13 in San Francisco), it’s time to bring you a preview of the Internals track (see the Operations track preview here) — the track guaranteed to be of most interest to Apache HBase developers and other people tracking the progress of the code base.
This track, hosted by Salesforce.com’s Lars Hofhansl (also an HBase PMC Member and HBaseCon keynote speaker), focuses on the architecture, features, and development of HBase. You will learn about interesting features, best practices for using them in production/business-critical environments, and how development is done by the community.
I’m pleased to announce that CDH 4.3 is released and available for download. This is the third quarterly update to our GA shipping CDH 4 line and the 17th significant release of our 100% open source Apache Hadoop distribution.
CDH 4.3 is primarily focused on maintenance. There are more than 400 bug fixes included in this release across the components of the CDH stack. This represents a great step forward in quality, security, and performance.
There are also a few new features in this release. One new feature is the ability of HDFS to rebalance within a datanode. This is a great (configurable) way to help prevent drive failure and maintain performance without having to run more disruptive cluster-wide rebalances. Hue has also received a number of new features, including a Pig editor and support for using the HDFS trash bin.
Mark your calendars, all you data cyclists!
I’m visiting Paris, London, and Edinburgh this June. When I travel I like to talk to locals. And, wherever I am, I like to bicycle. So, I thought I might combine these interests and host “data rides” in these three cities.
In each city I’ll name a time and a meeting point, and then ride the local roads for an hour or two with whoever shows up. Afterward, we might need some libations at a local pub. I might even get Cloudera to throw in some schwag.
The schedule/agenda grid for HBaseCon 2013 (rapidly approaching: June 13 in San Francisco) is a thing of beauty.
If you lacked motivation to register up until this point, we think that this session line-up will convince you otherwise. We repeat: whether you’re an HBase committer or just getting started (or at any level in between), HBaseCon is simply an event that you can’t afford to miss – and with an entry fee of just $350, it’s also one you can easily afford.
The post below was originally published at blogs.apache.org/hbase. We re-publish it here for your convenience.
Apache HBase is a distributed big data store modeled after Google’s Bigtable paper. As with all distributed systems, knowing what’s happening at a given time can help spot problems before they arise, debug ongoing issues, evaluate new usage patterns, and provide insight into capacity planning.
Since October 2008, version 0.19.0 (HBASE-625), HBase has been using Apache Hadoop’s metrics system to export metrics to JMX, Ganglia, and other metrics sinks. As the code base grew, more and more metrics were added by different developers. New features got metrics. When users needed more data on issues, they added more metrics. These new metrics were not always consistently named, and some were not well documented.
“Are data warehouses becoming victims of their own success?”, Tony Baer asks in a recent blog post:
Editor’s Note (Dec. 11, 2013): As of Dec. 2013, the Cloudera Development Kit is known as the Kite SDK. Links below are updated accordingly.
At Cloudera, we have the privilege of helping thousands of developers learn Apache Hadoop, as well as build and deploy systems and applications on top of Hadoop. While we (and many of you) believe that the platform is fast becoming a staple system in the data center, we’re also acutely aware of its complexities. In fact, this is the entire motivation behind Cloudera Manager: to make the Hadoop platform easy for operations staff to deploy and manage.
So, we’ve made Hadoop much easier to “consume” for admins and other operators — but what about for developers, whether working for ISVs, SIs, or users? Until now, they’ve largely been on their own.
It’s time for me to give you a quarterly update (here’s the one for Q1) about where to find tech talks by Cloudera employees in 2013. Committers, contributors, and other engineers will travel to meetups and conferences near and far to do their part in the community to make Apache Hadoop a household word!
(Remember, we’re always ready to assist your meetup by providing speakers, sponsorships, and schwag.)
A couple of highlights: