Cloudera Engineering Blog · General Posts
We announced a leadership change at Cloudera today. Tom Reilly, formerly CEO at ArcSight, is joining us in my old role – CEO – and I am assuming two new posts: Chief Strategy Officer and Chairman of the Board of Directors.
When we started the company five years ago, almost no one had heard of Apache Hadoop. Big Data, to the extent the term was used at all, was strictly a consumer internet phenomenon. No other enterprise vendor believed the platform mattered.
For those of you who missed the show, session video and presentation slides (as well as photos) will be available via hbasecon.com in a few weeks. (To be notified, follow @cloudera or @ClouderaEng.) Although it’s not quite as good as being there with the rest of the community, you’ll still be able to learn from the real-world experiences of Apache HBase users like Facebook, Box, Yahoo!, Salesforce.com, Pinterest, Twitter, Groupon, and more.
Today is a big day: Cloudera is not only urging our customers to “Unaccept the Status Quo” (the continued and accelerating spending on data warehousing, expensive data storage, and associated software licenses), but we also announced that Cloudera Search has entered public beta. Now anyone who knows how to do a Google search can query data stored in Cloudera’s Platform for Big Data.
In this post, however, I’d like to explain the new, simpler product naming/packaging structure that will make adopting and deploying Cloudera more straightforward.
Introducing Cloudera Standard
One of the unexpected pleasures of open source development is the way that technologies adapt and evolve for uses you never originally anticipated.
Seven years ago, Apache Hadoop sprang from a project based on Apache Lucene, aiming to solve a search problem: how to scalably store and index the internet. Today, it’s my pleasure to announce Cloudera Search, which uses Lucene (among other things) to make search solve a Hadoop problem: how to let non-technical users interactively explore and analyze data in Hadoop.
As we march toward HBaseCon 2013 (June 13 in San Francisco), it’s time to bring you a preview of the Internals track (see the Operations track preview here) — the track guaranteed to be of most interest to Apache HBase developers and other people tracking the progress of the code base.
I’m pleased to announce that CDH 4.3 is released and available for download. This is the third quarterly update to our GA shipping CDH 4 line and the 17th significant release of our 100% open source Apache Hadoop distribution.
CDH 4.3 is primarily focused on maintenance. There are more than 400 bug fixes included in this release across the components of the CDH stack. This represents a great step forward in quality, security, and performance.
Mark your calendars, all you data cyclists!
I’m visiting Paris, London, and Edinburgh this June. When I travel I like to talk to locals. And, wherever I am, I like to bicycle. So, I thought I might combine these interests and host “data rides” in these three cities.
The schedule/agenda grid for HBaseCon 2013 (rapidly approaching: June 13 in San Francisco) is a thing of beauty.
The post below was originally published at blogs.apache.org/hbase. We re-publish it here for your convenience.
Apache HBase is a distributed big data store modeled after Google’s Bigtable paper. As with all distributed systems, knowing what’s happening at a given time can help spot problems before they arise, debug ongoing issues, evaluate new usage patterns, and provide insight into capacity planning.
“Are data warehouses becoming victims of their own success?” Tony Baer asks in a recent blog post: