When we announced Cloudera’s Distribution for Apache Hadoop last month, we asked the community to give us feedback on what features they liked best and what new development was most important to them. Almost immediately, Debian and Ubuntu packages for Hadoop emerged as the most popular request. A lot of customers prefer Debian derivatives over Red Hat, and installing RPMs on top of Debian, while possible with tools like alien,
Today I did a web search for “pig training” using my favorite search engine. I was wildly entertained by the results, and have embedded my favorite for your viewing pleasure.
However, when I stopped laughing, I realized that this probably isn’t what most people reading this blog would have hoped to find. To that end, I am happy to announce that Cloudera’s Online Apache Hadoop Training now includes two sessions on Apache Pig.
One of the recurring themes we have heard while working with our customers and the community is that Apache Hadoop configuration and deployment are a pain. Oftentimes, Hadoop is the first truly distributed system that administrators encounter, and the problem is made worse by the lack of standardized packages and deployment tools. And some releases are buggy. And upgrades are hard. And the list goes on.
In order for Hadoop to truly disrupt the enterprise,
It’s a new year, the time when we take a moment to look back at the previous one and forward to what might be coming next. In the world of Hadoop, a lot happened in 2008.
At the beginning of the year, Hadoop was a sub-project of Lucene. In January, Hadoop became a Top Level Project at Apache, in recognition of its success and the diversity of its community. This allowed sub-projects to be added,
We’re happy to announce a new tool we have been developing here at Cloudera: Hadoop Development Status. Hadoop Development Status aims to help the Hadoop community understand its direction, health, and participants. The project currently monitors the most active contributors according to mailing list traffic, the most watched JIRA tickets, and aggregate traffic volumes on the Hadoop mailing lists.
The graph of messages per month on the Hadoop Core lists shows sustained growth in traffic.