Cloudera Developer Blog · Training Posts
Around the globe, more and more companies are turning to Hadoop to tackle data processing problems that don’t lend themselves well to traditional systems. Users in the community consistently ask us to offer training in more places and expand our course offerings, and those who have obtained certification have reported great success connecting with companies investing in Hadoop. All of this keeps us pretty excited about the long-term prospects for Hadoop.
We recently announced our first international developer training sessions in Tokyo (sold out, waitlist available) and Taiwan, and we’re happy to follow up with sessions in the EU. We’ll be visiting London the first week of June, and Berlin the next. If you’ll be in Berlin that week, be sure to check out the Berlin Buzzwords conference – a two-day event focused on Hadoop, Lucene, and NoSQL.
We’ve also put together new offerings for this year’s upcoming Hadoop Summit, and we’ve worked out a special deal with Yahoo! to waive the conference registration fee for anyone who attends a Cloudera training session at the 2010 Hadoop Summit (you’ll get a discount code for training in your conference registration confirmation). In addition to our developer certification course, we’ll offer an extended version of our Systems Administration course, as well as a new, full-day course on HBase. One particularly exciting new offering is our full-day course on Hive, which opens Hadoop up to anyone who knows SQL.
All of these offerings are driven by direct customer feedback about what their organizations need to be even more successful with Hadoop, and we’re excited to help. We look forward to seeing you there.
It’s been over a year now since we started offering Hadoop training in the Bay Area. Since then, we’ve put many of our introductory materials online (for free) and now offer in-person public classes in cities around the US (click here for a full list of sessions). The response has been incredible, but one thing is painfully obvious: we’re not doing enough to meet the needs of the growing worldwide Apache Hadoop community.
To that end, we’ve invested in translating our materials into new languages and in thinking about how to scale our training programs internationally.
As a first step, we’ll offer our three-day developer training session outside the US later this spring. We’ll announce cities and dates in the EU soon, but we’re happy to announce our first two sessions in Asia now:
To say we were surprised by the quality and quantity of submissions we received for Hadoop World: NYC 2009 would be an understatement. We were amazed at how many “normal” companies have come to use Hadoop for everything ranging from business intelligence to protein alignment. It’s truly exciting to see how a system originally designed to process and index the web has evolved to support the data-driven workloads of so many industries.
It’s with great joy that we invite you to come learn about what the following companies have done with Hadoop: About.com, Booz Allen Hamilton, China Mobile, ContextWeb, eBay, Facebook, IBM, Intel, JPMC, Microsoft, The New York Times, NexR, Rackspace, Vertica, Visa, Visible Measures, Yale, and Yahoo!
If you have ever wondered what Hadoop might be able to do for you, this is your chance to learn both from leaders on the web and from those within your own industry.
Update (May 1, 2013): The post below, which is based on an outdated VM, is deprecated. Please see the Cloudera QuickStart VM instead, which runs on VirtualBox, VMware, and KVM.
Cloudera’s Training VM is one of the most popular resources on our website. It was created with VMware Workstation, and plays nicely with the VMware Player for Windows, Linux, and Mac. But VMware isn’t for everyone. Thomas Lockney has managed to get our VM image running on VirtualBox, and has written a step-by-step guide for the community. Thanks Thomas! – Christophe
I was quite pleased when I discovered that Cloudera had created a virtual machine image that could be used while working through their training material. It would make the process simpler, and it looked like a potentially useful environment for general Hadoop experimentation. However, their VM is built for VMware, which I stopped using a while back. As a heavy VirtualBox user, though, I knew it would not be hard to get it running in my preferred desktop virtualization environment.
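For readers who prefer the command line, the key insight is that VirtualBox can attach a VMware .vmdk disk image directly, so "converting" the VM is mostly a matter of registering a new machine around the existing disk. A rough sketch using VBoxManage (the VM name, memory size, OS type, and disk path below are placeholders, and flag details vary by VirtualBox version):

```shell
# Create and register an empty VM (name and OS type are placeholders)
VBoxManage createvm --name "cloudera-training" --ostype Ubuntu --register

# Give the guest some memory
VBoxManage modifyvm "cloudera-training" --memory 1024

# Add an IDE controller and attach the existing VMware disk image to it
VBoxManage storagectl "cloudera-training" --name "IDE" --add ide
VBoxManage storageattach "cloudera-training" --storagectl "IDE" \
  --port 0 --device 0 --type hdd --medium /path/to/cloudera-training.vmdk

# Boot the VM
VBoxManage startvm "cloudera-training"
```

The same steps can be done through the VirtualBox GUI by creating a new machine and choosing "use an existing hard disk" when prompted.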
As Apache Hadoop continues to turn heads at startups and big enterprises alike, Cloudera has received several requests to offer certification in addition to our popular training programs.
Certification is a critical component of any software ecosystem, and especially so for open source projects with quickly expanding user bases. Certification allows developers to ensure their skills are up to date, and allows employers and customers to confidently identify individuals who are up for the challenge of solving problems with Hadoop.
To that end, we are happy to announce Cloudera Certification for Hadoop.
Today I did a web search for “pig training” using my favorite search engine. I was wildly entertained by the results, and have embedded my favorite for your viewing pleasure.
However, when I stopped laughing, I realized that this probably isn’t what most people reading this blog would have hoped to find. To that end, I am happy to announce that Cloudera’s Online Apache Hadoop Training now includes two sessions on Apache Pig.
Update (added 5/15/2013): The information below is dated; see this post for current instructions about configuring Eclipse for Hadoop contributions.
One of the perks of using Java is the availability of functional, cross-platform IDEs. I use vim for my daily editing needs, but when it comes to navigating, debugging, and coding large Java projects, I fire up Eclipse.
Typically, when you’re developing MapReduce applications, you simply point Eclipse at the Apache Hadoop jar file, and you’re good to go. (Cloudera’s Hadoop training VM has a fully-configured example.) However, when you want to dig deeper to explore—and modify—Hadoop’s internals themselves, you’ll want to configure Eclipse to build Hadoop. Because there’s generated code and a complicated build.xml file, this takes some tinkering. Now that I have the full Hadoop Eclipse experience going (it took me a few tries), I’ve prepared a screencast that will help guide you through it, from downloading Eclipse to debugging one of its unit tests. You’ll also want to reference the EclipseEnvironment Hadoop wiki page, which has more details.
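The Eclipse configuration above is for hacking on Hadoop itself; ordinary application development against the Hadoop jar is just Java. As a standalone illustration of the kind of logic a MapReduce job expresses (class and method names are my own, and this uses only JDK collections rather than the Hadoop API, so it runs without any Hadoop dependency), here is the map/shuffle/reduce flow of the canonical word count:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A Hadoop-free sketch of the word-count MapReduce flow.
// In a real job, map() and reduce() would live in Mapper/Reducer
// subclasses and the framework would handle the shuffle between them.
public class WordCountSketch {

    // "Map" phase: emit a (word, 1) pair for every word in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // "Shuffle + reduce" phase: group pairs by word and sum the counts.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : new String[] {"the quick brown fox", "the lazy dog"}) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = reduce(pairs);
        System.out.println(counts.get("the")); // prints 2
    }
}
```

The split into a per-record map step and a per-key reduce step is exactly what lets Hadoop parallelize the same computation across a cluster.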