Cloudera Developer Blog · Training Posts
This guest post is provided by Rohit Menon, Product Support and Development Specialist at Subex.
I am a software developer in Denver and have been working with C#, Java, and Ruby on Rails for the past six years. Writing code is a big part of my life, so I constantly keep an eye out for new advances, developments, and opportunities in the field, particularly those that promise to have a significant impact on software engineering and the industries that rely on it.
In my current role working on revenue assurance products in the telecom space for Subex, I have regularly heard from customers that their data is growing at tremendous rates and becoming increasingly difficulty to process, often forcing them to portion out data into small, more manageable subsets. The more I heard about this problem, the more I realized that the current approach is not a solution, but an opportunity, since companies could clearly benefit from more affordable and flexible ways to store data. Better query capability on larger data sets at any given time also seemed key to derive the rich, valuable information that helps drive business. Ultimately, I was hoping to find a platform on which my customers could process all their data whenever they needed to. As I delved into this Big Data problem of managing and analyzing at mega-scale, it did not take long before I discovered Apache Hadoop.
Mission: Hands-On Hadoop
My initial reading about Hadoop on the various blogs and forums had me convinced that it is easily one of the best tools out there for handling and processing large volumes of data. At first, I thought I’d be able to learn Hadoop on my own by reading Hadoop: The Definitive Guide and the Hadoop Tutorial from Yahoo! However, after only a few days of reading, it became clear that I would benefit greatly from direct interaction with Hadoop experts, supervised experimentation, and interaction with practical examples of Hadoop challenges from the field.
In this installment of “Meet the Instructor,” we speak to San Francisco-based Glynn Durham, one of the big brains behind Cloudera’s Introduction to Data Science training and certification.
What is your role at Cloudera?
I am a Senior Instructor with Cloudera University, which means I am a road warrior: I will travel anywhere to teach anything to anyone. I teach all the courses Cloudera offers, including custom private training events that I run at customer sites. Right now, I’m especially enjoying teaching Cloudera’s new course, Introduction to Data Science: Building Recommender Systems. In tandem with the rollout of the course, we’re developing Cloudera Certified Professional: Data Scientist exams, which will include a challenging performance-based lab component in addition to the written test.
Prior to Cloudera, I primarily came from a database background. My first corporate job was at Oracle just before it went public. I spent a year producing Oracle’s first batch of course materials for developers and database administrators and then spent several years teaching all kinds of people all over the world. For some time, I was an Oracle Database Administrator. I eventually moved on to the LAMP code stack, and I later worked for MySQL.
Introduction: Training is Key
Apache Hadoop is extremely important to maximizing the value Syncsort’s technology delivers to our customers. That value promise starts with a solid foundation of knowledge and skills among key technical staff across the company.
We chose Cloudera University’s private training option to ensure Syncsort’s cross-functional team of engineering, support, services, and technical sales professionals had the expertise to optimize our data products for the end-user. Because the members of our team had different levels of prior Hadoop experience, the private class enabled us to freely share information and ask tough questions, resulting in a high level of engagement throughout the course.
Are you new to Apache Hadoop and need to start processing data fast and effectively? Have you been playing with CDH and are ready to move on to development supporting a technical or business use case? Are you prepared to unlock the full potential of all your data by building and deploying powerful Hadoop-based applications?
Cloudera University is the world leader in Apache Hadoop training and certification. Our full suite of live courses and online materials is the best resource to get started with your Hadoop cluster in development or advance it towards production. We offer deep industry insight into the skills and expertise required to establish yourself as a leading Developer or Administrator managing and processing Big Data in this fast-growing field.
But did you know Cloudera training can also help you plan for the advanced stages and progress of your Hadoop cluster? In addition to core training for Developers and Administrators, we also offer the best (and, in some cases, only) opportunity to get up to speed on lifecycle projects within the Hadoop ecosystem in a classroom setting. Cloudera University’s course offerings go beyond the basics to include Training for Apache HBase, Training for Apache Hive and Pig, and Introduction to Data Science: Building Recommender Systems. Depending on your Big Data agenda, Cloudera training can help you increase the accessibility and queryability of your data, push your data performance towards real-time, conduct business-critical analyses using familiar scripting languages, build new applications and customer-facing products, and conduct data experiments to improve your overall productivity and profitability.
For a limited time, Cloudera University is offering a 15% discount when you register for two or more Hadoop training courses to help you build out and realize your Big Data plan. Cover the basics with Developer or Administrator training, move beyond the HDFS and MapReduce core by pairing Developer and HBase training, work towards machine learning with Hive and Pig training and Introduction to Data Science, or customize your own learning path. Just use discount code 15off2 when you register for multiple public training classes from Cloudera University. This offer is only available for new enrollments and is only valid for classes delivered by Cloudera and scheduled to begin before March 1, 2013.
The Hadoop Community is an invariably fascinating world. After all, as Clouderan ATM put it in a past blog post, the user group meetups are adorably called “HUGs.” Just as the Cloudera blog has introduced you to some of the engineers, projects, and applications that serve as the head, heart, and hands of the Hadoop Community, we’re proud to add the circulatory system (to extend the metaphor), made up of Cloudera’s expert trainers and curriculum developers who bring Hadoop to new practitioners around the world every week.
Welcome to the first installment of our “Meet the Instructor” series, in which we briefly introduce you to some of the individuals endeavoring to teach Hadoop far and wide. Today, we speak to Jesse Anderson (@jessetanderson)!
What is your role at Cloudera?
I joined Cloudera about a year ago as a curriculum developer and instructor. I get the best of both worlds in educational services: I create and improve existing curriculum, such as the Cloudera Manager series, and I travel to teach the courses.
Start the year off with bigger questions by taking advantage of Cloudera University’s special offer for aspiring Hadoop administrators. All participants who complete a Cloudera Administrator Training for Apache Hadoop public course by the end of March 2013 will receive a free digital copy of Hadoop Operations by Eric Sammer. If you’ve been asked to maintain large and complex Hadoop clusters, this book is a must. In addition to providing practical guidance from an expert, Hadoop Operations is also a terrific companion reference to the full Cloudera Administrator course.
Cloudera’s three-day course provides administrators a comprehensive understanding of all the steps necessary to operate and manage Hadoop clusters. From installation and configuration through load balancing and tuning your cluster, Cloudera’s administration course has you covered. This course is appropriate for system administrators and others who will be setting up or maintaining a Hadoop cluster. Basic Linux experience is a prerequisite, but prior knowledge of Hadoop is not required.
Upon completion of the course, attendees also receive a voucher for a Cloudera Certified Administrator for Apache Hadoop (CCAH) exam. Certification is a great differentiator; it helps establish individuals as leaders in their field, providing customers with tangible evidence of skills and expertise.
We are very pleased to introduce new, CDH4.1-aligned versions of the Cloudera Certified Developer for Apache Hadoop and Cloudera Certified Administrator for Apache Hadoop exams.
To celebrate, we’re offering a steep 40% discount on the new exams until the end of the year! Just use the promotion code CDH4 when you register to take the CCD-410 or CCA-410 exam through Pearson VUE before Dec. 31, 2012.
Promotion Code: CDH4
Expiration Date: 12/31/2012
Data science has been a ubiquitous topic of conversation in the IT and business worlds across the month of November. In this brief post, I’ll bring you just a small cross-section of the data science meme on the Interwebs in the past 4 weeks:
Last week at Strata + Hadoop World 2012, we announced a new data science training and certification program. I am very excited to have been part of the team that put the program together, and I would like to answer some of the most frequently asked questions about the course and the certification that we will be offering.
Why is Cloudera offering data science training?
The primary bottleneck on the success of Hadoop is the number of people who are capable of using it effectively to solve business problems. Addressing that bottleneck with training has always been a very large part of our mission here at Cloudera, and we are very fortunate to have one of the best training teams anywhere. So far, we have trained over 15,000 Hadoop developers and administrators, and our courses and certification exams are available all over the world.
Right now, one of the biggest barriers to the widespread adoption of Hadoop is the supply of data scientists, the peculiar blend of software engineer and statistician that is capable of turning data into awesome. We’ve started to see data science courses develop at universities like Columbia, The University of Washington, and UC Berkeley (taught by Cloudera co-founder Jeff Hammerbacher). While these courses provide excellent instruction to a new generation of data scientists, the instruction they provide is necessarily limited to the students who are enrolled in those institutions, and the need for data science training is much broader and much more immediate.