Cloudera Engineering Blog · Training Posts
To paraphrase Nate Silver: “There is lots of data coming. Who will speak for all this data?”
Nearly every day, I read new articles about how Big Data is “changing everything.” Data scientists are unlocking new approaches that help researchers find the cure for cancer, banks fight fraud, the police fight drug-related crimes, and fantasy sports leaguers fight each other.
In this installment of “Meet the Instructor,” we speak to St. Louis-based Nathan Neff, the Training Lead for Cloudera’s new Data Analyst course.
What is your role at Cloudera?
Cloudera’s new Parcels installation format has been released, and I’m excited to highlight just how useful (and mind-blowingly cool) it is to system administrators and anyone responsible for maintaining a CDH cluster.
In case you haven’t read about or played with Parcels yet, they make components of the distribution significantly easier to manage, install, and upgrade. The new Parcel distribution format works with Cloudera Manager 4.5 and later. When you perform installations and upgrades using Parcels, you get access to new Cloudera Manager features such as:
For years, Cloudera has provided virtual machines that give you a working Apache Hadoop environment out-of-the-box. It’s the quickest way to learn and experiment with Hadoop right from your desktop.
We’re constantly updating and improving the QuickStart VM, and in the latest release there are two of Cloudera’s new products that give you easier and faster access to your data: Cloudera Search and Cloudera Impala. We’ve also added corresponding applications to Hue – an open source web-based interface for Hadoop, and the easiest way to interact with your data.
Data analysts and business intelligence specialists have been at the heart of new trends driving business growth over the past decade, including log file and social media analytics. However, Big Data heretofore has been beyond the reach of analysts because traditional tools like relational databases don’t scale, and scalable systems like Apache Hadoop have historically required Java expertise.
Today Cloudera announced a new Cloudera Academic Partnership program, in which participating universities worldwide get access to curriculum, training, certification, and software.
As noted in the press release, the global demand for people with Apache Hadoop and data science skills is dwarfing all supply. We consider it an important mission to help accredited universities meet that demand, by equipping them with the content and training they need to educate students in the Hadoop arts.
A World-Class EDW Requires a World-Class Hadoop Team
Persado is the global leader in persuasion marketing technology, a new category in digital marketing. Our revolutionary technology maps the genome of marketing language and generates the messages that work best for any customer and any product at any time. To assure the highest quality experience for both our clients and end-users, our engineering team collaborates with Ph.D. statisticians and data analysts to develop new ways to segment audiences, discover content, and deliver the most relevant and effective marketing messages in real time.
Data scientists drive data as a platform to answer previously unimaginable questions. These multi-talented data professionals are in demand like never before because they identify or create some of the most exciting and potentially profitable business opportunities across industries. However, a scarcity of existing external talent will require companies of all sizes to find, develop, and train their people with backgrounds in software engineering, statistics, or traditional business intelligence as the next generation of data scientists.
Join us for the premiere of Training a New Generation of Data Scientists on Tuesday, March 26, at 2pm ET/11am PT. In this video, Cloudera’s Senior Director of Data Science, Josh Wills, will discuss what data scientists do, how they think about problems, the relationship between data science and Hadoop, and how Cloudera training can help you join this increasingly important profession. Following the video, Josh will answer your questions about data science, Hadoop, and Cloudera’s Introduction to Data Science: Building Recommender Systems course.
This guest post is provided by Rohit Menon, Product Support and Development Specialist at Subex.
I am a software developer in Denver and have been working with C#, Java, and Ruby on Rails for the past six years. Writing code is a big part of my life, so I constantly keep an eye out for new advances, developments, and opportunities in the field, particularly those that promise to have a significant impact on software engineering and the industries that rely on it.
In my current role working on revenue assurance products in the telecom space for Subex, I have regularly heard from customers that their data is growing at tremendous rates and becoming increasingly difficult to process, often forcing them to portion out data into smaller, more manageable subsets. The more I heard about this problem, the more I realized that the current approach is not a solution, but an opportunity, since companies could clearly benefit from more affordable and flexible ways to store data. Better query capability on larger data sets at any given time also seemed key to deriving the rich, valuable information that helps drive business. Ultimately, I was hoping to find a platform on which my customers could process all their data whenever they needed to. As I delved into this Big Data problem of managing and analyzing at mega-scale, it did not take long before I discovered Apache Hadoop.
Mission: Hands-On Hadoop
In this installment of “Meet the Instructor,” we speak to San Francisco-based Glynn Durham, one of the big brains behind Cloudera’s Introduction to Data Science training and certification.
What is your role at Cloudera?
I am a Senior Instructor with Cloudera University, which means I am a road warrior: I will travel anywhere to teach anything to anyone. I teach all the courses Cloudera offers, including custom private training events that I run at customer sites. Right now, I’m especially enjoying teaching Cloudera’s new course, Introduction to Data Science: Building Recommender Systems. In tandem with the rollout of the course, we’re developing Cloudera Certified Professional: Data Scientist exams, which will include a challenging performance-based lab component in addition to the written test.