Category Archives: Training

Big Data Architecture Workshop

Categories: Training

Since the birth of big data, Cloudera University has been teaching developers, administrators, analysts, and data scientists how to use big data technologies. We have taught over 50,000 folks the details of using Apache technologies such as HDFS, MapReduce, Hive, Impala, Sqoop, Flume, Kafka, Core Spark, Spark SQL, Spark Streaming, and Spark MLlib.

We’ve taught administrators how to plan, install, monitor, and troubleshoot clusters. We’ve shown analysts the power of SQL over large, diverse data sets.

Read more

Up and running with Apache Spark on Apache Kudu

Categories: CDH, Data Ingestion, Data Science, General, Hadoop, How-to, Impala, Kudu, Spark, Training, Use Case

Following the GA of Apache Kudu in Cloudera CDH 5.10, we take a look at the Apache Spark on Kudu integration, share code snippets, and explain how to get up and running quickly, since Kudu is already a first-class citizen in Spark’s ecosystem.

As the Apache Kudu development team celebrates the initial 1.0 release launched on September 19, and the most recent 1.2.0 version now GA as part of Cloudera’s CDH 5.10 release,

Read more

New Cloudera Search Training: Learn Powerful Techniques for Full-Text Search on an EDH

Categories: Search, Training

Cloudera Search combines the speed of Apache Solr with the scalability of CDH. Our newest training course covers this exciting technology in depth, from indexing to user interfaces, and is ideal for developers, analysts, and engineers who want to learn how to effectively search both structured and unstructured data at scale.

At nearly 10 years old, Apache Hadoop already has an interesting history. Some of you may know that it was inspired by the Google File System and MapReduce papers,

Read more

New Apache Spark Developer Training: Beyond the Basics

Categories: Spark, Training

While the new Spark Developer training from Cloudera University is valuable for developers who are new to Big Data, it’s also a great call for MapReduce veterans.

When I set out to learn Apache Spark (which ships inside Cloudera’s open source platform) about six months ago, I started where many other people do: by following the various online tutorials available from UC Berkeley’s AMPLab, the creators of Spark. I quickly developed an appreciation for the elegant,

Read more

Meet the Data Scientist: Alan Paulsen

Categories: Data Science, Training

Meet Alan Paulsen, among the first to earn the CCP: Data Scientist distinction.

Big Data success requires professionals who can prove their mastery of the tools and techniques of the Apache Hadoop stack. However, experts predict a major shortage of advanced analytics skills over the next few years. At Cloudera, we’re drawing on our industry leadership and early corpus of real-world experience to address the Big Data talent gap with the Cloudera Certified Professional (CCP) program.

Read more