Category Archives: QuickStart VM

Docker is the New QuickStart Option for Apache Hadoop and Cloudera

Categories: CDH, Ops and DevOps, QuickStart VM, Testing

Now there’s an even quicker “QuickStart” option for getting hands-on with the Apache Hadoop ecosystem and Cloudera’s platform: a new Docker image.

You might already be familiar with Cloudera’s popular QuickStart VM, a virtual image containing our distributed data processing platform. Originally intended as a demo environment, the QuickStart VM evolved into a useful general-purpose environment for developers, customers,

How-to: Quickly Configure Kerberos for Your Apache Hadoop Cluster

Categories: How-to, QuickStart VM, Security

Use the scripts and screenshots below to configure a Kerberized cluster in minutes.

Kerberos is the foundation of securing your Apache Hadoop cluster. With Kerberos enabled, user authentication is required. Once users are authenticated, you can use projects like Apache Sentry (incubating) for role-based access control via GRANT/REVOKE statements.

Taming the three-headed dog that guards the gates of Hades is challenging, so Cloudera has put significant effort into making this process easier in Hadoop-based enterprise data hubs. 

How-to: Create a Simple Hadoop Cluster with VirtualBox

Categories: CDH, Guest, Hadoop, QuickStart VM

Set up a CDH-based Hadoop cluster in less than an hour using VirtualBox and Cloudera Manager.

Thanks to Christian Javet for his permission to republish his blog post below!

I wanted to get familiar with the big data world, and decided to test Hadoop. Initially, I used Cloudera’s pre-built virtual machine with its full Apache Hadoop suite pre-configured (called Cloudera QuickStart VM),

NYU, Analytics, and Cloudera’s QuickStart VM

Categories: Hadoop, QuickStart VM, Training

The Cloudera QuickStart VM is an important platform for teaching any Hadoop-related curriculum.

In the Fall 2013 semester, more than 30 NYU graduate students completed the Real-time and Big Data Analytics course at the NYU Courant Institute of Mathematical Sciences, for which I served as instructor.

In this introductory analytics course, students learn the architectures of the Apache Hadoop storage and compute systems (HDFS and MapReduce, respectively).

How-to: Use Eclipse with MapReduce in Cloudera’s QuickStart VM

Categories: How-to, MapReduce, QuickStart VM

One of the common questions I get from students and developers in my classes relates to IDEs and MapReduce: How do you create a MapReduce project in Eclipse and then debug it?

To answer that question, I have created a screencast showing you how, using Cloudera’s QuickStart VM. The QuickStart VM helps developers get started writing MapReduce code without having to worry about software installs and configuration. Everything is installed and ready to go. 
