Category Archives: QuickStart VM

Multi-node Clusters with Cloudera QuickStart for Docker

Categories: CDH, QuickStart VM

Getting hands-on with a multi-node cluster for self-learning or testing is now even easier.

Last December, we introduced the Cloudera QuickStart Docker image to make it easier than ever before to explore Cloudera’s distributed data processing platform, including tools such as Apache Impala (incubating), Apache Spark, and Apache Solr. While the single-node getting-started image was well received, we noted a large number of requests from the community for a multi-node CDH deployment via Docker.

Docker is the New QuickStart Option for Apache Hadoop and Cloudera

Categories: CDH, Ops and DevOps, QuickStart VM, Testing

Now there’s an even quicker “QuickStart” option for getting hands-on with the Apache Hadoop ecosystem and Cloudera’s platform: a new Docker image.

You might already be familiar with Cloudera’s popular QuickStart VM, a virtual image containing our distributed data processing platform. Originally intended as a demo environment, the QuickStart VM quickly evolved into quite a useful general-purpose environment for developers, customers,

How-to: Quickly Configure Kerberos for Your Apache Hadoop Cluster

Categories: How-to, QuickStart VM, Security

Use the scripts and screenshots below to configure a Kerberized cluster in minutes.

Kerberos is the foundation of securing your Apache Hadoop cluster. With Kerberos enabled, user authentication is required. Once users are authenticated, you can use projects like Apache Sentry (incubating) for role-based access control via GRANT/REVOKE statements.

Taming the three-headed dog that guards the gates of Hades is challenging, so Cloudera has put significant effort into making this process easier in Hadoop-based enterprise data hubs. 

How-to: Create a Simple Hadoop Cluster with VirtualBox

Categories: CDH, Guest, Hadoop, QuickStart VM

(Editor’s note [Aug. 2, 2016]: A multi-node cluster option for Docker-based deployment is now available for CDH 5.8 and later.)

Thanks to Christian Javet for his permission to republish his blog post below!

I wanted to get familiar with the big data world, and decided to test Hadoop. Initially, I used Cloudera’s pre-built virtual machine with its full Apache Hadoop suite pre-configured (called Cloudera QuickStart VM),

NYU, Analytics, and Cloudera’s QuickStart VM

Categories: Hadoop, QuickStart VM, Training

The Cloudera QuickStart VM is an important platform for learning any Hadoop-related curriculum.

In the Fall 2013 semester, more than 30 NYU graduate students completed the Real-time and Big Data Analytics course at the NYU Courant Institute of Mathematical Sciences, for which I served as instructor.

In this introductory analytics course, students learn the architectures of the Apache Hadoop storage and compute systems (HDFS and MapReduce, respectively).
