Blog Posts
FAQ: Understanding the Parcel Binary Distribution Format
Philip Langdale
May 22, 2013
Excerpt: Have you ever wished you could upgrade to the latest CDH minor release with just a few mouse clic... more
If It’s Tuesday, There Must Be a "Data Ride"
Doug Cutting (@cutting)
May 21, 2013
Excerpt: Mark your calendars, all you data cyclists! I’m visiting Paris, London, and Edinburgh t... more
Customer Spotlight: Gravity Creates Personalized Web Experience, 300-400% Higher Click-through
Karina Babcock (@karinababcock)
May 20, 2013
Excerpt: According to Jim Benedetto,... more
How-to: Configure Eclipse for Hadoop Contributions
Justin Kestelyn (@kestelyn)
May 15, 2013
Excerpt: Contributing to Apache Hadoop or writing custom pluggable modules requires modifying Hadoop’s s... more
May 13, 2013
Excerpt: One of the complexities of Apache Hadoop is the need to deploy clusters of servers, potentially o... more
Tracking Hadoop Jobs from Your Mac: There’s an App for That
Justin Kestelyn (@kestelyn)
May 10, 2013
Excerpt: Our thanks to Etsy developer Brad Greenlee (@bgreenlee) for the post below. We think his Mac... more
Metrics2: The New Hotness for Apache HBase Metrics
Justin Kestelyn (@kestelyn)
May 8, 2013
Excerpt: The post below was originally published at... more
Cloudera Partners and Impala: Alteryx
Justin Kestelyn (@kestelyn)
May 8, 2013
Excerpt: Our thanks to Brian Dirking, Director of Product Marketing for... more
Extending the Data Warehouse with Hadoop
Justin Kestelyn (@kestelyn)
May 7, 2013
Excerpt: "Are data warehouses becoming victims of their own success?", Tony Baer asks in a ... more
Cloudera Development Kit (CDK): Hadoop Application Development Made Easier
Justin Kestelyn (@kestelyn)
May 7, 2013
Excerpt: At Cloudera, we have the privilege of helping thousands of developers learn Apache Hadoop, as wel... more
Cloudera Impala and Partners: Tableau
Justin Kestelyn (@kestelyn)
May 7, 2013
Excerpt: Our thanks to Ted Wasserman, product manager for Ta... more
Customer Spotlight: Sneak Peek into Skybox Imaging’s Cloudera-powered Satellite System
Justin Kestelyn (@kestelyn)
May 6, 2013
Excerpt: This week, the Cloudera Sessions... more
Cloudera Partners and Impala: Talend
Justin Kestelyn (@kestelyn)
May 6, 2013
Excerpt: Our thanks to Yves de Montcheuil, Vice President of Marketing for... more
Cloudera Partners and Impala: MicroStrategy
Justin Kestelyn (@kestelyn)
May 3, 2013
Excerpt: Our thanks to Kevin Spurway, Senior Vice President of Marketing for... more
Customer Spotlight: Six3 Systems’ Wayne Wheeles Drives Cyber Security Innovation using Impala
Justin Kestelyn (@kestelyn)
May 2, 2013
Excerpt: This week represents quite a milestone for Cloudera and, at least we’d like to believe, the Had... more
How the SAS and Cloudera Platforms Work Together
Justin Kestelyn (@kestelyn)
May 2, 2013
Excerpt: On Monday April 29, Cloudera... more
Cloudera Impala 1.0: It’s Here, It’s Real, It’s Already the Standard for SQL on Hadoop
Justin Kestelyn (@kestelyn)
May 1, 2013
Excerpt: In October 2012,... more
The Platform for Big Data is Here
Charles Zedlewski
April 30, 2013
Excerpt: It has been an exciting couple of days for new product announcements at Cloudera -- exciting espe... more
What’s New in Hue 2.3
Romain Rigaux
April 26, 2013
Excerpt: We're very happy to announce the 2.3 release of Hue, the open source... more
How Scaling Really Works in Apache HBase
Matteo Bertozzi
April 26, 2013
Excerpt: This post was originally published via blogs.apache.... more
Meet the Project Founder: Doug Cutting (First in a Series)
Justin Kestelyn (@kestelyn)
April 25, 2013
Algorithms Every Data Scientist Should Know: Reservoir Sampling
Josh Wills (@josh_wills)
April 23, 2013
Excerpt: Data scientists, that peculiar... more
Customer Spotlight: Nokia’s Big Data Ecosystem Connects Cloudera, Teradata, Oracle, and Others
Justin Kestelyn (@kestelyn)
April 22, 2013
Excerpt: As Cloudera’s keeper of customer stories, it’s dawned on me that others might benefit from th... more
HBaseCon 2013 Speakers, Tracks, and Sessions Announced
Justin Kestelyn (@kestelyn)
April 22, 2013
Excerpt: Thanks to a dazzling array of excellent proposals from across the Apache HBase community, the... more
Cloudera Academic Partnership Program: Creating Hadoop Lovers in Universities Worldwide
Justin Kestelyn (@kestelyn)
April 18, 2013
Excerpt: Today Cloudera announced a new... more
Learn How To Hadoop from Tom White in Dr. Dobb’s
Justin Kestelyn (@kestelyn)
April 18, 2013
How Persado Supports Persuasion Marketing Technology with Hive and Pig Training
Ryan Goldman (@ClouderaU)
April 17, 2013
Excerpt: This guest post comes from Alex Giamas, Senio... more
It’s Only Rock and Roll
Doug Cutting
April 15, 2013
Excerpt: It’s only Rock and Roll, but I like it! - Mick Jagger... more
How-to: Use the Apache HBase REST Interface, Part 2
Jesse Anderson (@jessetanderson)
April 12, 2013
Excerpt: This how-to is the second in a series that explores the use of the Apache HBase REST interface. ... more
Where to Find Cloudera Tech Talks Through June 2013
Justin Kestelyn (@kestelyn)
April 11, 2013
Excerpt: It's time for me to give you a quarterly update (... more
How-to: Use Vagrant to Set Up a Virtual Hadoop Cluster
Justin Kestelyn (@kestelyn)
April 9, 2013
Excerpt: This guest post comes to us from David Greco, CTO of Elig... more
Demo: HDFS File Operations Made Easy with Hue
Romain Rigaux
April 8, 2013
Excerpt: Managing and viewing data in... more
Congrats to OSCON 2013 Speakers!
Justin Kestelyn (@kestelyn)
April 5, 2013
Excerpt: Cloudera will be a proud exhibitor at O'... more
For a Limited Time: Live Impala Demo on EC2
Justin Kestelyn (@kestelyn)
April 4, 2013
Excerpt: As a follow-up to a previous post about the Impala demo he built during Data Hacking Day, Ala... more
Cloudera is the Top Big Data Influencer in Social Media
Justin Kestelyn (@kestelyn)
March 29, 2013
Excerpt: Thanks to our friends at KDNuggets for pointing out that Cloudera is the... more
Meet the HBaseCon 2013 Program Committee
Justin Kestelyn (@kestelyn)
March 29, 2013
Excerpt: With HBaseCon 2013 (Early Bird registration now open!) pre... more
Phoenix in 15 Minutes or Less
Justin Kestelyn (@kestelyn)
March 28, 2013
Excerpt: The following FAQ is provided by James Taylor of Salesforce, which recently open-sourced its... more
How-to: Create a CDH Cluster on Amazon EC2 via Cloudera Manager
Justin Kestelyn (@kestelyn)
March 26, 2013
Excerpt: Cloudera Man... more
How-to: Analyze Twitter Data with Hue
Romain Rigaux
March 25, 2013
Excerpt: Hue 2.2, the open sour... more
One User’s Impala Experience at Data Hacking Day
Justin Kestelyn (@kestelyn)
March 25, 2013
Excerpt: The following guest post comes to you from Alan Gardner of remote database services and consu... more
Cloudera’s Jeff Hammerbacher on Charlie Rose
Justin Kestelyn (@kestelyn)
March 22, 2013
Excerpt: In this... more
Cloudera ML: New Open Source Libraries and Tools for Data Scientists
Josh Wills (@josh_wills)
March 22, 2013
Excerpt: Last month, Apache Crunch became the fifth project (along... more
How-to: Import a Pre-existing Oozie Workflow into Hue
Abraham Elmahrek
March 20, 2013
Excerpt: Hue is an open-source web interface for Apache Hado... more
How-to: Use Oozie Shell and Java Actions
Justin Kestelyn (@kestelyn)
March 18, 2013
Excerpt: Apache Oozie, the workflow coordinator for Apache Hadoop, h... more
Cloudera Speakers at Hadoop Summit Europe
Justin Kestelyn (@kestelyn)
March 15, 2013
Excerpt: Hadoop Summit Europe is coming up in Amsterdam n... more
Introducing Parquet: Efficient Columnar Storage for Apache Hadoop
Justin Kestelyn (@kestelyn)
March 13, 2013
Excerpt: Below you'll find the official announcement from Cloudera and Twitter about Parquet, an effic... more
How-to: Use the Apache HBase REST Interface, Part 1
Jesse Anderson (@jessetanderson)
March 12, 2013
Excerpt: There are various ways to access and interact with Apache HBase. The... more
What the Hack! The Story of the Cloudera Hackathon
Justin Kestelyn (@kestelyn)
March 8, 2013
Excerpt: Every growing, dynamic engineering culture needs a hackathon every once in a while. Ear... more
Introduction to Apache HBase Snapshots
Matteo Bertozzi
March 7, 2013
Excerpt: The current (4.2) release of CDH -- Cloudera's 100% open-source distribution of Apache Hadoop and... more
How-to: Set Up Cloudera Manager 4.5 for Apache Hive
Justin Kestelyn (@kestelyn)
March 6, 2013
Excerpt: Last week Cloudera released the 4.5 release of... more
How-to: Set Up a Hadoop Cluster with Network Encryption
Alejandro Abdelnur
March 5, 2013
Excerpt: Hadoop network encryption is a feature introduced in Apache Hadoop 2.0.2-alpha and in CDH4.1.... more
What’s New in Hue 2.2?
Romain Rigaux
March 1, 2013
Excerpt: This post is about the new release of Hue, an open... more
What’s New in Cloudera Manager 4.5?
Bala Venkatrao
February 27, 2013
Excerpt: It has been a while since I have blogged, primarily because we have been heads-down working towar... more
Open Source, Flattery, and The Platform for Big Data
Charles Zedlewski
February 26, 2013
Excerpt: It has been a busy time for announcements coinciding with this week’s Strata conference. There... more
February 25, 2013
Excerpt: UPDATED 20130424: The new RHadoop treats output to Streaming a bit differently,... more
Call for Speakers and Early Bird Registration: HBaseCon 2013
Justin Kestelyn (@kestelyn)
February 21, 2013
Excerpt: (Added Feb. 25 2013: Early Bird registration is now open - closes April 23, 2013!)... more
February 21, 2013
Excerpt: Now that Apache Hadoop is seven years old, use-case patterns for Big Data have emerged. In this p... more
Apache Hadoop 2.0.3-alpha Released
Tom White
February 20, 2013
Excerpt: Last week the Apache Hadoop PMC voted to release... more
Cloudera Speakers at Big Data TechCon (+ $200 Off Registration)
Justin Kestelyn (@kestelyn)
February 15, 2013
Excerpt: Cloudera is proud to be a sponsor of Big Data... more
The Cloudera Sessions Across the U.S.: Where Are You on the Road to Big Data?
Justin Kestelyn (@kestelyn)
February 14, 2013
Excerpt: Organizations of all types and sizes are waking up to the idea that integrating the Apache Hadoop... more
From Zero to Impala in Minutes
Justin Kestelyn (@kestelyn)
February 7, 2013
Excerpt: This post was originally published by U.C. Berk... more
February 6, 2013
How Syncsort Leverages Training to Optimize Hadoop Scalability
Justin Kestelyn (@kestelyn)
February 6, 2013
Excerpt: This guest post is provided by Dave Nahmias, Pre-Sales and Partner Solutions Engineer at... more
A Ruby Client for Impala
Justin Kestelyn (@kestelyn)
February 4, 2013
Excerpt: Thanks to Stripe's Colin Marc (@colinmarc) for the guest post below, and for his work on the... more
Understanding MapReduce via Boggle, Part 2: Performance Optimization
Jesse Anderson (@jessetanderson)
January 30, 2013
Excerpt: In Part 1... more
Webinar: Introduction to Hadoop Developer Training (Jan. 31)
Ryan Goldman (@ClouderaU)
January 28, 2013
Excerpt: Are you new to Apache Hadoop and need to start processing data fast and effectively? Have you bee... more
Where to Find Cloudera Tech Talks in Early 2013
Justin Kestelyn (@kestelyn)
January 22, 2013
Excerpt: Clouderans are traveling the United States (and beyond) in droves during the first quarter of 201... more
Cloudera Impala Beta (version 0.4) and Cloudera Manager 4.1.3 Now Available
Justin Kestelyn (@kestelyn)
January 18, 2013
Excerpt: I am pleased to announce the release of Cloudera Impala Beta (version 0.4) and Cloudera Manager 4... more
How-To: Schedule Recurring Hadoop Jobs with Apache Oozie
Justin Kestelyn (@kestelyn)
January 18, 2013
Excerpt: Our thanks to guest author Jon Natkins (@nattyice) of WibiData for the following post!... more
Understanding MapReduce via Boggle
Jesse Anderson (@jessetanderson)
January 14, 2013
Excerpt: Graph theory is a growing part of Big Dat... more
How-to: Do Apache Flume Performance Tuning (Part 1)
Mike Percy (@mike_percy)
January 11, 2013
Excerpt: The post below was originally published via ... more
Apache Hadoop in 2013: The State of the Platform
Rob Weltman
January 10, 2013
Excerpt: For several good reasons, 2013 is a Happy New Year for Apache Hadoop enthusiasts. In 2012... more
Data Hacking Day with Cloudera (Feb. 25, Palo Alto)
Justin Kestelyn (@kestelyn)
January 9, 2013
Excerpt: (Update 2/6/2013 - Sorry, this event is sold out!) With... more
Get a Free Hadoop Operations Ebook with Administrator Training
Ryan Goldman (@ClouderaU)
January 8, 2013
A Guide to Python Frameworks for Hadoop
Uri Laserson
January 7, 2013
Excerpt: I recently joined Cloudera after working in... more
Apache Bigtop 0.5.0 Has Been Released
Justin Kestelyn (@kestelyn)
January 3, 2013
Excerpt: The following post was originally published via... more
The Dynamic Workflow Builder in Hue
Abraham Elmahrek
January 3, 2013
Excerpt: Hue is a web interface for... more
How-to: Use the ShareLib in Apache Oozie
Justin Kestelyn (@kestelyn)
December 18, 2012
Excerpt: As Apache Oozie, the workflow engine for Apache Hadoop, con... more
What’s Next for Cloudera Impala?
Justin Erickson
December 14, 2012
Excerpt: It’s been an exciting month and a half since the launch of Cloudera Impala (the new open so... more
How-To: Run a MapReduce Job in CDH4
Sandy Ryza
December 14, 2012
Excerpt: This is the first post in a series that will get you going on how to write, compile, and run a simp... more
Cloudera Speakers at ApacheCon NA 2013
Justin Kestelyn (@kestelyn)
December 13, 2012
Excerpt: Our hearty congratulations to the Cloudera engineers who have been accepted as... more
Secrets of Cloudera Support: The Champagne Strategy
Justin Kestelyn (@kestelyn)
December 11, 2012
Excerpt: At Cloudera, we put great pride into drinking our own champagne. That pride extends to our suppor... more
How-to: Manage Permissions in Hue
Abraham Elmahrek
December 7, 2012
Excerpt: Hue is a web interface for... more
Introducing Cloudera CDH4 Certification
Ryan Goldman (@ClouderaU)
December 6, 2012
Excerpt: We are very pleased to introduce new, CDH4.1-aligned versions of the... more
New: Cloudera Manager Free Edition Demo VM
Justin Kestelyn (@kestelyn)
December 5, 2012
Excerpt: With the... more
Cloudera Impala Beta (version 0.3) and Cloudera Manager 4.1.2 Now Available
Justin Kestelyn (@kestelyn)
December 4, 2012
Excerpt: I am pleased to announce the release of Cloudera Impala Beta (version 0.3) and Cloudera Manager 4... more
Meet the Engineer-Turned-Product Manager: Eva Andreasson
Justin Kestelyn (@kestelyn)
November 30, 2012
Apache HBase AssignmentManager Improvements
Jimmy Xiang
November 28, 2012
Excerpt: AssignmentManager is a module in the Apache HBase... more
External Hands-on Experiences with Cloudera Impala
Justin Kestelyn (@kestelyn)
November 28, 2012
Streaming Data into Apache HBase using Apache Flume
Hari Shreedharan
November 27, 2012
Excerpt: The following post was... more
This Month in Data Science
Justin Kestelyn (@kestelyn)
November 27, 2012
Excerpt: Data science has been a ubiquitous topic of conversation in the IT and business worlds across the... more
Introducing Hannibal: A Tool for Apache HBase Region Monitoring
Justin Kestelyn (@kestelyn)
November 26, 2012
Excerpt: The following is a guest post from Nils Kübler, the creator of the Hannibal project. He is s... more
The Winner of the 2012 Government Big Data Solutions Award is the National Cancer Institute
Justin Kestelyn (@kestelyn)
November 20, 2012
Excerpt: The following is a re-post from... more
The "Ask Bigger Questions" Contest!
Justin Kestelyn (@kestelyn)
November 19, 2012
Excerpt: Have you helped your company ask bigger questions? Our mission at Cloudera University is to equip... more
Apache ZooKeeper 3.4.5 Has Been Released
Skye Wanderman-Milne
November 19, 2012
Excerpt: Apache ZooKeeper release 3.4.5 is now available. This... more
Dive Into Cloudera Impala at a Meetup Near You
Justin Kestelyn (@kestelyn)
November 14, 2012
Excerpt: Since the... more
Cloudera Impala Beta (version 0.2) and Cloudera Manager 4.1.1 Now Available
Justin Kestelyn (@kestelyn)
November 13, 2012
Excerpt: I am pleased to announce the release of Cloudera Impala Beta (version 0.2) and Cloudera Manager 4... more
Analyzing Twitter Data with Apache Hadoop, Part 3: Querying Semi-structured Data with Apache Hive
Jonathan Natkins (@nattybnatkins)
November 13, 2012
Excerpt: This is the third article in a series about analyzing Twitter data using some of the components o... more
What’s New in Apache Sqoop 1.4.2
Jarek Jarcec Cecho
November 7, 2012
Excerpt: (The following is a... more
See You at Data Science Day (Nov. 29, New York)!
Justin Kestelyn (@kestelyn)
November 6, 2012
Excerpt: [Updated Nov. 26, 2012: Sorry, this event has reached capacity and is now closed.]... more
How to Get Rich on Big Data
Mike Olson
November 5, 2012
Excerpt: The 2012 Strata + Hadoop World conference was w... more
The New "Hadoop in Practice" Book: A Chat with The Author
Justin Kestelyn (@kestelyn)
November 5, 2012
Mike Olson at FutureBI Meetup (Berkeley, Nov. 6)
Justin Kestelyn (@kestelyn)
November 1, 2012
Training a New Generation of Data Scientists
Josh Wills (@josh_wills)
October 31, 2012
Excerpt: Last week at Strata + Hadoop World 2... more
Quorum-based Journaling in CDH4.1
Todd Lipcon (@tlipcon)
October 31, 2012
Excerpt: A few weeks back, Cloudera announced CDH 4.1, the latest update release to Cloudera's Distributio... more
Cloudera Impala: Real-Time Queries in Apache Hadoop, For Real
Justin Kestelyn (@kestelyn)
October 24, 2012
Excerpt: After a long period of intense engineering effort and user feedback, we are very pleased, and pro... more
Cloudera, The Platform for Big Data
Charles Zedlewski
October 24, 2012
Excerpt: Today we’re proud to announce a new addition to the Apache Hadoop ecosystem:... more
MR2 and YARN Briefly Explained
Justin Kestelyn (@kestelyn)
October 24, 2012
Excerpt: With CDH4 onward, the Apache Hadoop component introduced two new terms for Hadoop users to wonder... more
Your Guide to Cloudera @ Strata + Hadoop World This Week
Justin Kestelyn (@kestelyn)
October 21, 2012
Excerpt: Cloudera is co-presenting the sold-out... more
Sneak Peek into Skybox Imaging’s Cloudera-powered Satellite System
Justin Kestelyn (@kestelyn)
October 21, 2012
Excerpt: This is a guest post by Oliver Guinan, VP Ground Software, at Skybox Imaging. Oliver is a 15-... more
Apache Hadoop 2.0.2-alpha Released
Tom White
October 21, 2012
Excerpt: Earlier this month the Apache Hadoop PMC released... more
What’s New in CDH4.1 Hue
Justin Kestelyn (@kestelyn)
October 21, 2012
Excerpt: Hue is a Web-based interface that makes it easier t... more
What’s New in CDH4.1 Pig
Justin Kestelyn (@kestelyn)
October 21, 2012
Excerpt: Apache Pig is a platform for analyzing large data sets that... more
Axemblr’s Java Client for the Cloudera Manager API
Justin Kestelyn (@kestelyn)
October 21, 2012
Excerpt: Axemblr, purveyors of a cloud-agnostic MapReduce Web Service, h... more
Analyzing Twitter Data with Apache Hadoop, Part 2: Gathering Data with Flume
Jonathan Natkins (@nattybnatkins)
October 21, 2012
Excerpt: This is the second article in a series about analyzing Twitter data using some of the components... more
HBase at ApacheCon Europe 2012
Justin Kestelyn (@kestelyn)
October 21, 2012
Excerpt: Apache HBase will have a notable profile at ApacheCon Europe... more
New Additions to the Apache HBase Team
Justin Kestelyn (@kestelyn)
October 21, 2012
Excerpt: StumbleUpon (SU) and Cloudera have signed a technology collaboration agreement. Cloudera will sup... more
How-to: Set Up an Apache Hadoop/Apache HBase Cluster on EC2 in (About) an Hour
Justin Kestelyn (@kestelyn)
October 21, 2012
Excerpt: Today we bring you one user's experience using Apache Whirr to spin up a CDH cluster in the c... more
October 21, 2012
Excerpt: Our video animation factory has been busy lately. The embedded player below contains our two late... more
Data Science: The New Heart of Healthcare
Josh Wills (@josh_wills)
October 21, 2012
Excerpt: We at Cloudera are tremendously excited by the power of data to effect large-scale change in the... more
What is Hadoop Metrics2?
Ahmed Radwan
October 21, 2012
Excerpt: Metrics are collections of information about Hadoop daemons, events and measurements; for example... more
MR2 and YARN Briefly Explained
Justin Kestelyn (@kestelyn)
October 21, 2012
Excerpt: With CDH4 onward, the Apache Hadoop component introduced two new terms for Hadoop users to wonder... more
Applying Parallel Prediction to Big Data
Justin Kestelyn (@kestelyn)
October 5, 2012
Excerpt: This guest post is provided by Dan McClary, Principal Product Manager for Big Data and H... more
Data Science: Hot or Not?
Justin Kestelyn (@kestelyn)
October 4, 2012
Excerpt: You may have noticed that Harvard Business Review is calling data science... more
CDH4.1 Now Released!
Charles Zedlewski
October 1, 2012
Excerpt: Update time! As a reminder, Cloudera releases major versions of CDH, our 100% open source distr... more
About Apache Flume FileChannel
Brock Noland
September 27, 2012
Excerpt: The post below was originally published via... more
September 25, 2012
Excerpt: With the default Apache HBase configuration, everyone is a... more
Apache ZooKeeper 3.4.4 Has Been Released!
Justin Kestelyn
September 24, 2012
Excerpt: Apache ZooKeeper release 3.4.4 is now... more
Schedule This! Strata + Hadoop World Speakers from Cloudera
Justin Kestelyn
September 24, 2012
Analyzing Twitter Data with Apache Hadoop
Jonathan Natkins (@nattybnatkins)
September 19, 2012
Excerpt: Social media has gained immense popularity with marketing teams, and Twitter is an effective tool... more
Exploring Compression for Hadoop: One DBA’s Story
Justin Kestelyn
September 14, 2012
Excerpt: This guest post comes to us courtesy of Gwen Shapira (@gwenshap), a database consultant for... more
Cloudera Enterprise in Less Than Two Minutes
Justin Kestelyn
September 11, 2012
Excerpt: What's to love about Cloudera Ent... more
September 10, 2012
Excerpt: API access was a new feature introduced in Cloudera Manager 4.0 (download free edition... more
September 5, 2012
Excerpt: Organizations in diverse industries have adopted Apache Hadoop-based systems for large-scale data... more
How-to: Develop CDH Applications with Maven and Eclipse
Jonathan Natkins
August 30, 2012
Excerpt: Learn how to configure a basic Maven project that will be able to build applications agai... more
Apache Hadoop on Your PC: Cloudera’s CDH4 Virtual Machine
Justin Kestelyn (@kestelyn)
August 27, 2012
Excerpt: Today ZDNet has very helpfully published a... more
Cloudera Manager 4.0.4 & Cloudera Manager 3.7.8 Released!
Justin Kestelyn
August 21, 2012
Excerpt: Cloudera Manager 4.0.4 and Cloudera Manager 3.7.8 are now available! These are enhancement releas... more
Process a Million Songs with Apache Pig
Justin Kestelyn
August 21, 2012
Excerpt: The following is a guest post kindly offered by Adam Kawa, a 26-year-old Hadoop developer fro... more
Cloudera Software Engineer Eli Collins on Apache Hadoop and CDH4
Justin Kestelyn (@kestelyn)
August 20, 2012
Excerpt: In June 2012, Eli Collins (@elicollins), from Cloudera's Platforms team, led a session at... more
Apache HBase Replication: Operational Overview
Himanshu Vashishtha
August 16, 2012
Excerpt: This is the second blogpost about Apache HBase replication. The... more
Developer Community Outreach from Cloudera: Better, Faster, More
Justin Kestelyn (@kestelyn)
August 15, 2012
Excerpt: Hello World: This is my first post as the new guy facilitating and coordinating developer communi... more
CDH3 update 5 is now available
Arvind Prabhakar
August 13, 2012
Excerpt: We are happy to announce the general availability of CDH3 update 5. This update is a maintenance... more
HttpFS for CDH3 – The Apache Hadoop FileSystem over HTTP
Alejandro Abdelnur
August 7, 2012
Excerpt: HttpFS is an HTTP gateway/proxy for Apache Hadoop FileSystem implementations. HttpFS comes with C... more
Column Statistics in Apache Hive
Shreepadma Venugopalan
August 3, 2012
Excerpt: Over the last couple of months the Hive team at Cloudera has been working hard to bring a bunch o... more
Apache ZooKeeper 3.3.6 has been released
Patrick Hunt
August 2, 2012
Excerpt: Apache ZooKeeper release 3.... more
Processing Rat Brain Neuronal Signals Using an Apache Hadoop Computing Cluster – Part III
Jon Zuanich
August 2, 2012
Excerpt: Up to this point, we’ve described our reasons for using Hadoop and Hi... more
Processing Rat Brain Neuronal Signals Using an Apache Hadoop Computing Cluster – Part II
Jon Zuanich
August 1, 2012
Excerpt: Background As mentioned in... more
July 31, 2012
Excerpt: Introduction In this three-part series of posts, we will share our experiences tackling... more
Apache HBase Replication Overview
Himanshu Vashishtha
July 30, 2012
Excerpt: Apache HBase Replication is a way of copying data from one HBase cluster to a different and possi... more
Why we build our platform on HDFS
Charles Zedlewski
July 25, 2012
Excerpt: It’s not often the case that I have a chance to concur with my colleague E14 over at Hortonwork... more
Cloudera Manager 4.0.3 Released!
Bala Venkatrao
July 19, 2012
Excerpt: We are pleased to announce the availability of Cloudera Manager 4.0.3. This is an enhancement rel... more
Apache HBase Log Splitting
Jimmy Xiang
July 12, 2012
Excerpt: In the recent blog post about the... more
July 11, 2012
Excerpt: At 5 pm PDT on June 30, a leap second was added to Coordinated Universal Time (UTC). Within a... more
July 9, 2012
Excerpt: This is a guest re-post from Datameer's Director of Marketing, Rich Taylor. The original post... more
Apache Flume Development Status Update
Hari Shreedharan
July 3, 2012
Excerpt: Apache Flume is a scalable, reliable, fault-tolerant, distributed system designed to collect, tra... more
Update on Apache Bigtop (incubating)
Charles Zedlewski
July 2, 2012
Excerpt: Introduction Ever since Cloudera decided to contribute the code and resources for what... more
Apache HBase I/O – HFile
Matteo Bertozzi
June 29, 2012
Excerpt: Introduction Apache HBase is the Hadoop open-source, distributed, versioned storage man... more
Apache Oozie (incubating) 3.2.0 release
Alejandro Abdelnur
June 29, 2012
Excerpt: This blog was originally posted on the... more
Apache Hadoop Beyond MapReduce, Part 1: Introducing Kitten
Josh Wills (@josh_wills)
June 26, 2012
Excerpt: This week, a team of researchers at Google will be presenting a paper describing a system they de... more
A Big Thank You to All Who Participated In Making HBaseCon and the HBase Hack-a-thon A Success
David S. Wang
June 19, 2012
Excerpt: HBaseCon 2012 summation provided by Michael Stack, PMC Chair of the Apache HBase Project. HBa... more
Apache HBase Write Path
Jimmy Xiang
June 18, 2012
Excerpt: Apache HBase is the Hadoop database, and is based on the Hadoop Distributed File... more
The Elephant in the Enterprise
Jon Zuanich
June 14, 2012
Excerpt: On Tuesday, June 12th The Churchill Club of Silicon Valley hosted a panel discussion on Hadoop's... more
June 11, 2012
Excerpt: Overview One of the major features of the upcoming Apache HBase 0.96 release is improve... more
CDH4 and Cloudera Enterprise 4.0 Now Available
Charles Zedlewski
June 5, 2012
Excerpt: I’m very pleased to... more
Online Apache HBase Backups with CopyTable
Jonathan Hsieh
June 4, 2012
Excerpt: CopyTable is a simple Apache HBase utility that, unsurprisingly, can be used for copying individu... more
Cloudera Manager 3.7.6 released!
Jon Zuanich
June 4, 2012
Excerpt: We are pleased to announce that Cloudera Manager 3.7.6 is now available! The most notable updates in... more
Apache HBase 0.94 is now released
Himanshu Vashishtha
May 16, 2012
Excerpt: Apache HBase 0.94.0 has been released! This is the first major release since the January 22nd HBa... more
Meet the Presenter: Todd Lipcon
Jon Zuanich
May 14, 2012
Excerpt: Today’s interview features Todd Lipcon, software engineer for Cloudera. Todd will be presenting... more
Cloudera Manager 4.0 Beta released
Aparna Ramani
May 14, 2012
Excerpt: We're happy to announce the Beta release of Cloudera Manager 4.0. This version of Clo... more
CDH3 update 4 is now available
David S. Wang
May 9, 2012
Excerpt: We are happy to officially announce the general availability of CDH3 update 4. This update consis... more
Announcing Apache Hive 0.9.0
Carl Steinbach
May 4, 2012
Excerpt: This past Monday marked the official release of Apache Hive 0.9.0. Users interested in taking t... more
May 3, 2012
Excerpt: This is a guest post by Assaf Yardeni, Head of R&D for Treato, an online social healthcar... more
Apache MRUnit 0.9.0-incubating has been released!
Brock Noland
May 1, 2012
Excerpt: This post was originally posted on the... more
April 25, 2012
Excerpt: HBaseCon 2012 is only a month away! The conference takes p... more
Introducing CDH4 Beta 2
Charles Zedlewski
April 24, 2012
Excerpt: I'm pleased to inform our users and customers that we have released Cloudera's Distribution I... more
Constructing Case-Control Studies with Apache Hadoop
Josh Wills (@josh_wills)
April 11, 2012
Excerpt: San Francisco seems to be having an unusually high number of... more
Sqoop Graduation Meetup
Kathleen Ting
April 10, 2012
Excerpt: This blog was originally posted on the Apache Blog:... more
Apache HBase Hackathon at Cloudera
David S. Wang
April 6, 2012
Excerpt: Cloudera will be hosting an Apache HBase... more
Apache Bigtop 0.3.0 (incubating) has been released
Roman Shaposhnik
April 3, 2012
Excerpt: Apache Bigtop 0.3.0 (incubating) is now available. This is the first fully integrated, community-... more
Apache Sqoop Graduates from Incubator
Arvind Prabhakar
April 2, 2012
Excerpt: This blog was originally posted on the Apache Blog: ... more
Apache Hadoop Versions: Looking Ahead
Aaron Myers
April 1, 2012
Excerpt: Introduction A few months ago, my colleague Charles Zedlewski wrote a... more
March 2012 Bay Area HBase User Group meetup summary
David S. Wang
March 30, 2012
Excerpt: The... more
Apache HBase 0.92.1 now available
Shaneal Manek
March 23, 2012
Excerpt: What's new? Apache HBase 0.92.1 is now available... more
Apache ZooKeeper 3.3.5 has been released
Patrick Hunt
March 21, 2012
Excerpt: Apache ZooKeeper release 3.... more
Authorization and Authentication In Hadoop
Jonathan Natkins
March 20, 2012
Excerpt: One of the more confusing topics in Hadoop is how authorization and authentication work in the sy... more
Apache HBase 0.90.6 is now available
Jimmy Xiang
March 19, 2012
Excerpt: Apache HBase 0.90.6 is now available. It is a bug fix rele... more
Apache HBase + Apache Hadoop + Xceivers
Lars George
March 14, 2012
Excerpt: Introduction Some of the configuration properties found in Apache Hadoop have a direct... more
Real-Time Your Hadoop! Join us at HBaseCon 2012
Gretchen Malay
March 8, 2012
Excerpt: We’re excited to host the first ever HB... more
March 7, 2012
Excerpt: Background Apache Hadoop consists of two primary components: H... more
March 5, 2012
Excerpt: Cloudera and Cisco jointly announced a reference architecture for running Cloudera's Distribution... more
Indexing Files via Solr and Java MapReduce
Adam Smieszny
March 2, 2012
Excerpt: Several weeks ago, I set about to demonstrate the ease with which... more
Apache ZooKeeper 3.4.3 has been released
Patrick Hunt
February 14, 2012
Excerpt: Apache ZooKeeper release 3.4.3 is now available. This is a bug fix release covering 18 issues, one of whi... more
February 14, 2012
Excerpt: Service and Configuration Management (Part I & II) We’ve recently recorded a series of demo videos int... more
Introducing CDH4
Charles Zedlewski
February 13, 2012
Excerpt: I’m pleased to inform our users and customers that Cloudera has released its 4th version of Cloudera’s... more
Cloudera Connector for Tableau Has Been Released
Basier Aziz
February 7, 2012
Excerpt: Earlier today, Cloudera proudly released the Cloudera Connector for Tableau. The availability of this connect... more
CDH3, update 3 now available
Charles Zedlewski
January 30, 2012
Excerpt: Keeping with our release policy for Cloudera’s Distribution Including Apache Hadoop (CDH) I’m plea... more
January 25, 2012
Excerpt: More than 150 people attended the San Francisco Bay Area HBase User Group meetup last Thursday, January 19th,... more
January 25, 2012
Excerpt: When most people first hear about data science, it’s usually in the context of how prominent web compani... more
Apache HBase 0.92.0 has been released
Jonathan Hsieh
January 24, 2012
Excerpt: Today the Apache HBase community has proudly released Apache HBase 0.92.0, a major new version of the scalable... more
Hadoop World 2011 Videos and Slides Available
Jon Zuanich
January 18, 2012
Excerpt: Last November in New York City, Hadoop World, the largest conference of Apache Hadoop practitioners, developer... more
Apache Sqoop: Highlights of Sqoop 2
Kathleen Ting
January 13, 2012
Excerpt: This blog was originally posted on the Apache Blog: https://blogs.apache.org/sqoop/entry/apache_sqoop_highlig... more
Capacity Planning with Cloudera Manager
Jon Natkins
January 12, 2012
Excerpt: If you’re like a myriad of other systems administrators out there, you may be running a production Hadoo... more
Cloudera Manager – Thank You Customers!
Bala Venkatrao
January 11, 2012
Excerpt: Bala Venkatrao is the Director of Product Management at Cloudera. As many of you know, we recently launc... more
Oracle selects CDH and Cloudera Manager as the Apache Hadoop Platform for the Oracle Big Data Appliance
Ed Albanese
January 10, 2012
Excerpt: Cloudera users gain more choice, tighter Oracle integration. Cloudera partners gain increased validation of th... more
January 9, 2012
Excerpt: Great news! The InfoWorld Tech Center has chosen Apache Hadoop for a 2012 Technology of the Year Award. Judg... more
Hadoop in 2011
Rob Weltman
January 9, 2012
Excerpt: 2011 was a breakthrough year for Apache Hadoop as many more mainstream organizations large and small turned to... more
Apache Hadoop in 2011
Rob Weltman
January 9, 2012
Excerpt: 2011 was a breakthrough year for Apache Hadoop as many more mainstream organizations large and sm... more
An update on Apache Hadoop 1.0
Charles Zedlewski
January 8, 2012
Excerpt: Some users & customers have asked about the most recent release of Apache Hadoop, v1.0: what's in it,... more
January 6, 2012
Excerpt: This was my summer internship project at Cloudera, and I’m very thankful for the level of support and me... more
Cloudera Connector for Teradata 1.0.0
Bilung Lee
January 5, 2012
Excerpt: Apache Sqoop (incubating) provides an efficient approach for transferring big data between Hadoop related sys... more
Hadoop for Archiving Email – Part 2
Sunil Sitaula
January 3, 2012
Excerpt: Part 1 of this post covered how to convert and store email messages for archival purposes using Apache Hadoop... more
Apache Hadoop for Archiving Email – Part 2
Sunil Sitaula
January 3, 2012
Excerpt: Part... more
What’s New in Apache Sqoop 1.4.0-incubating
Bilung Lee
January 2, 2012
Excerpt: This blog was originally posted on the Apache Blog. Apache Sqoop recently celebrates its first incubator... more
Apache ZooKeeper 3.4.2 has been released
Patrick Hunt
December 30, 2011
Excerpt: Apache ZooKeeper release 3.4.2 is now available. This is a bug fix release covering 2 issues, one of w... more
Apache HBase 0.90.5 is now available
Jonathan Hsieh
December 28, 2011
Excerpt: Apache HBase 0.90.5 is now available. This release of the scalable distributed data store ins... more
How I found Hadoop
Omer
December 28, 2011
Excerpt: This is a guest post contributed by Loren Siebert. Loren is a San Francisco entrepreneur and software develope... more
How I found Apache Hadoop
Omer Trajman
December 28, 2011
Excerpt: This is a guest post contributed by Loren Siebert. Loren is a San Francisco entrepreneur and... more
Apache Whirr 0.7.0 has been released
Patrick Hunt
December 27, 2011
Excerpt: Apache Whirr release 0.7.0 is now available. It includes changes covering over 50 issues, four... more
Apache Avro at RichRelevance
Jon Zuanich
December 22, 2011
Excerpt: This is a guest post from RichRelevance Principal Architect and Apache Avro PMC Chair Scott Carey. In Early... more
December 21, 2011
Excerpt: This blog was originally posted on the Apache Blog: https://blogs.apache.org/flume/entry/apache_flume_hackat... more
Notes from the Flume NG Hackathon
Basier Aziz
December 21, 2011
Excerpt: This blog was originally posted on the Apache Blog:... more
My Internship at Cloudera
Jon Zuanich
December 20, 2011
Excerpt: David joined us as part of our intern program, and built the prototype for the distributed log search functi... more
Apache ZooKeeper 3.4.1 has been released
Patrick Hunt
December 19, 2011
Excerpt: Apache ZooKeeper release 3.4.1 is now available: this is a fix release covering 7 issues, 2 of which w... more
Cloudera Manager 3.7 released
Aparna Ramani
December 13, 2011
Excerpt: Aparna Ramani is the Director of Engineering for Cloudera Enterprise. Cloudera Manager 3.7, a major new ver... more
Apache Flume – Architecture of Flume NG
Arvind Prabhakar
December 9, 2011
Excerpt: This blog was originally posted on the Apache Blog: https://blogs.apache.org/flume/entry/flume_ng_architectur... more
Crunch for Dummies
Brock Noland
December 9, 2011
Excerpt: This guide is intended to be an introduction to Crunch. Introduction Crunch is used for processing data. C... more
FoneDoktor, A WibiData Application
Jon Zuanich
December 6, 2011
Excerpt: This guest blog post is from Alex Loddengaard, creator of FoneDoktor, an Android app that monitors phone u... more
Apache HBase Pow-wow Summary 11/29/2011
Jonathan Hsieh
December 2, 2011
Excerpt: San Francisco, Salesforce.com HQ - Recently there was an Apache HBase Pow-wow where project contributors gath... more
Recommendation with Apache Mahout in CDH3
Josh Patterson
November 30, 2011
Excerpt: The amount of information we are exposed to on a daily basis is far outstripping our ability to consume it, le... more
Apache ZooKeeper 3.3.4 has been released
Patrick Hunt
November 29, 2011
Excerpt: Apache ZooKeeper release 3.3.4 is now available: this is a fix release covering 22 issues, 9 of w... more
Apache ZooKeeper 3.4.0 has been released
Patrick Hunt
November 23, 2011
Excerpt: Apache ZooKeeper release 3.4.0 is now available: it includes changes covering over 150 issues, 27 of... more
Inaugural Sqoop Meetup
Kate Ting
November 23, 2011
Excerpt: This blog was originally posted on the Apache Blog: https://blogs.apache.org/sqoop/entry/inaugural_sqoo... more
Coming Attractions: Apache Hive 0.8.0
Carl Steinbach
November 17, 2011
Excerpt: The Apache Hive team is hard at work putting the finishing touches on the 0.8.0 release. While the release h... more
November 16, 2011
Excerpt: Last month at the Web 2.0 Summit in San Francisco, Cloudera CEO Mike Olson presented some work the Clou... more
Hadoop World 2011 Final Remarks
Mike Olson
November 16, 2011
Excerpt: The third annual Hadoop World conference has come and gone. The two days of conference keynotes and ses... more
Building and Deploying MR2
Jon Zuanich
November 16, 2011
Excerpt: A number of architectural changes have been added to Hadoop MapReduce. The new MapReduce system is called MR2... more
Apache Hadoop 0.23.0 has been released
Tom White
November 15, 2011
Excerpt: The Apache Hadoop PMC has voted to release Apache Hadoop 0.23.0. This release is significant since it is the... more
CDH3u2: Apache Mahout Integration
Linden Hillenbrand
November 3, 2011
Excerpt: Cloudera believes that the flexibility and power of Apache Mahout (http://mahout.apache.org/) in conjunct... more
October 27, 2011
Excerpt: Several meetups for Apache Hadoop and Hadoop-related projects are scheduled for the evenings surrounding Ha... more
CDH3 update 2 is released
Charles Zedlewski
October 21, 2011
Excerpt: Continuing with our practice from Cloudera's Distribution Including Apache Hadoop v2 (CDH2), our goal is... more
Hadoop World 2011: A Glimpse into Operations
Jon Zuanich
October 19, 2011
Excerpt: Check out the Hadoop World 2011 conference agenda! Find sessions of interest and begin planning your Hadoop... more
October 13, 2011
Excerpt: This post was contributed by Bob Gourley, editor, CTOvision.com. The missions and data of gover... more
Hadoop World 2011: A Glimpse into Development
Jon Zuanich
October 12, 2011
Excerpt: The Development track at Hadoop World is a technical deep dive dedicated to discussion about Apache Hadoop and... more
October 10, 2011
Excerpt: As a data scientist at Cloudera, I work with customers across a wide range of industries that use Hadoop to so... more
Introducing Crunch: Easy MapReduce Pipelines for Apache Hadoop
Josh Wills (@josh_wills)
October 10, 2011
Excerpt: As a data scientist at Cloudera, I work with customers across a wide range of industries that use... more
Apache Sqoop – Overview
Arvind Prabhakar
October 6, 2011
Excerpt: This post provides a high-level overview of Apache Sqoop (incubating). It discusses the general problem addres... more
October 4, 2011
Excerpt: The Enterprise Architecture track at Hadoop World 2011 will provide insight into how Hadoop is powering tod... more
The Community Effect
Mike Olson
October 3, 2011
Excerpt: Owen O’Malley recently collected and analyzed information in the Apache Hadoop project commit logs and... more
October 3, 2011
Excerpt: This post was written by Daniel Jackoway following his internship at Cloudera during the summer of 2011. Wh... more
September 29, 2011
Excerpt: Business Solutions is a Hadoop World 2011 track geared towards business strategists and decision makers. Sess... more
Hadoop for Archiving Email
Jon Zuanich
September 28, 2011
Excerpt: This post will explore a specific use case for Apache Hadoop, one that is not commonly recognized, but is gain... more
Apache Hadoop for Archiving Email
Jon Zuanich
September 28, 2011
Excerpt: This post will explore a specific use case for Apache Hadoop, one that is not commonly recognized... more
September 27, 2011
Excerpt: The Hadoop World train is approaching the station! Remember to mark November 8th and 9th in your calendars... more
Hadoop Applied
Omer
September 20, 2011
Excerpt: BusinessWeek recently published a fascinating article on Hadoop and Big Data, interviewing several Cloudera... more
Apache Hadoop Applied
Omer Trajman
September 20, 2011
Excerpt: BusinessWeek recently published a fascinating... more
September 20, 2011
Excerpt: Unstructured data is the fastest growing type of data generated today. The growth rate of text, documents, ima... more
Snappy and Hadoop
Tom White
September 15, 2011
Excerpt: Snappy is a compression library developed at Google, and, like many technologies that come from Google, Snappy... more
September 13, 2011
Excerpt: Make the most of your week in New York City by combining the Hadoop World 2011 conference with training class... more
September 7, 2011
Excerpt: Attendees of Hadoop World will receive a free copy of either Hadoop, The Definitive Guide (2nd edition)... more
Top 10 Reasons to Attend Hadoop World 2011
Jon Zuanich
August 30, 2011
Excerpt: The 3rd annual Hadoop World conference takes place on November 8th and 9th in New York City. Cloudera invites... more
August 10, 2011
Excerpt: Ari Rabkin is a summer intern at Cloudera, working with the engineering team to help make Hadoop more usable a... more
CDH3 Update 1 Released
Charles Zedlewski
July 22, 2011
Excerpt: Announcing an update to CDH3.... more
Hoop – Hadoop HDFS over HTTP
Alejandro Abdelnur
July 20, 2011
Excerpt: What is Hoop? Hoop provides access to all Hadoop Distributed File System (HDFS) operations (read and write)... more
July 13, 2011
Excerpt: This post was contributed by Michael Cafarella, an assistant professor of computer science at the University o... more
July 12, 2011
Excerpt: Pero works on research and development in new technologies for online advertising at Aol Advertising R&D... more
July 11, 2011
Excerpt: Philip Zeyliger is a software engineer at Cloudera and started the SCM project. Two weeks ago, at Hadoop S... more
Data Interoperability with Apache Avro
Doug Cutting
July 5, 2011
Excerpt: The ecosystem around Apache Hadoop has grown at a tremendous rate. Folks now can use many different pieces of... more
July 5, 2011
Excerpt: Phil Langdale is a software engineer at Cloudera and the technical lead for Cloudera's SCM Express produc... more
The Only Full Lifecycle Management for Apache Hadoop: Introducing Cloudera Enterprise 3.5 and SCM Express
Jon Zuanich
July 5, 2011
Excerpt: Drew O'Brien is a product marketing manager at Cloudera. We're excited to share the news about the... more
June 28, 2011
Excerpt: This is a guest repost from Shopzilla’s Tech Blog written by Andrew Look, a Software Engineer at Shop... more
June 24, 2011
Excerpt: Ed Albanese leads business development for Cloudera. He is responsible for identifying new markets, revenue op... more
Reflections from Enzee Universe 2011
Jon Zuanich
June 24, 2011
Excerpt: Bala Venkatrao is the director of product management at Cloudera. I had the pleasure of attending Enzee U... more
Migrating from Elastic MapReduce to a Cloudera’s Distribution including Apache Hadoop Cluster
Jon Zuanich
June 22, 2011
Excerpt: This post was contributed by Jennie Cochran-Chinn and Joe Crobak. They are part of the team building out Adco... more
June 21, 2011
Excerpt: This post was contributed by The Global Biodiversity Information Facility development team. The Global Bio... more
June 2, 2011
Excerpt: The first task is to ensure that your system is up-to-date. This procedure has been tested on the following... more
May 25, 2011
Excerpt: Take advantage of the opportunity to become a Cloudera Certified Developer or Administrator for Apache Hadoop... more
Using Hadoop to Measure Influence
Jon Zuanich
May 15, 2011
Excerpt: Background Klout’s goal is to be the standard for influence. The advent of social media has created... more
Using Apache Hadoop to Measure Influence
Jon Zuanich
May 15, 2011
Excerpt: Background Klout's goal is to be the... more
May 13, 2011
Excerpt: This is a guest repost from the DataXu blog. Click here to view the original post. I recently evaluated... more
May 11, 2011
Excerpt: Cloudera is offering several training courses for Apache Hadoop over the dates surrounding Hadoop Summit. Th... more
An Attendee Perspective On Chicago Data Summit
Jon Zuanich
April 28, 2011
Excerpt: This is a guest post from Mike Segel, an attendee of Chicago Data Summit. Earlier this week, Cloudera hoste... more
Solve this Brain Buster for a chance to win a Doug Cutting Bobblehead at the Chicago Data Summit
Gretchen Malay
April 25, 2011
Excerpt: Do you know the answer? Many prominent projects (e.g. Hive, Pig) were sub-projects of Hadoop before becomi... more
April 20, 2011
Excerpt: I recently gave a talk at the LA Hadoop User Group about HBase Do's and Don'ts. The audience was... more
Apache HBase Do’s and Don’ts
Omer Trajman
April 20, 2011
Excerpt: I recently gave a talk at the LA Hadoop U... more
CDH3 goes GA
Mike Olson
April 12, 2011
Excerpt: I am very pleased to announce the general availability of Cloudera’s Distribution including Apache Hadoo... more
April 11, 2011
Excerpt: Simple Moving Average, Secondary Sort, and MapReduce (Part 3) by Josh Patterson... more
Adopting Apache Hadoop in the Federal Government
Jon Zuanich
April 5, 2011
Excerpt: Adopting Apache Hadoop in the Federal Government by Jon Zuanich April 05... more
MapIncrease
ibmwatson
April 1, 2011
Excerpt: Puny humans. SSL and Wordpress authorization will keep me out of your blog question mark. I do not think so.... more
March 30, 2011
Excerpt: London Apache Hadoop User Group Meeting Summarized by Jon Zuanich March... more
March 29, 2011
Excerpt: If you find yourself in the Chicago area later this month, please join us at the Chicago Data Summit on Apri... more
We messed up.
Mike Olson
March 25, 2011
Excerpt: We messed up. by Mike Olson March 25, 2011 no comments... more
March 23, 2011
Excerpt: Rapleaf Uses Hadoop to Efficiently Scale with Terabytes of Data by Jon Zuanich... more
March 16, 2011
Excerpt: Simple Moving Average, Secondary Sort, and MapReduce (Part 2) by Josh Patterson... more
March 14, 2011
Excerpt: Simple Moving Average, Secondary Sort, and MapReduce (Part 1) by Josh Patterson... more
March 7, 2011
Excerpt: This is the third and final post in a series detailing a recent improvement in Apache HBase that helps to redu... more
Avoiding Full GCs in Apache HBase with MemStore-Local Allocation Buffers: Part 3
Todd Lipcon (@tlipcon)
March 7, 2011
Excerpt: This is the third and final post in a series detailing a recent improvement in Apache HBase that... more
Flume Community Office Hours @ Cloudera HQ, 2/28/2011
Jonathan Hsieh
March 1, 2011
Excerpt: Flume Community Office Hours @ Cloudera HQ, 2/28/2011 by Jonathan Hsieh... more
February 28, 2011
Excerpt: This is the second post in a series detailing a recent improvement in Apache HBase that helps to reduce the fr... more
Supported Operating Systems in CDH3
Eli Collins
February 25, 2011
Excerpt: Supported Operating Systems in CDH3 by Eli Collins February 25, 2011... more
Supported Operating Systems in CDH3
Eli Collins
February 25, 2011
Excerpt: While Cloudera's Distribution including Apache Hadoop (CDH) operating system support is... more
February 25, 2011
Excerpt: Gratuitous Hadoop: Stress Testing on the Cheap with Hadoop Streaming and EC2 by Jo... more
February 24, 2011
Excerpt: Today, rather than discussing new projects or use cases built on top of CDH, I'd like to switch gears a bit an... more
Avoiding Full GCs in Apache HBase with MemStore-Local Allocation Buffers: Part 1
Todd Lipcon (@tlipcon)
February 24, 2011
Excerpt: Today, rather than discussing new projects or use cases built on top of CDH, I'd like to switch g... more
CDH3 Beta 4 Now Available
Todd Lipcon
February 22, 2011
Excerpt: CDH3 Beta 4 Now Available by Todd Lipcon February 22, 2011 1 c... more
Log Event Processing with HBase
Jon Zuanich
February 17, 2011
Excerpt: Log Event Processing with HBase by Jon Zuanich February 17, 2011... more
Log Event Processing with Apache HBase
Jon Zuanich
February 17, 2011
Excerpt: This post was authored by Dmitry Chechik, a software engineer at TellApart, the leading Custo... more
February 16, 2011
Excerpt: An emerging data management architectural pattern behind interactive web applications... more
February 16, 2011
Excerpt: The user-data connection is driving NoSQL database-Hadoop pairing... more
February 15, 2011
Excerpt: Strategies for Exploiting Large-scale Data in the Federal Government by Jon Zuanic... more
February 14, 2011
Excerpt: Cloudera in The Cube with Silicon Angle TV at Strata Conference 2011 by Jon Zuanic... more
February 11, 2011
Excerpt: Wordnik Bypasses Processing Bottleneck with Hadoop by Jon Zuanich Februa... more
February 11, 2011
Excerpt: This post is courtesy of Kumanan Rajamanikkam, Lead Engineer at... more
Hadoop Availability
Eli Collins
February 10, 2011
Excerpt: Hadoop Availability by Eli Collins February 10, 2011 1 comment... more
Apache Hadoop Availability
Eli Collins
February 10, 2011
Excerpt: A common question on the Apache Hadoop mail... more
Distributed Flume Setup With an S3 Sink
Jonathan Hsieh
February 7, 2011
Excerpt: Distributed Flume Setup With an S3 Sink by Jonathan Hsieh February 07, 2... more
Make your Hadoop voice heard!
Jon Zuanich
February 3, 2011
Excerpt: Make your Hadoop voice heard! by Jon Zuanich February 03, 2011... more
Make your Apache Hadoop voice heard!
Jon Zuanich
February 3, 2011
Excerpt: Apache Hadoop is increasingly being adopted for storage and processing of large-scale complex dat... more
Upcoming Apache Hadoop Training Sessions
Jon Zuanich
February 2, 2011
Excerpt: Upcoming Apache Hadoop Training Sessions by Jon Zuanich February 02, 201... more
Some News Related to the Apache Hadoop Project
Charles Zedlewski
February 2, 2011
Excerpt: Some News Related to the Apache Hadoop Project by Charles Zedlewski Febr... more
CDH2 Update 3 Now Available
Eli Collins
January 28, 2011
Excerpt: CDH2 Update 3 Now Available by Eli Collins January 28, 2011 1... more
January 26, 2011
Excerpt: Lessons Learned from Cloudera’s Hadoop Developer Training Course by Jon Zuan... more
Introducing Alfredo, Kerberos HTTP SPNEGO for Java
Alejandro Abdelnur
January 21, 2011
Excerpt: Introducing Alfredo, Kerberos HTTP SPNEGO for Java by Alejandro Abdelnur... more
Introducing Alfredo, Kerberos HTTP SPNEGO for Java
Alejandro Abdelnur
January 21, 2011
Excerpt: What is Kerberos & SPNEGO?... more
Top 10 Blog Posts of 2010
Jon Zuanich
January 19, 2011
Excerpt: We blogged about 104 different topics in 2010 and we recently decided to take a look back and see what folks w... more
January 17, 2011
Excerpt: Hadoop I/O: Sequence, Map, Set, Array, BloomMap Files by Jon Zuanich Jan... more
January 11, 2011
Excerpt: How to Include Third-Party Libraries in Your Map-Reduce Job by Alex Kozlov... more
January 11, 2011
Excerpt: "My library is in the classpath but I still get a Class Not Found exception in a MapReduce job" -... more
Setting up CDH3 Hadoop on my new Macbook Pro
Jon Zuanich
January 10, 2011
Excerpt: Setting up CDH3 Hadoop on my new Macbook Pro by Jon Zuanich January 10,... more
Configuring Security Features in CDH3
Jon Zuanich
January 7, 2011
Excerpt: Post written by Cloudera Software Engineer Aaron T. Myers. Apache Hadoop has had methods of doing user aut... more
Configuring Security Features in CDH3
Jon Zuanich (@jonzuanich)
January 7, 2011
Excerpt: Post written by Cloudera Software Engineer Aaron T. Myers. Apac... more
2010 Cloudera Apache Hadoop Webinars
Jon Zuanich
January 6, 2011
Excerpt: 2010 Cloudera Apache Hadoop Webinars by Jon Zuanich January 06, 2011... more
Map-Reduce With Ruby Using Apache Hadoop
Jon Zuanich
January 5, 2011
Excerpt: Map-Reduce With Ruby Using Apache Hadoop by Jon Zuanich January 05, 2011... more
New Features in Apache Pig 0.8
John Kreisa
December 21, 2010
Excerpt: New Features in Apache Pig 0.8 by John Kreisa December 21, 2010... more
December 15, 2010
Excerpt: A profile of Apache Hadoop MapReduce computing efficiency (continued) by Jon Zuani... more
December 14, 2010
Excerpt: A profile of Apache Hadoop MapReduce computing efficiency by Jon Zuanich... more
December 7, 2010
Excerpt: Cloudera and Pentaho team up to simplify data management and business intelligence... more
Lessons learned putting Hadoop into production
Jon Zuanich
December 6, 2010
Excerpt: Lessons learned putting Hadoop into production by Jon Zuanich December 0... more
Hadoop World 2010 Tweet Analysis
Jon Zuanich
December 2, 2010
Excerpt: Hadoop World 2010 Tweet Analysis by Jon Zuanich December 02, 2010... more
Hadoop Log Location and Retention
Lars George
November 29, 2010
Excerpt: Hadoop Log Location and Retention by Lars George November 29, 2010... more
Hadoop training coming to new cities in 2011
Jon Zuanich
November 24, 2010
Excerpt: Hadoop training coming to new cities in 2011 by Jon Zuanich November 24,... more
November 24, 2010
Excerpt: Due to expanding interest and demand for Apache Hadoop knowledge and skills across the mid-west a... more
November 18, 2010
Excerpt: Do the Schimmy: Efficient Large-Scale Graph Analysis with Hadoop, Part 2 by Jon Zu... more
Hadoop and HBase at RIPE NCC
Todd Lipcon
November 17, 2010
Excerpt: Hadoop and HBase at RIPE NCC by Todd Lipcon November 17, 2010... more
November 15, 2010
Excerpt: Do the Schimmy: Efficient Large-Scale Graph Analysis with Hadoop by Jon Zuanich... more
Integrating Hadoop in your Existing DW and BI Environment
Gretchen Malay
November 8, 2010
Excerpt: Integrating Hadoop in your Existing DW and BI Environment by Gretchen Malay... more
November 8, 2010
Excerpt: Organizations are looking for a cost-effective way to deal with data that are now arriving in an... more
Better Workflow Management in CDH with Oozie 2
Alejandro Abdelnur
November 4, 2010
Excerpt: Better Workflow Management in CDH with Oozie 2 by Alejandro Abdelnur Nov... more
Tackling Large Scale Data in Government
Jon Zuanich
November 2, 2010
Excerpt: Tackling Large Scale Data in Government by Jon Zuanich November 02, 2010... more
Cloudera Fun & Frightful Halloween Festivities
Jon Zuanich
November 1, 2010
Excerpt: Cloudera Fun & Frightful Halloween Festivities by Jon Zuanich Novem... more
Hadoop Lab at JavaOne
Jon Zuanich
October 26, 2010
Excerpt: Hadoop Lab at JavaOne by Jon Zuanich October 26, 2010 no comme... more
Apache Hadoop Lab at JavaOne
Jon Zuanich
October 26, 2010
Excerpt: Guest post by Daniel Templeton, Product Manager at Oracl... more
Hadoop World 2010: An Unqualified Success
Jon Zuanich
October 16, 2010
Excerpt: Hadoop World 2010: An Unqualified Success by Jon Zuanich October 16, 201... more
CDH3 beta 3 now available
Todd Lipcon
October 12, 2010
Excerpt: CDH3 beta 3 now available by Todd Lipcon October 12, 2010 no c... more
October 11, 2010
Excerpt: Hadoop: The Definitive Guide, Second Edition by Tom White October 11, 20... more
October 8, 2010
Excerpt: Afternoon Hadoop World — Possible Path Through Great Content by Jon Zuanich... more
One Possible Hadoop World Morning Path
Jon Zuanich
October 6, 2010
Excerpt: One Possible Hadoop World Morning Path by Jon Zuanich October 06, 2010... more
Hadoop World: More is better!
Gretchen Malay
September 30, 2010
Excerpt: Hadoop World: More is better! by Gretchen Malay September 30, 2010... more
Top 10 Reasons to Attend Hadoop World
Jon Zuanich
September 27, 2010
Excerpt: Top 10 Reasons to Attend Hadoop World by Jon Zuanich September 27, 2010... more
September 23, 2010
Excerpt: Twitter Analytics Lead, Kevin Weil, and a Presenter at Hadoop World Interviewed by... more
More on Cloudera Enterprise
Charles Zedlewski
September 22, 2010
Excerpt: More on Cloudera Enterprise by Charles Zedlewski September 22, 2010... more
What’s Going On Surrounding Hadoop World
Jon Zuanich
September 21, 2010
Excerpt: What's Going On Surrounding Hadoop World by Jon Zuanich September 2... more
What is in our Kitchen?
Chad Metcalf
September 20, 2010
Excerpt: What is in our Kitchen? by Chad Metcalf September 20, 2010 no... more
Using Flume to Collect Apache 2 Web Server Logs
Jonathan Hsieh
September 17, 2010
Excerpt: Flume is a flexible, scalable, and reliable system for collecting streaming data. The Flume User... more
HUE SDK Training – NYC
Jon Zuanich
September 16, 2010
Excerpt: HUE SDK Training – NYC by Jon Zuanich September 16, 2010... more
CDH2 Update 2 Now Available
Eli Collins
September 14, 2010
Excerpt: CDH2 Update 2 Now Available by Eli Collins September 14, 2010... more
Hadoop World Presentation Track Release
Jon Zuanich
September 14, 2010
Excerpt: Hadoop World Presentation Track Release by Jon Zuanich September 14, 201... more
A Summer Internship with Cloudera
Jon Zuanich
September 10, 2010
Excerpt: A Summer Internship with Cloudera by Jon Zuanich September 10, 2010... more
September 9, 2010
Excerpt: New York Training Session for Managers Interested In Hadoop by Jon Zuanich... more
September 8, 2010
Excerpt: Flume community update: September 2010 by jon September 08, 2010... more
September 7, 2010
Excerpt: Purdue University's Saptarshi Guha Interviewed Regarding Hadoop, R and Hadoop World... more
A Look Back at August Posts
Jon Zuanich
September 6, 2010
Excerpt: A Look Back at August Posts by Jon Zuanich September 06, 2010... more
Tracing with Avro
Jon Zuanich
September 3, 2010
Excerpt: Tracing with Avro by Jon Zuanich September 03, 2010 no comment... more
Tracing with Apache Avro
Jon Zuanich
September 3, 2010
Excerpt: Written by Patrick Wendell, an amazing summer intern with Cloudera and an Avro Commit... more
September 2, 2010
Excerpt: Infochimp’s President, Philip Kromer, Interviewed Regarding Hadoop and Hadoop World... more
September 1, 2010
Excerpt: Register for Hadoop Training in New York and Get into Hadoop World for Free! by Jo... more
Hadoop World 2010: Speaker Highlights
Jon Zuanich
August 30, 2010
Excerpt: Hadoop World 2010: Speaker Highlights by Jon Zuanich August 30, 2010... more
What’s New in Apache Hadoop 0.21
Tom White
August 26, 2010
Excerpt: What's New in Apache Hadoop 0.21 by Tom White August 26, 2010... more
Using Hadoop for Fraud Detection and Prevention
Alex Kozlov
August 24, 2010
Excerpt: Learn about fraud and how to prevent it with Hadoop... more
August 24, 2010
Excerpt: Fraud has multiple meanings and the term can be easily abused. The definition of fraud has unde... more
Hadoop Administrator Training Comes to London
Jon Zuanich
August 24, 2010
Excerpt: Hadoop Administrator Training Comes to London by Jon Zuanich August 24,... more
Hadoop Administrator Training Comes to London
Jon Zuanich
August 24, 2010
Excerpt: Cloudera’s... more
August 23, 2010
Excerpt: Improving Hotel Search: Hadoop @ Orbitz Worldwide by John Kreisa August... more
Hadoop World: NYC – Training
Jon Zuanich
August 19, 2010
Excerpt: Hadoop Training surrounding Hadoop World: NYC.... more
Hadoop/HBase Capacity Planning
Alex Kozlov
August 17, 2010
Excerpt: Hadoop/HBase Capacity Planning by Alex Kozlov August 17, 2010... more
Hadoop/HBase Capacity Planning
Alex Kozlov
August 17, 2010
Excerpt: Apache Hadoop and Apache HBase are gaining popularity due to their flexibility and tremendous wor... more
August 12, 2010
Excerpt: It’s easy to get started with Hadoop administration because Linux system administration is a pretty well... more
CDH3b2 Release Recap
Jeff Hammerbacher
August 11, 2010
Excerpt: CDH3b2 Release Recap by Jeff Hammerbacher August 11, 2010 no comments... more
August 10, 2010
Excerpt: Cloudera’s Henry Robinson to speak at Hadoop Day in Seattle by Huw Edwards... more
Hadoop World: early-bird rate ends on August 11
Huw Edwards
August 9, 2010
Excerpt: Hadoop World: early-bird rate ends on August 11 by Huw Edwards August 09... more
August 3, 2010
Excerpt: Flume community update – the first 30 days! by phunt August 03, 2010 no c... more
Migrating to CDH
Eric Sammer
August 2, 2010
Excerpt: With the recent release of CDH3b2, many users are more interested than ever to try out Cloudera’s Dist... more
How to Get a Job at Cloudera
Mike Olson
July 28, 2010
Excerpt: How to Get a Job at Cloudera by Mike Olson July 28, 2010 no comments... more
Notes From the Hackathon at Cloudera
Jeff Bean
July 28, 2010
Excerpt: Notes From the Hackathon at Cloudera by Jeff Bean July 28, 2010 no comments... more
Notes From the Hackathon at Cloudera
Jeff Bean
July 28, 2010
Excerpt: I was positively blown away by the enthusiasm, creativity, and productivity exhibited by the part... more
Upcoming webinar: 10 Common Hadoop-able Problems
Huw Edwards
July 28, 2010
Excerpt: Upcoming webinar: 10 Common Hadoop-able Problems by Huw Edwards July 28, 2010 n... more
July 28, 2010
Excerpt: Announcing Two New Training Classes from Cloudera: Introduction to HBase and Analyzing Data with Hive and Pig... more
What’s New in CDH3b2: Hive
Carl Steinbach
July 22, 2010
Excerpt: What’s New in CDH3b2: Hive by Carl Steinbach July 22, 2010 no comments... more
What’s New in CDH3b2: Apache Hive
Carl Steinbach
July 22, 2010
Excerpt: CDH3 beta 2 includes Apache Hive 0.5.0, the latest v... more
Developing Applications for HUE
Aaron Newton
July 20, 2010
Excerpt: Yesterday's post gave an... more
What’s New in CDH3b2: HUE
BC Wong
July 19, 2010
Excerpt: The HUE (aka. Hadoop User Experience) project [... more
July 19, 2010
Excerpt: Rackspace’s OpenStack shows the way for public cloud vendors by Ed Albanese July 1... more
What’s New in CDH3b2: Sqoop
Aaron Kimball
July 16, 2010
Excerpt: What’s New in CDH3b2: Sqoop by Aaron Kimball July 16, 2010 no comments... more
Hacking with Cloudera on CDH
Alex Loddengaard
July 15, 2010
Excerpt: Hacking with Cloudera on CDH by Alex Loddengaard July 15, 2010 no comments... more
What’s New in CDH3b2: Oozie
Arvind Prabhakar
July 15, 2010
Excerpt: What’s New in CDH3b2: Oozie by Arvind Prabhakar July 15, 2010 no comments... more
What’s New in CDH3b2: Pig
Carl Steinbach
July 14, 2010
Excerpt: CDH3 beta 2 includes Apache Pig 0.7.0, the latest and... more
What’s New in CDH3b2: Flume
Henry Robinson
July 13, 2010
Excerpt: As part of our series of announcements at the recent Hadoop Summit, Cloudera released two of its previously in... more
July 12, 2010
Excerpt: CDH3 beta 2 is the first to incorporate Apache ZooKeeper. ZooKeeper is a highly reliable and available coordin... more
What’s New in CDH3b2: HBase
Todd Lipcon
July 9, 2010
Excerpt: What’s New in CDH3b2: HBase by Todd Lipcon July 09, 2010 no comments... more
What’s New in CDH3b2: Apache HBase
Todd Lipcon (@tlipcon)
July 9, 2010
Excerpt: Over the last two years, Cloudera has helped a great number of customers... more
What’s New in CDH3b2: Core Hadoop
Eli Collins
July 8, 2010
Excerpt: What’s New in CDH3b2: Core Hadoop by Eli Collins July 08, 2010 no comment... more
More on Cloudera’s Distribution including Apache Hadoop 3
Charles Zedlewski
July 7, 2010
Excerpt: More on Cloudera’s Distribution including Apache Hadoop 3 by Charles Zedlews... more
CDH3 and Cloudera Enterprise
Mike Olson
June 29, 2010
Excerpt: CDH3 and Cloudera Enterprise by Mike Olson June 29, 2010 1 com... more
June 23, 2010
Excerpt: Are your systems struggling to absorb ever-increasing amounts of data being generated daily? Are you mired in... more
June 22, 2010
Excerpt: Cloudera is once again hosting Hadoop World which will take place in New York City on Octo... more
Cloudera to participate at OSCON 2010
Huw Edwards
June 18, 2010
Excerpt: Will Cloudera be at OSCON this year? Of course, it’s only the premier event for OS technologies on the ma... more
June 11, 2010
Excerpt: Integrating Hive and HBase by carl June 11, 2010 no comments... more
Integrating Apache Hive and Apache HBase
Carl Steinbach
June 11, 2010
Excerpt: This post was contributed by John Sichi... more
One word more…
Mike Olson
June 10, 2010
Excerpt: One word more… by Mike Olson June 10, 2010 no comments... more
A transition
Christophe Bisciglia
June 10, 2010
Excerpt: For an entrepreneur, it's an incredibly fulfilling experience to start companies and watch them "... more
Reporting from the UK Hadoop Users Group
John Kreisa
June 4, 2010
Excerpt: A report from the recent UK HUG from Klaas Bosteels.... more
June 3, 2010
Excerpt: Considerations for Hadoop and BI (part 2 of 2) by Jeff Bean June 03, 2010 no co... more
June 3, 2010
Excerpt: Just today we heard another question about integrating Apache Hadoop with Business Intelligence t... more
June 1, 2010
Excerpt: The second Apache Hadoop HDFS and MapReduce contributors meeting was held last Friday, May 28 at ClouderaR... more
Upcoming Webinars From Cloudera
John Kreisa
May 25, 2010
Excerpt: Here at Cloudera we have deep knowledge and experience working with Hadoop and related technologies to so... more
May 21, 2010
Excerpt: Considerations for Hadoop and BI (part 1 of 2) by Jeff Bean May 21, 2010 no com... more
CDH2 Update 1 Now Available
Eli Collins
May 21, 2010
Excerpt: CDH2 Update 1 Now Available by Eli Collins May 21, 2010 no comments... more
May 7, 2010
Excerpt: Highlights from the First Hadoop Contributors Meeting by Eli Collins May 07, 2010... more
May 7, 2010
Excerpt: While the vast majority of the Hadoop development discussion takes place on... more
Exciting new Hadoop Training Offerings from Cloudera
Christophe Bisciglia
April 30, 2010
Excerpt: Around the globe, more and more companies are turning to Hadoop to tackle data processing problems that don... more
CAP Confusion: Problems with ‘partition tolerance’
Henry Robinson
April 26, 2010
Excerpt: CAP Confusion: Problems with ‘partition tolerance’ by Henry Robinson April... more
April 21, 2010
Excerpt: Get Hadoop Training from Cloudera at the Hadoop Summit by John Kreisa April 21, 2010... more
Cloudera Hadoop Training Spreads Worldwide
John Kreisa
April 13, 2010
Excerpt: Cloudera Hadoop Training Spreads Worldwide by John Kreisa April 13, 2010 no com... more
Cloudera Has Moved!
John Kreisa
April 12, 2010
Excerpt: Cloudera Has Moved! by John Kreisa April 12, 2010 1 comment... more
Scaling Social Science with Hadoop
Ed Albanese
April 5, 2010
Excerpt: Scaling Social Science with Hadoop by Ed Albanese April 05, 2010 12 comments... more
Scaling Social Science with Apache Hadoop
Ed Albanese
April 5, 2010
Excerpt: This post was contributed by researcher Scott Golder, who... more
April 1, 2010
Excerpt: Pushing the Limits of Distributed Processing by omer April 01, 2010 no comments... more
Cloudera’s Support Team Shares Some Basic Hardware Recommendations
Alex Loddengaard
March 30, 2010
Excerpt: Cloudera’s Support Team Shares Some Basic Hardware Recommendations by Alex Loddengaard... more
CDH3 Beta 1 Now Available
Eli Collins
March 24, 2010
Excerpt: It’s official – Cloudera’s Distribution for Hadoop Version 2, which we often shorthand as C... more
CDH2 is released
Chad Metcalf
March 24, 2010
Excerpt: We’re proud to announce that Cloudera’s Distribution for Hadoop Version 2 (CDH2) is officially re... more
How Raytheon BBN Technologies Researchers are Using Hadoop to Build a Scalable, Distributed Triple Store
Philip Zeyliger
March 22, 2010
Excerpt: How Raytheon BBN Technologies Researchers are Using Hadoop to Build a Scalable, Distributed Triple Store... more
HBase User Group #9: HBase and HDFS
Todd Lipcon
March 18, 2010
Excerpt: HBase User Group #9: HBase and HDFS by Todd Lipcon March 18, 2010 no comments... more
March 16, 2010
Excerpt: Natural Language Processing with Hadoop and Python by Ed Albanese March 16, 2010... more
March 10, 2010
Excerpt: Richard Hutton, CTO of nugg.ad, authored the following post about how and why his company uses Hadoop. n... more
Trip Report: Utah Java User’s Group
Philip Zeyliger
March 3, 2010
Excerpt: Trip Report: Utah Java User’s Group by Philip Zeyliger March 03, 2010 no... more
Avro 1.3.0
Matt Massie
March 1, 2010
Excerpt: Avro 1.3.0 by Matt Massie March 01, 2010 no comments Avro... more
Cloudera’s Hadoop Training Programs Expand Internationally
Christophe Bisciglia
February 22, 2010
Excerpt: Cloudera’s Hadoop Training Programs Expand Internationally by Christophe Bisciglia... more
Cloudera’s Apache Hadoop Training Programs Expand Internationally
Christophe Bisciglia
February 22, 2010
Excerpt: It's been over a year now since we started offering Hadoop training in the Bay Area, and since th... more
CDH2: “Testing” Heading Towards “Stable”
Chad Metcalf
February 18, 2010
Excerpt: CDH2: “Testing” Heading Towards “Stable” by Chad Metcalf Februa... more
Cloudera speaks VMware vCloud API, too.
Mike Olson
January 19, 2010
Excerpt: Cloudera speaks VMware vCloud API, too. by Mike Olson January 19, 2010 no comme... more
January 11, 2010
Excerpt: Hadoop World: Building Data Intensive Apps with Hadoop and EC2 by ed January 11, 2010... more
Hadoop World: Making Hadoop Easy on Amazon Web Services
Christophe Bisciglia
December 23, 2009
Excerpt: Hadoop World: Making Hadoop Easy on Amazon Web Services by Christophe Bisciglia Decembe... more
Hadoop World: Hadoop Applications at Yahoo!
Christophe Bisciglia
December 22, 2009
Excerpt: Hadoop World: Hadoop Applications at Yahoo! by Christophe Bisciglia December 22, 2009... more
7 Tips for Improving MapReduce Performance
Todd Lipcon
December 17, 2009
Excerpt: 7 Tips for Improving MapReduce Performance by Todd Lipcon December 17, 2009 no... more
Observers: Making ZooKeeper Scale Even Further
Henry Robinson
December 15, 2009
Excerpt: Observers: Making ZooKeeper Scale Even Further by Henry Robinson December 15, 2009... more
Hadoop World: Sqoop – Database Import for Hadoop
Christophe Bisciglia
December 10, 2009
Excerpt: Hadoop World: Sqoop – Database Import for Hadoop by Christophe Bisciglia December... more
Hadoop World: Security and API Compatibility
Christophe Bisciglia
December 8, 2009
Excerpt: Today's Hadoop World talk comes from Owen O'Malley and talks about some of the biggest challenges fa... more
Hadoop World: Hadoop for Bioinformatics
Christophe Bisciglia
December 2, 2009
Excerpt: Hadoop World: Hadoop for Bioinformatics by Christophe Bisciglia December 02, 2009... more
Hadoop World: Practical HBase from Jonathan Gray and Ryan Rawson
Alex Loddengaard
November 25, 2009
Excerpt: Hadoop World: Practical HBase from Jonathan Gray and Ryan Rawson by Alex Loddengaard No... more
Hadoop World: Hadoop + Vertica from Omer Trajman
Alex Loddengaard
November 23, 2009
Excerpt: Hadoop World: Hadoop + Vertica from Omer Trajman by Alex Loddengaard November 23, 2009... more
Hadoop World: Hadoop + Clojure from Stuart Sierra and Tim Dysinger
Alex Loddengaard
November 20, 2009
Excerpt: Hadoop World: Hadoop + Clojure from Stuart Sierra and Tim Dysinger by Alex Loddengaard... more
Hadoop World: Protein Alignment from Paul Brown
Alex Loddengaard
November 19, 2009
Excerpt: Hadoop World: Protein Alignment from Paul Brown by Alex Loddengaard November 19, 2009... more
November 17, 2009
Excerpt: Hadoop at Twitter (part 1): Splittable LZO Compression by Matt Massie November 17, 2009... more
Hadoop World: Rethinking the Data Warehouse with Hadoop and Hive from Ashish Thusoo
Christophe Bisciglia
November 11, 2009
Excerpt: Hadoop World: Rethinking the Data Warehouse with Hadoop and Hive from Ashish Thusoo by Christop... more
Hadoop World: Monitoring Best Practices from Ed Capriolo
Christophe Bisciglia
November 9, 2009
Excerpt: Today’s Hadoop World video comes from Ed Capriolo, and goes into details about how to effectively monito... more
Avro: a New Format for Data Interchange
Doug Cutting
November 2, 2009
Excerpt: Avro is a recent addition to Apache's Hadoop family of projects. Avro defines a data format designed to supp... more
Apache Avro: a New Format for Data Interchange
Doug Cutting
November 2, 2009
Excerpt: Apache Avro is a recent addition to Apache's... more
Hadoop World: NYC – Let the Videos Roll
Christophe Bisciglia
October 29, 2009
Excerpt: Hadoop World: NYC – Let the Videos Roll by Christophe Bisciglia October 29, 2009... more
Apache Hadoop Get-Together in Berlin – Videos Online
Christophe Bisciglia
October 21, 2009
Excerpt: Around the world, individuals contribute to Hadoop and build community around the technology. This kind of col... more
Cloudera Desktop and MooTools
Aaron Newton
October 19, 2009
Excerpt: Cloudera Desktop and MooTools by Aaron Newton October 19, 2009 7 comments... more
Analyzing Human Genomes with Hadoop
Christophe Bisciglia
October 15, 2009
Excerpt: Analyzing Human Genomes with Hadoop by Christophe Bisciglia October 15, 2009 4... more
Analyzing Human Genomes with Apache Hadoop
Christophe Bisciglia
October 15, 2009
Excerpt: Every day, we hear about people doing amazing things with Apache Hadoop. The va... more
Introducing Cloudera Desktop
Jeff Hammerbacher
October 1, 2009
Excerpt: Today at Hadoop World NYC, we’re announcing the availability of Cloudera Desktop, a unified an... more
CDH2: Testing Release now with Pig, Hive, and HBase
Chad Metcalf
September 30, 2009
Excerpt: At the beginning of September, we announced the first release of CDH2, our current testing repository. Pac... more
HBase Available in CDH2
Chad Metcalf
September 29, 2009
Excerpt: One of the more common requests we receive from the community is to package HBase with Cloudera’s Distri... more
Apache HBase Available in CDH2
Chad Metcalf
September 29, 2009
Excerpt: One of the more common requests we receive from the community is to package Apa... more
Grouping Related Trends with Hadoop and Hive
Amr Awadallah
September 28, 2009
Excerpt: Grouping Related Trends with Hadoop and Hive by Amr Awadallah September 28, 2009... more
September 15, 2009
Excerpt: Apache Hadoop Log Files: Where to find them in CDH, and what info they contain by Alex Loddenga... more
CDH2: Cloudera’s Distribution for Hadoop 2
Matt Massie
September 10, 2009
Excerpt: In March of this year, we released our distribution for Hadoop. Our initial focus was on stability and m... more
September 10, 2009
Excerpt: In March of this year, we released our distribution for Apache Hadoop. Our initial focus was on... more
Hadoop World: NYC 2009: Speakers Announced
Christophe Bisciglia
September 9, 2009
Excerpt: It’s been a crazy few weeks here at Cloudera, and while there is no sign of things letting up before Ha... more
Hadoop World: NYC 2009
Christophe Bisciglia
August 19, 2009
Excerpt: To say we were surprised by the quality and quantity of submissions we received for Hadoop World: NYC 2009... more
Hadoop Default Ports Quick Reference
Philip Zeyliger
August 14, 2009
Excerpt: Hadoop Default Ports Quick Reference by Philip Zeyliger August 14, 2009... more
Doug Cutting joins Cloudera
Mike Olson
August 10, 2009
Excerpt: Back in October, I promised to keep marketing and sales out of this blog. We wanted to concentrate on techni... more
Tracking Trends with Hadoop and Hive on EC2
Amr Awadallah
July 31, 2009
Excerpt: Tracking Trends with Hadoop and Hive on EC2 by Amr Awadallah July 31, 2009 8 co... more
Advice on QA Testing Your MapReduce Jobs
Alex Loddengaard
July 29, 2009
Excerpt: As Hadoop adoption increases among organizations, companies, and individuals, and as it makes its way into pro... more
Running the Cloudera Training VM in VirtualBox
Christophe Bisciglia
July 27, 2009
Excerpt: Cloudera’s Training VM is one of the most popular resources on our website. It was created with VMware W... more
Running the Cloudera Training VM in VirtualBox
Christophe Bisciglia
July 27, 2009
Excerpt: Update (May 1 2013): The post below, which is based on an outdated VM, is deprecated. Rat... more
Hadoop HA Configuration
Christophe Bisciglia
July 22, 2009
Excerpt: One of the things we get a lot of questions about is how to make Hadoop highly available. There is still a lot... more
Apache Hadoop HA Configuration
Christophe Bisciglia
July 22, 2009
Excerpt: Disclaimer: Cloudera no longer approves of the recommendations in this post. Ple... more
The Project Split
Aaron Kimball
July 17, 2009
Excerpt: Last Wednesday, we hosted a Hadoop meetup, and I gave a short talk about the new project split. How does the s... more
File Appends in HDFS
Tom White
July 17, 2009
Excerpt: There is some confusion about the state of the file append operation in HDFS. It was in, now it’s out. W... more
Hadoop Graphing with Cacti
Christophe Bisciglia
July 7, 2009
Excerpt: An important part of making sure Hadoop works well for all users is developing and maintaining strong relation... more
Hadoop Graphing with Cacti
Christophe Bisciglia
July 7, 2009
Excerpt: An important part of making sure Apache Hadoop works well for all users is deve... more
Debugging MapReduce Programs With MRUnit
Aaron Kimball
July 3, 2009
Excerpt: The distributed nature of MapReduce programs makes debugging a challenge. Attaching a debugger to a remote pro... more
Rackspace Upgrades to Cloudera’s Distribution for Hadoop
Christophe Bisciglia
June 30, 2009
Excerpt: Hadoop moves fast. Users often find that they need to upgrade after just a few months. Upgrading can be a daun... more
Rackspace Upgrades to Cloudera’s Distribution for Apache Hadoop
Christophe Bisciglia
June 30, 2009
Excerpt: Apache Hadoop moves fast. Users often find that they need to upgrade after ju... more
Parallel LZO: Splittable Compression for Hadoop
Christophe Bisciglia
June 24, 2009
Excerpt: Yesterday, Chris Goffinet from Digg made a great blog post about LZO and Hadoop. Many users have been frustr... more
Parallel LZO: Splittable Compression for Apache Hadoop
Christophe Bisciglia
June 24, 2009
Excerpt: Yesterday, Chris Goffinet from Digg made a great... more
A Great Week for Hadoop: Summit Roundup
Christophe Bisciglia
June 22, 2009
Excerpt: On June 10th, more than 750 people from around the world descended on the Santa Clara Marriott to share their... more
A Great Week for Apache Hadoop: Summit Roundup
Christophe Bisciglia
June 22, 2009
Excerpt: On June 10th, more than 750 people from around the world descended on the Santa Clara Marriott to... more
Analyzing Apache logs with Pig
Amr Awadallah
June 17, 2009
Excerpt: Analyzing Apache logs with Pig by Amr Awadallah June 17, 2009 5 comments... more
Analyzing Apache logs with Apache Pig
Amr Awadallah
June 17, 2009
Excerpt: (guest blog post by Dmitriy Rya... more
The Smart Grid: Hadoop at the Tennessee Valley Authority (TVA)
Christophe Bisciglia
June 2, 2009
Excerpt: For the last few months, we’ve been working with the TVA to help them manage hundreds of TB of data from... more
Introducing Sqoop
Aaron Kimball
June 1, 2009
Excerpt: In addition to providing you with a dependable release of Hadoop that is easy to configure, at Cloudera we... more
Common Questions and Requests From Our Users
Alex Loddengaard
May 29, 2009
Excerpt: A few months ago we announced the Cloudera Distribution for Hadoop. We’re happy to report that l... more
May 28, 2009
Excerpt: In my first few weeks here at Cloudera, I’ve been tasked with helping out with the Apache ZooKeeper... more
Announcing Cloudera Certification for Hadoop
Christophe Bisciglia
May 28, 2009
Excerpt: As Hadoop continues to turn heads at startups and big enterprises alike, Cloudera has received several request... more
Announcing Cloudera Certification for Apache Hadoop
Christophe Bisciglia
May 28, 2009
Excerpt: As Apache Hadoop continues to turn heads at startups and big enterprises alike, Cloudera has rece... more
Announcing Hadoop World: NYC 2009: RFP Open
Christophe Bisciglia
May 27, 2009
Excerpt: Lately, we’ve been spending a lot of time on the East Coast, and one thing is clear: Hadoop is everywher... more
Protecting per-DataNode Metadata
Aaron Kimball
May 22, 2009
Excerpt: Administrators of HDFS clusters understand that the HDFS metadata is some of the most precious bits they have.... more
10 MapReduce Tips
Tom White
May 18, 2009
Excerpt: This piece is based on the talk Practical MapReduce that I gave at Hadoop User Group UK on April... more
5 Common Questions About Hadoop
Christophe Bisciglia
May 14, 2009
Excerpt: 5 Common Questions About Hadoop by Christophe Bisciglia May 14, 2009 11 comment... more
5 Common Questions About Apache Hadoop
Christophe Bisciglia
May 14, 2009
Excerpt: There’s been a lot of buzz about Apache Hadoop lately. Just the other day, some of our friends... more
Using Cloudera’s Hadoop AMIs to process EBS datasets on EC2
Christophe Bisciglia
May 11, 2009
Excerpt: A while back, we noticed a blog post from Arun Jacob over at Evri (if you haven’t seen Evri before,... more
What’s New in Hadoop Core 0.20
Tom White
May 7, 2009
Excerpt: What’s New in Hadoop Core 0.20 by Tom White May 07, 2009... more
High Energy Hadoop
Matt Massie
May 1, 2009
Excerpt: We asked Brian Bockelman, a Post Doc Research Associate in the Computer Science & Engineering Depar... more
Debian packages for Hadoop
Todd Lipcon
April 27, 2009
Excerpt: When we announced Cloudera’s Distribution for Hadoop last month, we asked the community to give us fe... more
Debian packages for Apache Hadoop
Todd Lipcon (@tlipcon)
April 27, 2009
Excerpt: When we announced Cloudera's Distribution for Apache Had... more
Pig Training Now Available Online
Christophe Bisciglia
April 23, 2009
Excerpt: Today I did a web search for “pig training” using my favorite search engine. I was wildly entertai... more
Apache Pig Training Now Available Online
Christophe Bisciglia
April 23, 2009
Excerpt: Today I did a web search for "pig training" using my favorite search engine. I was wildly enterta... more
Using Hadoop to Annotate Billions of Web Documents with Semantics
Christophe Bisciglia
April 22, 2009
Excerpt: Welcome to the first guest post on the Cloudera blog. The other day, we saw Toby from Swingly tweet... more
Using Apache Hadoop to Annotate Billions of Web Documents with Semantics
Christophe Bisciglia
April 22, 2009
Excerpt: Welcome to the first guest post on the Cloudera blog. The other day, we saw Toby from ... more
The Second Hadoop UK User Group Meeting
Henry Robinson
April 21, 2009
Excerpt: Last Tuesday – on my second day of work at Cloudera – I went to London to check out the second UK... more
Configuring Eclipse for Hadoop Development (a screencast)
Philip Zeyliger
April 20, 2009
Excerpt: One of the perks of using Java is the availability of functional, cross-platform IDEs. I use vim for m... more
Configuring Eclipse for Apache Hadoop Development (a screencast)
Philip Zeyliger
April 20, 2009
Excerpt: Update (added 5/15/2013): The information below is a bit dated; see... more
Hive and JobTracker Needed Logos…
Aaron Newton
April 15, 2009
Excerpt: In the process of working on a few things here I wanted to add some links to launch Hive and the Hadoop Jobt... more
April 9, 2009
Excerpt: A few weeks ago we announced Cloudera’s Distribution for Hadoop, and I want to spend some time showing... more
Upcoming Functionality in “Fair Scheduler 2.0”
Amr Awadallah
April 3, 2009
Excerpt: Upcoming Functionality in “Fair Scheduler 2.0” by Amr Awadallah April 03, 2... more
Configuration Parameters: What can you just ignore?
Aaron Kimball
March 30, 2009
Excerpt: Configuring a Hadoop cluster is something akin to voodoo. There are a large number of variables in hadoop-def... more
Announcing Cloudera’s Distribution for Hadoop
Christophe Bisciglia
March 15, 2009
Excerpt: One of the repeating themes we have heard while working with our customers and the community is that Hadoop co... more
Announcing Cloudera’s Distribution for Apache Hadoop
Christophe Bisciglia
March 15, 2009
Excerpt: One of the repeating themes we have heard while working with our customers and the community is t... more
Cloudera’s Basic Hadoop Training: Now Free Online
Christophe Bisciglia
March 13, 2009
Excerpt: Exciting news: We’re providing our basic Hadoop training for free online. We’ll still... more
Hadoop Metrics
Philip Zeyliger
March 12, 2009
Excerpt: Hadoop’s NameNode, SecondaryNameNode, DataNode, JobTracker, and TaskTracker daemons all expose runtime m... more
Database Access with Hadoop
Aaron Kimball
March 6, 2009
Excerpt: Hadoop’s strength is that it enables ad-hoc analysis of unstructured or semi-structured data. Relational... more
Database Access with Apache Hadoop
Aaron Kimball
March 6, 2009
Excerpt: Apache Hadoop's strength is that it enables ad-hoc analysis of unstructured or semi-structured da... more
Multi-host SecondaryNameNode Configuration
Aaron Kimball
February 10, 2009
Excerpt: You might think that the SecondaryNameNode is a hot backup daemon for the NameNode. You’d be wrong. The... more
The Small Files Problem
Tom White
February 2, 2009
Excerpt: Small files are a big problem in Hadoop — or, at least, they are if the number of questions on the user... more
HDFS Reliability
Tom White
January 14, 2009
Excerpt: HDFS Reliability by Tom White January 14, 2009 4 comments... more
State of the Elephant 2008
Tom White
January 5, 2009
Excerpt: It’s a new year, the time when we take a moment to look back at the previous one, and forward to what mi... more
What’s New in Hadoop Core 0.19
Tom White
December 31, 2008
Excerpt: The first release (0.19.0) from the 0.19 branch of Hadoop Core was made on November 24. Many changes go into... more
What’s New in Hadoop Core 0.19
Tom White
December 31, 2008
Excerpt: The first release (0.19.0) from the 0.19 branch of Apache ... more
Testing Hadoop
Tom White
December 16, 2008
Excerpt: As a developer coming to Hadoop it is important to understand how testing is organized in the project. For the... more
Testing Apache Hadoop
Tom White
December 16, 2008
Excerpt: As a developer coming to Apache Hadoop it is important to understand how testing is organized in... more
Securing a Hadoop Cluster Through a Gateway
Aaron Kimball
December 3, 2008
Excerpt: A few weeks ago we ran a Hadoop hackathon. ApacheCon participants were invited to use our 10-node Hadoop clust... more
Securing an Apache Hadoop Cluster Through a Gateway
Aaron Kimball
December 3, 2008
Excerpt: A few weeks ago we ran an Apache Hadoop hackathon. ApacheCon participants were invited to use our... more
Job Scheduling in Hadoop
Amr Awadallah
November 23, 2008
Excerpt: Job Scheduling in Hadoop by Amr Awadallah November 23, 2008 3 comments... more
Job Scheduling in Apache Hadoop
Amr Awadallah
November 23, 2008
Excerpt: (guest blog post by... more
Introducing Hadoop Development Status
Alex Loddengaard
November 18, 2008
Excerpt: Introducing Hadoop Development Status by Alex Loddengaard November 18, 2008 no... more
Sending Files to Remote Task Nodes with Hadoop MapReduce
Jeff Hammerbacher
November 14, 2008
Excerpt: It is common for a MapReduce program to require one or more files to be read by each map or reduce task before... more
Configuring and Using Scribe for Hadoop Log Collection
Alex Loddengaard
November 2, 2008
Excerpt: As promised in my post about installing Scribe for log collection, I’m going to cover how to configure... more
Installing Scribe For Log Collection
Alex Loddengaard
October 28, 2008
Excerpt: Scribe is a newly released log collection tool that dumps log files from various nodes in a cluster to Scri... more
October 24, 2008
Excerpt: Apache Hadoop exists within a rich ecosystem of tools for processing and analyzing large data sets. At Facebo... more
Welcome to Cloudera’s Hadoop blog!
Mike Olson
October 23, 2008
Excerpt: We’ve created this blog as a place to post tips, tricks and insights on using Hadoop and related project... more