Category Archives: CDH

Seismic Data Science: Reflection Seismology and Hadoop

Categories: CDH General Hadoop Use Case

When most people first hear about data science, it’s usually in the context of how prominent web companies work with very large data sets in order to predict clickthrough rates, make personalized recommendations, or analyze UI experiments. The solutions to these problems require expertise with statistics and machine learning, and so there is a general perception that data science is intimately tied to these fields. However, in my conversations at academic conferences and with Cloudera customers,

Read more

Oracle selects CDH and Cloudera Manager as the Apache Hadoop Platform for the Oracle Big Data Appliance

Categories: CDH Community General

Cloudera users gain more choice, tighter Oracle integration. Cloudera partners gain increased validation of their platform choice.

Ed Albanese
Ed leads business development for Cloudera. He is responsible for identifying new markets, revenue opportunities and strategic alliances for the company.

Summary: Oracle has selected Cloudera’s Distribution Including Apache Hadoop (CDH) and Cloudera Manager software as core technologies on the Oracle Big Data Appliance,

Read more

SCM Express: Now Anyone Can Experience the Power of Apache Hadoop

Categories: CDH General

Phil Langdale is a software engineer at Cloudera and the technical lead for Cloudera’s SCM Express product.

What is SCM Express?


As powerful and useful as Apache Hadoop is, anyone who has set up a cluster from scratch is well aware of how challenging it can be: every machine has to have the right packages installed and correctly configured so that they can all work together,

Read more

If 80% of data is unstructured, is it the exception or a new rule?

Categories: CDH Community

Ed Albanese leads business development for Cloudera. He is responsible for identifying new markets, revenue opportunities and strategic alliances for the company.

This week’s announcement about the availability of the Cloudera Connector for IBM Netezza marks a major milestone, but not necessarily the one you might expect. It’s not just the delivery of a useful software component; it’s also the introduction of a new generation of data management architectures.

Read more