Tag Archives: Data Science

Visual Model Interpretability for Telco Churn in Cloudera Data Science Workbench

Categories: CDH Cloudera Data Science Workbench Fast Forward Labs Spark

Disclaimer: the scenario below is hypothetical. Any similarity to any specific telecommunications company is purely coincidental.

Although we use the example of a telecommunications company, the following applies to every organization with customers or voluntary stakeholders.

Introduction

Imagine that you are a Chief Data Officer at a major telecommunications provider and the CEO has asked you to overhaul the existing customer churn analytics. The current process relies on manual export of data from dozens of data sources, including ERP,

Read more

A Technical Overview of Cloudera Altus Analytic DB

Categories: Altus Analytic Database Cloud

A few weeks back, we announced the upcoming beta of Cloudera Altus Analytic DB for cloud-based data warehousing. As promised, the beta is now available, and we want to spend some time describing its unique architecture.

Architecture of Cloudera Altus Analytic DB

Altus Analytic DB is built on the Cloudera Altus platform-as-a-service foundation, which also supports the Altus Data Engineering service. The architecture of Cloudera Altus is based around a few simple but important premises —

Read more

Deploy Cloudera EDH Clusters Like a Boss Revamped – Part 2

Categories: CDH Hadoop HDFS

In Part 1: Infrastructure Considerations of this revamped three-part series on deploying clusters like a boss, we gave a general explanation of how nodes are classified, along with the disk layout configurations and network topologies to consider when deploying your clusters.

In this Part 2: Service and Role Layouts segment of the series, we take a step higher up the stack, looking at the various services and roles that make up your Cloudera Enterprise deployment.

Read more

Large-Scale Health Data Analytics with OHDSI

Categories: CDH Data Science

Data analytics is increasingly being brought to bear to treat human disease, but as more and more health data is stored in computer databases, one significant challenge is how to perform analyses across these disparate databases. In this post I take a look at the Observational Health Data Sciences and Informatics (or OHDSI, pronounced “Odyssey”) program that was formed to address this challenge, and which today accounts for 1.26 billion patient records collectively stored across 64 databases in 17 countries.

Read more