Category Archives: Data Science

Large-Scale Health Data Analytics with OHDSI

Categories: CDH Data Science

Data analytics is increasingly being brought to bear to treat human disease, but as more and more health data is stored in computer databases, one significant challenge is how to perform analyses across these disparate databases. In this post I take a look at the Observational Health Data Sciences and Informatics (or OHDSI, pronounced “Odyssey”) program that was formed to address this challenge, and which today accounts for 1.26 billion patient records collectively stored across 64 databases in 17 countries.

Read more

New in Cloudera Data Science Workbench 1.2: Usage Monitoring for Administrators

Categories: CDH Cloudera Data Science Workbench Data Science Performance

Cloudera Data Science Workbench (CDSW) provides data science teams with a self-service platform for quickly developing machine learning workloads in their preferred language, with secure access to enterprise data and simple provisioning of compute. Individuals can request schedulable resources (e.g. compute, memory, GPUs) on a shared cluster that is managed centrally.

While self-service provisioning of resources is critical to the rapid interaction cycle of data scientists, it can pose a challenge to administrators.

Read more

Deep learning with Apache MXNet on Cloudera Data Science Workbench

Categories: CDH Cloudera Data Science Workbench Data Science

With the abundance of deep learning frameworks available today, it can be difficult to know what to choose for any particular application. Given the contrasting strengths and weaknesses of these frameworks, the ability to work with and switch between more than one is particularly important. Recent Cloudera blogs have shown how examples of applying deep learning on the Cloudera ecosystem using popular frameworks Deeplearning4j, BigDL, and Keras+TensorFlow.

Read more

Understanding how Deep Learning learns to play SET®

Categories: CDH Cloudera Data Science Workbench Data Science

In the past few years, deep learning has seen incredible success in image recognition applications. In this post I examine how to train a convolutional neural network to recognize playing card images from a game called SET®, explore the structure of the model to get some insight into what it is “seeing”, and present a webcam application that uses the deployed model in a near-realtime setting.

SET is a card game where the objective is to find triples of cards,

Read more

How To Predict ICU Mortality with Digital Health Data, DL4J, Apache Spark and Cloudera

Categories: CDH Data Science Spark

Modeling EHR Data in Healthcare

In this case study, we take a look at modeling electronic health record (EHR) data with deep learning and Deeplearning4j (DL4J). We draw inspiration from recent research showing that carefully designed neural network architectures can learn effectively from the complex, messy data collected in EHRs. Specifically, we describe how to train an  long short-term memory recurrent neural network (LSTM RNN) to predict in-hospital mortality among patients hospitalized in the intensive care unit (ICU).

Read more