Aki Ariga, Author at Cloudera Blog

April 26, 2017 | Technical

Use your favorite Python library on PySpark cluster with Cloudera Data Science Workbench

Cloudera Data Science Workbench provides freedom for data scientists. It gives them the flexibility to work with their favorite libraries using isolated environments with a container for each project. In the JVM world, with languages such as Java or Scala, using your favorite packages on a Spark cluster is easy. Each application manages preferred packages using fat JARs, […]

by Aki Ariga 3 min read

February 6, 2017 | Technical

Analyzing US flight data on Amazon S3 with sparklyr and Apache Spark 2.0

We have published several blog posts about sparklyr (introduction, automation), which lets you analyze big data with Apache Spark seamlessly from R. sparklyr, developed by RStudio, is an R interface to Spark that lets you use Spark as the backend for dplyr, a popular data manipulation package for R. If you are […]

by Aki Ariga 6 min read

Apache Hadoop, Apache Spark, Cloudera Data Science Workbench, Cloudera Enterprise
