Tag Archives: developers

What’s New in Cloudera Director 1.5?

Categories: Cloud

Cloudera Director 1.5 is now available; this post describes what’s inside, including a new open source plugin interface.

Cloudera Director is the manifestation of Cloudera’s commitment to providing a simple and reliable way to deploy, scale, and manage Apache Hadoop in the cloud of your choice. With Cloudera Director 1.5, we continue the story of enabling production-ready clusters and big data applications by focusing on the following themes.

Read More

Getting Started with Ibis and How to Contribute

Categories: Cloudera Labs, Impala

Learn about the architecture of Ibis, the roadmaps for Ibis and Impala, and how to get started and contribute.

We created Ibis, a new Python data analysis framework now incubating in Cloudera Labs, with the goal of enabling data scientists and data engineers to be as productive working with big data as they are working with small and medium data today. In doing so, we will enable Python to become a true first-class language for Apache Hadoop,

Read More

Ibis on Impala: Python at Scale for Data Science

Categories: Cloudera Labs, Data Science, Impala

This new Cloudera Labs project promises to deliver the great Python user experience and ecosystem at Hadoop scale.

Across the user community, you will find general agreement that the Apache Hadoop stack has progressed dramatically in just the past few years. For example, Search and Impala have moved Hadoop beyond batch processing, while developers are seeing significant productivity gains and additional use cases by transitioning from MapReduce to Apache Spark.

Thanks to such advances in the ecosystem,

Read More

How-to: Tune MapReduce Parallelism in Apache Pig Jobs

Categories: Guest, How-to, Pig

Thanks to Wuheng Luo, a Hadoop and big data architect at Sears Holdings, for the guest post below about Pig job-level performance tuning.

Many factors can affect Apache Pig job performance in Apache Hadoop, including hardware, network I/O, cluster settings, code logic, and algorithm. Although the sysadmin team is responsible for monitoring many of these factors, there are other issues that MapReduce job owners or data application developers can help diagnose,

Read More

Call for Demos: Developer Showcase at Strata + Hadoop World NYC 2015

Categories: Community, Events, General

Strata + Hadoop World New York 2015 needs your developer demos! The proposal period closes on Aug. 14.

As everyone knows, Apache Hadoop’s overwhelming success is partly premised on decentralized innovation from all corners of the community (users, vendors, and academia), with everyone participating on a level playing field. And since 2011, Strata + Hadoop World has been a community and content hub of that ecosystem.

For the 2015 show in New York (Sept.

Read More