Tag Archives: open source

Meet Cloudera’s Apache Spark Committers

Categories: Community, General, Meet the Engineer, Spark

The super-active Apache Spark community is exerting a strong gravitational pull within the Apache Hadoop ecosystem. I recently had the opportunity to ask Cloudera’s Apache Spark committers (Sean Owen, Imran Rashid [PMC], Sandy Ryza, and Marcelo Vanzin) for their perspectives on how the Spark community has worked and is working together, and on the work to be done via the One Platform initiative to make the Spark stack enterprise-ready.

Recently,

Read more

How Impala Scales for Business Intelligence: New Test Results

Categories: Impala, Performance

Recent testing demonstrates that Impala scales to a large number of concurrent users.

Impala, the open source MPP query engine designed for high-concurrency SQL over Apache Hadoop, has seen tremendous adoption across enterprises in industries such as financial services, telecom, healthcare, retail, gaming, government, and advertising. Impala has unlocked the ability to use business intelligence (BI) applications on Hadoop; these applications support critical business needs such as data discovery,

Read more

YCSB, the Open Standard for NoSQL Benchmarking, Joins Cloudera Labs

Categories: Cloudera Labs, HBase, Performance

YCSB, the open standard for comparative performance evaluation of data stores, is now available to CDH users for their Apache HBase deployments via new packages from Cloudera Labs.

Many factors go into deciding which data store should be used for production applications, including basic features, data model, and the performance characteristics for a given type of workload. It’s critical to have the ability to compare multiple data stores intelligently and objectively so that you can make sound architectural decisions.

Read more

How-to: Write a Cloud Provider Plugin for Cloudera Director

Categories: Cloud, How-to

Cloudera Director 1.5 introduces a new plugin architecture to enable support for additional cloud providers. If you want to implement a plugin to add integration with a cloud provider that is not supported out-of-the-box, or to extend one of the existing plugins, these details will get you started.

As discussed in our previous blog post, the Cloudera Director Service Provider Interface (Cloudera Director SPI) defines a Java interface and packaging standards for Cloudera Director plugins.

Read more

Getting Started with Ibis and How to Contribute

Categories: Cloudera Labs, Impala

Learn about the architecture of Ibis, the roadmaps for Ibis and Impala, and how to get started and contribute.

We created Ibis, a new Python data analysis framework now incubating in Cloudera Labs, with the goal of enabling data scientists and data engineers to be as productive working with big data as they are working with small and medium data today. In doing so, we will enable Python to become a true first-class language for Apache Hadoop,

Read more