Tag Archives: development

YCSB, the Open Standard for NoSQL Benchmarking, Joins Cloudera Labs

Categories: Cloudera Labs HBase Performance

YCSB, the open standard for comparative performance evaluation of data stores, is now available to CDH users for their Apache HBase deployments via new packages from Cloudera Labs.

Many factors go into deciding which data store should be used for production applications, including basic features, data model, and the performance characteristics for a given type of workload. It’s critical to have the ability to compare multiple data stores intelligently and objectively so that you can make sound architectural decisions.

Read more

Community Meetups at Strata + Hadoop World NYC 2015

Categories: Community Events

Strata + Hadoop World 2015 NYC is more than a daytime conference; it’s also a nighttime meetup experience. (Plus, there are a bunch of book signings.)

It won’t be long before we’re all in NYC for Strata + Hadoop World (Sept. 29-Oct. 1; if you haven’t registered yet, a 20% discount is still available). So, consider for your evening agenda:

Read more

How-to: Write a Cloud Provider Plugin for Cloudera Director

Categories: Cloud How-to

Cloudera Director 1.5 introduces a new plugin architecture to enable support for additional cloud providers. If you want to implement a plugin to add integration with a cloud provider that is not supported out-of-the-box, or to extend one of the existing plugins, these details will get you started.

As discussed in our previous blog post, the Cloudera Director Service Provider Interface (Cloudera Director SPI) defines a Java interface and packaging standards for Cloudera Director plugins.

Read more

How Apache Spark, Scala, and Functional Programming Made Hard Problems Easy at Barclays

Categories: Guest Spark Use Case

Thanks to Barclays employees Sam Savage, VP Data Science, and Harry Powell, Head of Advanced Analytics, for the guest post below about the Barclays use case for Apache Spark and its Scala API.

At Barclays, our team recently built an application called Insights Engine to execute an arbitrary number N of near-arbitrary SQL-like queries and execute them in a way that can scale with increasing N. The queries were non-trivial,

Read more

Getting Started with Ibis and How to Contribute

Categories: Cloudera Labs Impala

Learn about the architecture of Ibis, the roadmaps for Ibis and Impala, and how to get started and contribute.

We created Ibis, a new Python data analysis framework now incubating in Cloudera Labs, with the goal of enabling data scientists and data engineers to be as productive working with big data as they are working with small and medium data today. In doing so, we will enable Python to become a true first-class language for Apache Hadoop,

Read more