Category Archives: General

Call for Demos: Developer Showcase at Strata + Hadoop World NYC 2015

Categories: Community Events General

Strata + Hadoop World New York 2015 needs your developer demos! The proposal period closes on Aug. 14.

As everyone knows, Apache Hadoop’s overwhelming success is partly premised on de-centralized innovation from all corners of the community—users, vendors, and academia—with everyone participating on a level playing field. And since 2011, Strata + Hadoop World has been a community and content hub of that ecosystem.

For the 2015 show in New York (Sept.

Read more

Got SQL? Xplain.io Joins Cloudera

Categories: General

Xplain.io is now part of Cloudera. 

Fifteen months ago, Rituparna Agrawal and I incorporated Xplain.io in a small shed in my backyard. With intense focus on solving real customer problems, we built an eclectic and diverse team with skills across database internals, distributed systems, and customer-centric design.

Throughout Q4 2013, we interviewed more than 60 enterprise data architects and found that they were all overwhelmed with the choices available in modern data management.

Read more

BigBench: Toward An Industry-Standard Benchmark for Big Data Analytics

Categories: General Hardware Performance

Learn about BigBench, the new industrywide effort to create a sorely needed Big Data benchmark.

Benchmarking Big Data systems is an open problem. To address this concern, numerous hardware and software vendors are working together to create a comprehensive end-to-end big data benchmark suite called BigBench. BigBench builds upon and borrows elements from existing benchmarking efforts in the Big Data space (such as YCSB, TPC-xHS,

Read more

Using Impala, Amazon EMR, and Tableau to Analyze and Visualize Data

Categories: Cloud General Guest

Our thanks to AWS Solutions Architect Rahul Bhartia for allowing us to republish his post below.

Apache Hadoop provides a great ecosystem of tools for extracting value from data in various formats and sizes. Originally focused on large-batch processing with tools like MapReduce, Apache Pig, and Apache Hive, Hadoop now provides many tools for running interactive queries on your data, such as Impala, Drill, and Presto. This post shows you how to use Amazon Elastic MapReduce (Amazon EMR) to analyze a data set available on Amazon Simple Storage Service (Amazon S3) and then use Tableau with Impala to visualize the data.

Read more

The Definitive "Getting Started" Tutorial for Apache Hadoop + Your Own Demo Cluster

Categories: CDH Cloud General Hadoop How-to

Using this new tutorial alongside Cloudera Live is now the fastest, easiest, and most hands-on way to get started with Hadoop.

At Cloudera, developer enablement is one of our most important objectives. One only has to look at examples from history (Java or SQL, for example) to know that knowledge fuels the ecosystem. That objective is what drives initiatives such as our community forums, the Cloudera QuickStart VM,

Read more