Category Archives: General

How-to: Install Apache Zeppelin on CDH

Categories: General Guest How-to Spark

Our thanks to Karthik Vadla and Abhi Basu, Big Data Solutions engineers at Intel, for permission to re-publish the following (which was originally available here).

Data science is not a new discipline. However, with the growth of big data and the adoption of big data technologies, the demand for better-quality data has grown exponentially. Today data science is applied to every facet of life: product validation through fault prediction,

Call for Demos: Developer Showcase at Strata + Hadoop World NYC 2015

Categories: Community Events General

Strata + Hadoop World New York 2015 needs your developer demos! The proposal period closes on Aug. 14.

As everyone knows, Apache Hadoop’s overwhelming success is partly premised on decentralized innovation from all corners of the community (users, vendors, and academia), with everyone participating on a level playing field. And since 2011, Strata + Hadoop World has been a community and content hub of that ecosystem.

For the 2015 show in New York (Sept.

Got SQL? Xplain.io Joins Cloudera

Categories: General

Xplain.io is now part of Cloudera. 

Fifteen months ago, Rituparna Agrawal and I incorporated Xplain.io in a small shed in my backyard. With intense focus on solving real customer problems, we built an eclectic and diverse team with skills across database internals, distributed systems, and customer-centric design.

Throughout Q4 2013, we interviewed more than 60 enterprise data architects and found that they were all overwhelmed with the choices available in modern data management.

BigBench: Toward An Industry-Standard Benchmark for Big Data Analytics

Categories: General Hardware Performance

Learn about BigBench, the new industrywide effort to create a sorely needed Big Data benchmark.

Benchmarking Big Data systems is an open problem. To address this concern, numerous hardware and software vendors are working together to create a comprehensive end-to-end Big Data benchmark suite called BigBench. BigBench builds upon and borrows elements from existing benchmarking efforts in the Big Data space (such as YCSB, TPCx-HS,
