Tag Archives: hadoop world

Tutorials at Strata + Hadoop World San Jose: Architecture, Hadoop Ops, Interactive SQL-on-Hadoop

Categories: Events Hadoop Impala

Strata + Hadoop World San Jose 2015 (Feb. 17-20) is a focal point for learning about productionizing Hadoop.

Strata + Hadoop World sessions have always been indispensable for learning about Hadoop internals, use cases, and admin best practices. When you need to go deeper, however (and deep dives are a necessity if you’re running Hadoop in production, or aspire to), tutorials are your ticket.

This year, tutorials span a range of topics that are central in today’s Hadoop conversation,


Advanced Analytics with Apache Spark: The Book

Categories: Books Data Science Events Spark

Authored by a substantial portion of Cloudera’s Data Science team (Sean Owen, Sandy Ryza, Uri Laserson, Josh Wills), Advanced Analytics with Spark (currently in Early Release from O’Reilly Media) is the newest addition to the pipeline of ecosystem books by Cloudera engineers. I talked to the authors recently.

Why did you decide to write this book?

We think it’s mostly to fill a gap between what a lot of people need to know to be productive with large-scale analytics on Apache Hadoop in 2015,


Where to Find Cloudera Tech Talks (through March 2015)

Categories: Community Events

Find Cloudera tech talks in Austin, London, Washington DC, Zurich, and other cities through March 2015.

Below is our regular quarterly update on where to find tech talks by Cloudera employees, this time covering the first quarter of calendar year 2015. Note that this list will be updated continually during the period, and complete logistical information may not be available yet. And remember, many of these talks are at free venues (no cost of entry).


The Top 10 Posts of 2014 from the Cloudera Engineering Blog

Categories: Community Hadoop Spark

Our “Top 10” list of blog posts published during a calendar year is a crowd favorite (see the 2013 version here), in particular because it serves as informal, crowdsourced research about popular interests. Page views don’t lie, although skew for publishing date has to be taken into account: posts published earlier in the year clearly have pole position.

In 2014, strong interest in various new components that bring real-time or near-real-time capabilities to the Apache Hadoop ecosystem is apparent.
