As a delicious appetizer for the Strata Conference + Hadoop World next week (sold out!), O’Reilly Media has partnered with us to create and publish a new e-book specifically intended for technical end users of Cloudera Impala, the open-source distributed query engine for Apache Hadoop.
Authored by Cloudera’s own John Russell, the e-book provides a 30-page tour of Impala’s internals and architecture, as well as common usage patterns intended for mainstream (SQL) users.
As John explains in his introductory post on O’Reilly’s Strata blog:
“I wanted to give an overview that didn’t rely on already being an expert with Hadoop, Hive, Java, some particular database system, and so on. With Impala, a little SQL and UNIX experience is all you really need. The patterns are familiar, even if the terminology is a little different. An end user doesn’t need to concern themselves with the underlying plumbing. But depending on where they’re coming from, they might have definite ideas about which logical or physical aspects are important.”
You can download this new e-book from cloudera.com right now (registration required). But for those of you lucky enough to be attending the Impala + Parquet meetup on Tuesday evening, we have another treat: a box of hard copies!