In this installment of “Meet the Engineer”, meet Marcel Kornacker, the architect of the Cloudera Impala open-source real-time query engine for Apache Hadoop.
What do you do at Cloudera?
I’m a tech lead at Cloudera, working on the Cloudera Impala team. And although it’s not in my formal title, I’m also the architect of Impala. What that means in practice is that I have the very enviable but demanding job of not only creating Impala requirements, but also delivering the code that meets them!
Why do you enjoy your job?
Fundamentally, I just really enjoy building new things. Impala is a perfect example; I’ve had the opportunity to design and build an entirely new code base from the ground up. But it gets even better, because at Cloudera I also have the opportunity to speak with customers and users about their needs and requests with respect to Impala, and then bring that feedback back into the roadmap. At most other software companies, engineers and even architects are usually completely isolated from that process.
What is your favorite thing about Hadoop?
Hadoop makes enterprise data storage, which is traditionally quite expensive, a commodity. If you’re a data analyst or application developer, you have to love that, because with Hadoop you can throw more stuff – questions or application logic – at more data than you ever could before, and do it relatively cheaply. Hadoop just allows you to solve bigger problems, or as Cloudera would put it, ask bigger questions.
What inspired you to design Impala?
Before coming to Cloudera, I worked at Google – my last role was as tech lead of the query engine for the Google F1 distributed RDBMS. This was simply a fascinating project that opened my eyes to new possibilities for querying massive amounts of data in a very scalable manner. At the same time, F1 was proprietary to Google, so the opportunity to bring those discoveries to the outside world was very limited.
The motivation to help a much broader audience take advantage of these fairly advanced concepts is really what inspired my Impala work.
At what age did you become interested in programming, and why?
I think I was around 15 when I first started programming – initially in Pascal, and later on in C. The thing that appealed to me about it was that you could practically create something out of nothing. Unlike, say, an architect or mechanical engineer, you’re not constrained by the laws of physics or the presence of construction material. You’re just encoding ideas. Furthermore, I discovered I really enjoyed the design process – to break an idea down into its components, and to have to articulate those component ideas in a programming language, which is, compared to normal prose, a fairly rigid formalism.
Join Marcel this Thursday, Jan. 10, at 11am PT for a technical webinar about Cloudera Impala. Marcel will present a technical overview of Impala from the user’s perspective, including Impala’s architecture and implementation. He will also provide a comparison of Impala with Apache Hive, commercial MapReduce alternatives, and traditional data warehouse infrastructure.