In this installment of “Meet the Engineer”, we meet with Eric Sammer (invariably known as just plain “Sammer”), Apache committer and author of the upcoming O’Reilly book, Hadoop Operations.
What do you do at Cloudera, and in which Apache project are you involved?
I’ve been lucky enough to be part of a few different teams at Cloudera since I joined. Almost three years ago, I joined Cloudera as a Solution Architect, a member of the professional services team. Most of my time was spent working with customers to build out Apache Hadoop and Apache HBase clusters, and designing data integration and processing pipelines. I also occasionally had the opportunity to fill in with the training team, teaching Cloudera’s Hadoop Developer and Administration courses to both public and private groups. There’s nothing more exciting than getting to hang out with a group of smart people and talk about Hadoop all day. I moved into a Principal Solution Architect role, spending more time in the office, working on architectural patterns and problems that repeat across customers, and working with internal teams on ways to improve CDH and Cloudera Manager.
Hadoop isn’t an island, however, and as more and more of Cloudera’s partners began integrating with CDH and Cloudera Manager, I started focusing on helping them answer many of the same questions as our customers. This was the beginning of our Partner Engineering team, which fulfills mostly the same function as the Solution Architect team, but for partners. As part of this team, I got a chance to work with our partners on projects like Oracle’s Big Data Appliance, the combined Dell | Cloudera Solution for Hadoop, and HP’s recently released HP AppSystem, all of which include both CDH and Cloudera Manager.
Today, I’m back in a more traditional engineering role, hacking on systems-related projects. Over the last few years, Cloudera has given me the opportunity to contribute to a number of open source projects, most notably Apache Flume and Apache MRUnit; I’m a committer and PMC member on both.
Why do you enjoy your job?
What’s not to like? I get to work with top-notch talent, hack on open source code, write a book, and speak at conferences and meetup groups. Being at Cloudera has given me a kind of backstage pass to see what other folks are doing with Hadoop and HBase, as well as be part of the process. Nowhere else would I have worked on some of these projects. I think we do a great job of finding the intersection of what’s exciting to work on with what’s useful to users. A large part of that is open source development and contribution, something core to how things work here.
It’s equally important to be able to see positive results. Whether that means shipping a release or watching a customer run their first MapReduce job on a multi-hundred node cluster, it’s nice to be able to point to something you contributed to at the end of the day. It’s a good feeling.
What is your favorite thing about Hadoop?
I think the best part about the Hadoop ecosystem is how far it has come while it’s still so young. When you look at what has been done, even in just the last few years, you have to be impressed. Hadoop is taking on workloads that run billion-dollar businesses in a much wider set of industries than when it started. It’s true that all things Hadoop are the rage these days, but there’s something behind it; it’s not just smoke and mirrors. Consider the feature set and usage of Hadoop only two years ago versus today. The trajectory looks different than a lot of other systems out there.
It’s also about the surrounding systems in the ecosystem. More and more, people are talking about high-level data processing languages and systems, real-time serving, data integration, and management rather than the nuts and bolts of core Hadoop. That’s not to dismiss the criticality of Hadoop proper, but to say that it’s the bedrock of something bigger.
What is your advice for someone who is interested in participating in any open source project for the first time?
Pony up and get involved. I’ve never met a developer who didn’t have a gripe with a library or application. Grab the source, fix it, and submit it back. Patches speak louder than words. Too often, developers spend time commenting on what someone else has done rather than saying, “here’s how I think it should work” by way of code. I’m not a sports guy, but there’s an analogy to the armchair quarterback here, somewhere.
If you’re just starting out, find an existing project that solves a problem you have or feel passionate about. Ultimately, you always have to be working on something that means something to you. Working on someone else’s dream is something most people aren’t good at.
Be cognizant of the fact that no one has to work with you (like they do at the office) – you have to make them want to work with you. Sure, you have to be smart and you also have to be able to produce, but you have to do it in a way that makes people receptive to that contribution.
If you’re thinking about contributing to a project, hang out on the mailing lists for a bit and learn how things work. It’s hard to undo a bad first impression. Recognize that the world is full of smart people; a little humility goes an awfully long way.
At what age did you become interested in programming, and why?
I have a background in music. My introduction to the mechanics of programming came from learning about synth programming: waveforms, LFOs, VCAs, and eventually topics like grain table synthesis (although we didn’t call it that, back then) and how much of music and sound was just about patterns and numbers. I taught myself C and learned all I could about digital signal processing somewhere around the age of fifteen. The concept of fabricating something – especially something like sound – out of nothing was absolutely incredible to me. I played around with the patterns and relationships between music and programming for some time, but by nineteen, I had a “real job” building web apps. Some time later, I realized data storage, processing, and distributed systems were far more interesting to me. In other words, my CSS-box-model-phobia reached an all-time high.
I’m still fascinated by recurring patterns in systems, software, music, and the rhythmic sound the San Francisco MUNI escalator at Montgomery and Market makes. It’s hypnotic.