In this installment of “Meet the Instructor,” we speak to San Francisco-based Glynn Durham, one of the big brains behind Cloudera’s Introduction to Data Science training and certification.
What is your role at Cloudera?
I am a Senior Instructor with Cloudera University, which means I am a road warrior: I will travel anywhere to teach anything to anyone. I teach all the courses Cloudera offers, including custom private training events that I run at customer sites. Right now, I’m especially enjoying teaching Cloudera’s new course, Introduction to Data Science: Building Recommender Systems. In tandem with the rollout of the course, we’re developing Cloudera Certified Professional: Data Scientist exams, which will include a challenging performance-based lab component in addition to the written test.
Before Cloudera, I worked primarily with databases. My first corporate job was at Oracle just before it went public. I spent a year producing Oracle’s first batch of course materials for developers and database administrators and then spent several years teaching all kinds of people all over the world. For some time, I was an Oracle Database Administrator. I eventually moved on to the LAMP stack, and I later worked for MySQL.
What do you enjoy most about training and/or curriculum development?
For me, Hadoop represents the continuation of a lifelong professional interest in digital data as a useful abstraction of the real world. As an instructor, there’s nothing better than observing an “ah-ha” moment. Better yet, I love being part of the circuit that causes the proverbial cartoon light bulb to switch on above a person’s head. Training allows me to contribute to that process all the time, and I’m even the one on the learning end now and again! I enjoy meeting people whose eyes light up when they start to think about what a platform like Hadoop enables. Cloudera’s training is dedicated to opening up a new realm of possibilities for everyone involved. I get to learn a lot, not just about Cloudera’s technology (which is awesome, by the way), but also about what customers are finding to do with their data. It just goes on and on!
Describe an interesting application you’ve seen or heard about for Apache Hadoop.
I get my healthcare coverage from Kaiser Permanente, which has offered members the opportunity to participate in a huge project sequencing individuals’ genomes from blood samples. Kaiser hopes to collect the genomes of 500,000 members, which, paired with members’ health records, can provide the data for countless new discoveries about the genetic component of health processes. I don’t know for sure that Kaiser uses Hadoop in this project, but I know this is an eminently Hadoop-able application.
What advice would you give to an engineer, system administrator, or analyst who wants to learn more about Big Data?
The term Big Data refers at least in part to the acceleration of data growth in organizations of all sorts. More mature technologies, such as relational databases and spreadsheets, remain useful ways to work with data. However, with the acceleration we see now, it’s critical to adapt and find new ways to store and process data at far larger scales than before. One of my favorite points about Cloudera is that our mission statement is about data. Hadoop is clearly an important platform that helps users succeed in more ways with more data. But let’s be clear: it’s the data that’s most important.
Hadoop is an open-source project, and Cloudera’s CDH distribution is completely Apache-licensed: free and open-source forever. Go to Downloads on Cloudera’s website and get your own cluster up and running using Cloudera Manager, or get a Virtual Machine image running a single-node cluster, all ready to go. Watch our videos! Read a few blogs or white papers! It’s the nature of open-source projects to make intellectual property available to anyone who wants it. So that’s my main advice: jump in and start swimming!
Cloudera’s business—including training—is to help accelerate users’ time to mastery of the Hadoop platform (and, I expect, other exciting data technologies going forward). Our full support subscription, Cloudera Enterprise, is especially meant to help users get production clusters working and keep them working optimally to get the most value from data.
How did you become involved in technical training and Hadoop?
My enthusiasm for teaching, broadly, goes way back to junior high school in Louisiana, where I tutored a classmate who was failing geometry. Like me, he later went on to become a successful engineer. I find it rewarding to think that I actually made a difference in that guy’s life. I like to think that, all these years later, I am still doing the same thing: with a few days’ mutual commitment, we can provide real value to individuals in our classrooms.
What’s one interesting fact or story about you that a training participant would be surprised to learn?
In a previous chapter of my life, I interned with the recording engineers at Tiny Telephone, a famous recording studio in San Francisco that favors legacy analog techniques. I spent many, many hours aligning tape decks and setting up signal chains, then listening to all sorts of musicians work their magic. Great fun!