The super-active Apache Spark community is exerting a strong gravitational pull within the Apache Hadoop ecosystem. I recently had the opportunity to ask Cloudera’s Apache Spark committers (Sean Owen, Imran Rashid [PMC], Sandy Ryza, and Marcelo Vanzin) for their perspectives about how the Spark community has worked and is working together, and the work to be done via the One Platform initiative to make the Spark stack enterprise-ready.
Recently, Apache Spark has become the most active project in the Apache Hadoop ecosystem (measured by number of contributors/commits over time),
Meet Sravya Tirukkovalur (@sravsatuluri), a Software Engineer working on Apache Hadoop security at Cloudera.
What do you do at Cloudera, and in which Apache projects are you involved?
I am a software engineer here at Cloudera, working on the security aspects of the platform. I specifically work on, and am an active contributor to, the Apache Sentry (incubating) project, which is part of the Project Rhino effort with Intel to bring comprehensive security for data protection to Hadoop.
Meet Sandy Ryza (@SandySifting), the newest member of Cloudera’s data science team. See Sandy present at Spark Summit 2014 (June 30-July 1 in San Francisco; register here for a 20% discount).
What is your definition of a “data scientist”?
To put it in boring terms, data scientists are people who find that the bulk of the work for testing their hypotheses lies in manipulating quantities of information –
In this installment of “Meet the Engineer”, our subject is Andrei Savu!
What do you do at Cloudera?
At Cloudera I work on cloud deployment automation and general platform improvements to make sure everything runs smoothly on elastic infrastructure when using various managed services. My team builds on top of Cloudera Manager and we integrate with different cloud provider APIs to provision production Cloudera Enterprise Data Hub Edition clusters on-demand,
In this installment of “Meet the Instructor”, our interview subject is Bruce Martin.
What is your role at Cloudera?
I am a Senior Instructor at Cloudera. I teach all of our courses. I most often teach our Data Science, Developer, and Data Analyst courses, all of which make up the Developer Learning Path.
What do you enjoy most about training and/or curriculum development?