What is in our Kitchen?
If there is one thing that chefs are proud of, it’s their kitchens. Whether cavernous top-of-the-line affairs or cramped New York apartments, kitchens are the place where raw ingredients are combined with talent and hard work to produce results. The only difference in the world of software is what you will find in our kitchens. In an interview with CNET, Google’s Hal Varian attributed Google’s success to the “kitchen” in which their products are developed:
“I also think we have a better kitchen. We’ve put a lot of effort into building a really powerful infrastructure at Google, the development environment at Google is very good.”
The goal of the Kitchen team at Cloudera is to create a powerful infrastructure for developing, building, testing, shipping, and supporting our software. Kitchen contributes its expertise to every product Cloudera builds, while also building out new infrastructure and tools to facilitate future development. Everyone on the Kitchen team writes software.
While the Kitchen team’s culture was initially inspired by Google’s infrastructure, we agree with Piaw Na, who recently offered some words of caution for companies looking to follow this example:
“In short, I think startups have to be very careful about building generic infrastructure just because that’s the way Google did things.”
The Kitchen team builds the infrastructure that is needed to solve our company’s problems. For example, our build system must be capable of coalescing many disparate open source projects into a unified platform. If there is an existing open source tool or framework that meets our needs, we use it, improve it, and contribute our improvements back to the project rather than “rolling our own.”
We use many of the open source tools you might expect, such as Hudson for continuous integration. Our Hudson instance manages tens of hosts running over seventy projects:
- Unit tests running on every commit, across multiple platforms and flavors of Java and Python
- Hadoop clusters running on EC2 using Apache Whirr
- Various code improvement tools such as jcarder, Cobertura, Clover, FindBugs, CheckStyle and others
If a tool does not exist, the Kitchen team tries to leverage existing frameworks to build what is required. For example, our automated build and release system, which is at the heart of the Cloudera Distribution for Hadoop (CDH) platform, is built on top of boto. From a single git repository, we use crepo (another Kitchen project) to check out the latest source of each project within CDH. Then we build source artifacts for all of the projects and upload them to S3. Next, we spin up an EC2 cluster to build everything for all supported CentOS, Ubuntu, and Debian releases, on both 32- and 64-bit architectures. The resulting packages are stored back in S3 and then staged to a fresh EC2 instance of archive.cloudera.com for testing. Additional EC2 instances then run end-to-end tests for each package that was built. We turn the crank nightly, not just for each release.
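As a rough illustration of the pipeline described above, here is a minimal Python sketch that enumerates the distribution and architecture build matrix and lays out the stages in order. It makes no AWS calls and is not our actual build system: the stage names, the `BuildTarget` class, the `plan` function, and the S3 key layout are all hypothetical names chosen for this example, and the real system also builds for several releases of each distribution.

```python
from dataclasses import dataclass
from itertools import product

# Distributions and architectures named in the post. Specific release
# versions are not listed there, so they are left out of this sketch.
DISTROS = ["centos", "ubuntu", "debian"]
ARCHES = ["32bit", "64bit"]

# Hypothetical stage names, in the order the post describes them.
STAGES = [
    "checkout_sources",        # crepo checks out each CDH project
    "build_source_artifacts",
    "upload_sources_to_s3",
    "build_packages_on_ec2",   # once per distro/arch target
    "store_packages_in_s3",    # once per distro/arch target
    "stage_archive_instance",  # fresh archive instance for testing
    "run_package_tests",       # once per distro/arch target
]

# Stages that fan out across the build matrix (an assumption of this sketch).
PER_TARGET_STAGES = {
    "build_packages_on_ec2",
    "store_packages_in_s3",
    "run_package_tests",
}


@dataclass(frozen=True)
class BuildTarget:
    distro: str
    arch: str

    def s3_prefix(self, build_id: str) -> str:
        # Hypothetical key layout, for illustration only.
        return f"{build_id}/{self.distro}/{self.arch}/"


def build_matrix(distros=DISTROS, arches=ARCHES):
    """Return one BuildTarget per (distro, arch) combination."""
    return [BuildTarget(d, a) for d, a in product(distros, arches)]


def plan(build_id: str):
    """Expand the stage list into an ordered list of (stage, scope) steps."""
    targets = build_matrix()
    steps = []
    for stage in STAGES:
        if stage in PER_TARGET_STAGES:
            steps.extend((stage, t.s3_prefix(build_id)) for t in targets)
        else:
            steps.append((stage, build_id))
    return steps


if __name__ == "__main__":
    for stage, scope in plan("nightly"):
        print(f"{stage:26s} {scope}")
```

Keeping the plan as plain data makes it easy to see how a nightly run fans out: each per-target stage runs once for every distribution and architecture pair, while the remaining stages run once per build.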
The Kitchen team is in the process of building a status dashboard (call it a radiator or a single pane of glass) to prominently display Hudson’s status, nightly builds, JIRA stats, CDH download statistics, and many other metrics we use daily.
No software company is complete without a cluster or two. Kitchen maintains a development cluster, a long-lived CDH cluster, a security-enabled CDH cluster, and a “dog-food” cluster. We’re currently building out a Eucalyptus cluster so we can also run our build and test infrastructure in-house. We have a large-scale cluster in the works, and we are busy building out our infrastructure to accommodate it. We use Cobbler, run Ganglia (bias alert: we employ one of the original authors), and debate Chef versus Puppet.
Our Kitchen team is growing. If this sounds like a team you would like to be a part of, get in touch with me on Twitter or IRC (#cloudera on freenode.net), or apply directly. Stay tuned for more blog posts about what’s cooking in our Kitchen.
Image courtesy of Chef Olive at Kitchen On Fire