Your contributions, and a vibrant developer community, are important for Impala’s users. Read below to learn how to get involved.
From the moment that Cloudera announced it at Strata New York in 2012, Impala has been a 100% Apache-licensed open source project. All of Impala’s source code is available on GitHub, where nearly 500 users have forked the project for their own use, and we follow the same model as every other platform project at Cloudera: code changes are committed “upstream” first, and are then selected and backported to our release branches for CDH releases.
However, Impala was (and still is!) moving at an extraordinary pace, and we focused on meeting the feature and stability needs of our user community at the expense of making it easy for the developer community to contribute fixes and new features. Since the release of Impala 2.0, which was a major milestone for the project, we’ve been working on incubating a developer community. Now we’re ready to publicize our improvements more widely, and to invite more developers to come and build the state-of-the-art SQL query engine on Hadoop with us.
Since January 2015, we have moved more and more of Impala’s development out into the open, where members of the community can watch the progress of the JIRA issues that are important to them (including through the code review process), participate in technical discussion on the Impala developer mailing list, and submit new patches for inclusion in Impala.
We are committed to making it easier for our developer community to work with Impala’s source code. We’ve recently released a developer environment Docker image that contains everything you need to start building and working with Impala. We have also posted the full set of generated developer documentation to help navigate the hundreds of classes and thousands of lines of code that make up Impala’s internals. We’re in the process of adding a Jenkins-based test server to make it easier for contributors to validate their patches—watch this space! Finally, we’ve gathered together some of the published papers on Impala, and the presentations that we’ve been giving at meetups and conferences around the world.
If you’d like to get involved in contributing to Impala, start with the presentation we gave at the March 2015 Impala Meetup in Cloudera’s Palo Alto office, which provides an overview of our contribution process.
Henry Robinson is a Software Engineer at Cloudera on the Impala team.
Marcel Kornacker is a tech lead at Cloudera and the founder and architect of Impala.