Apache Avro 1.3.0
Apache Avro was added to the Hadoop family last April, and last year there were three Avro releases: 1.0.0 in July, 1.1.0 in September and 1.2.0 in October. After the 1.2.0 release, Doug Cutting introduced Avro: a New Format for Data Interchange on this blog, and the Avro team went right to work building the next release of Avro.
It’s a new year and there’s a new Avro: 1.3.0.
Starting with Avro 1.3.0, the Avro team is releasing packages tailored to the users of each language. For example, Python users can download an egg, Java users can manage jars using Maven, and C/C++ users can grab an autotools package ready to `./configure; make`. Speaking of languages, we’re thrilled to announce that there’s now a Ruby implementation of Avro!
The Avro specification has been updated to include support for Avro RPC over HTTP. Currently, only Java and Python support this new RPC specification but you can expect other languages to follow. The Avro team also designed a test framework to ensure interoperability between any mix of Avro RPC clients and servers.
In Avro 1.3.0, there’s a new Avro data file format that is simpler, better suited for compression and provides support for streaming Avro data. You’ll find support for this new file format in Ruby, Python, Java and C, giving you an array of languages to choose from for reading and writing Avro data.
There have been more features added to Avro than can fit in a single blog post, but here are some of the highlights:
- Substantial improvements to the Java Reflection API, which now uses java.lang.String for Avro strings, either Java collections or arrays for Avro arrays, etc.
- New GenAvro tool provides a high-level syntax for schemas and protocols.
- Command-line tools jar for debugging.
- An RPC statistics system.
- Support for compression in data files.
- Better Maven support, including a mvn-install Ant task to publish the jar to the local Maven repository, plus source and javadoc artifacts.
- Substantial performance improvements.
- Many bug fixes.
- Python implementation rewritten to be slightly more Pythonic and simpler, with greater test coverage.
- RPC over HTTP support.
- RPC and data file interoperability.
- New command-line utility for sending and receiving RPCs.
- Python eggs created.
The C++ implementation now uses autotools for its build, has a new API for checking schema resolution and provides a new tutorial to make it easier for you to get up and running with Avro in C++.
Ruby hackers will be happy to hear that Ruby has been added to Avro 1.3.0, complete with support for the new data file format.
The C implementation has been completely rewritten from top to bottom and
- supports reading and writing the new Avro data file format
- adds a contact database example to make it easier for you to learn the Avro C API
- provides schema validation, promotion and projection
- allows schema validation to be optional
- removes all dependencies on external libraries (e.g. APR, APR-util)
- embeds jansson for JSON parsing
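To give a feel for what schema promotion and projection mean in the list above, here is a minimal sketch in Python. It is not the Avro C API: the names `PROMOTIONS`, `can_promote` and `project` are hypothetical, and the promotion table is an assumption based on the numeric promotions described in the Avro specification's schema resolution rules (int to long, float or double; long to float or double; float to double).

```python
# Hypothetical illustration only; not part of any Avro library.
# Assumed numeric promotions: writer type -> reader types it may be read as.
PROMOTIONS = {
    "int": {"long", "float", "double"},
    "long": {"float", "double"},
    "float": {"double"},
}


def can_promote(writer_type, reader_type):
    """Return True if data written as writer_type can be read as reader_type."""
    if writer_type == reader_type:
        return True
    return reader_type in PROMOTIONS.get(writer_type, set())


def project(record, reader_fields):
    """Keep only the fields the reader's schema asks for (projection)."""
    return {name: record[name] for name in reader_fields if name in record}


if __name__ == "__main__":
    # A contact record, loosely in the spirit of a contact database.
    contact = {"name": "Ada", "phone": "555-0100", "age": 36}
    print(can_promote("int", "long"))        # widening is allowed
    print(can_promote("long", "int"))        # narrowing is not
    print(project(contact, ["name", "age"]))
```

A reader that declares a field as a long can therefore consume data written as an int, while a reader that only names a subset of the fields sees just those fields.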
You can contact the Avro team by visiting the #avro IRC channel on irc.freenode.net or through one of the Avro mailing lists. The Avro team is always open to suggestions about future features and would love to hear about your experiences using Avro 1.3.0.