This is a guest post from RichRelevance Principal Architect and Apache Avro PMC Chair Scott Carey.
In Early 2010 at RichRelevance, we were searching for a new way to store our long lived data that was compact, efficient, and maintainable over time. We had been using Hadoop for about a year, and started with the basics text formats and SequenceFiles. Neither of these were sufficient. Text formats are not compact enough,