Our thanks to Don Drake (@dondrake), an independent technology consultant currently working at Allstate Insurance, for the guest post below about his experiences comparing the use of the Apache Avro and Apache Parquet file formats with Apache Spark. Over the last few months, numerous hallway conversations, informal discussions, and meetings have occurred at Allstate […]
Our thanks to Rakesh Rao of Quaero for allowing us to re-publish the post below about Quaero’s experiences using partitioning in Apache Hive. In this post, we will talk about how the partitioning features available in Hive can be used to improve the performance of Hive queries. Partitions: Hive is a good tool for performing queries […]