Fixes in CDH 5.5 make writing Parquet data for Apache Impala (incubating) much easier.
Over the last few months, several Cloudera customers have told us that Parquet is too hard to configure, and that the main problem is finding the right layout for great performance in Impala. For that reason, CDH 5.5 contains new features that make those configuration problems go away.
Auto-Detection of HDFS Block Size
For example, you may have seen this warning: Read <some-big-number>