Since the last blog post announcing the release of YCSB 0.6.0 in Cloudera Labs, users of Cloudera CDH and EDH will have noticed regular updates to the Labs version, keeping it in lockstep with the upstream release. This gives users a consistent and easy mechanism to deploy the current version of YCSB (at the moment v0.10.0 in CLABS) and evaluate the performance of the NoSQL stores employed within their clusters, such as HBase,
As measured across multiple dimensions (see analysis below), Impala provides a better cloud-native experience than Redshift for a number of common use cases.
Impala 2.6 brings read/write support on Amazon S3, which provides cloud capabilities such as direct querying of data from S3, elastic scaling of compute, and seamless data portability and flexibility that are unique amongst cloud-based analytic databases. With more and more users looking to deploy and run in public-cloud environments,
The benchmark testing results detailed below can help you make an informed decision about AWS storage options for Impala.
In a recent post, you learned how Impala 2.6 on S3 delivers cloud-native features unmatched by other analytic databases in the cloud. With support for reading and writing data on Amazon S3, Impala provides cloud capabilities such as direct querying of data from S3, elastic scaling of compute, and seamless data portability and flexibility not found in other cloud-based analytic databases, such as Amazon Redshift.
This case study is an instructive example of how performance analysis is a multi-faceted process that often leads one in surprising directions.
Apache Solr Near Real Time (NRT) Search allows Solr users to search documents indexed just seconds ago. It’s a critical feature in many real-time analytics applications. As Solr indexes more and more documents in near real time, end-user expectations for performance get higher and higher.
Thanks to new optimizations for running Impala on Amazon S3, doubling cluster size on AWS doubles multi-user performance while keeping total workload cost roughly the same.
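The cost claim above follows from simple arithmetic: if doubling the node count roughly halves the time a workload takes, the node-hours consumed stay about the same. The sketch below illustrates this under stated assumptions; the node count, hourly price, and runtime are hypothetical placeholders, not figures from the benchmark, and perfectly linear scaling is assumed.

```python
def workload_cost(nodes, price_per_node_hour, runtime_hours):
    """Total cost of a workload: node-hours consumed times the hourly price."""
    return nodes * price_per_node_hour * runtime_hours


# Hypothetical baseline values, chosen only for illustration.
base_nodes = 10
price_per_node_hour = 2.0   # assumed hourly price per node
base_runtime_hours = 4.0    # assumed time to finish the multi-user workload

base_cost = workload_cost(base_nodes, price_per_node_hour, base_runtime_hours)

# Assumption: doubling the cluster doubles throughput, so the same
# workload finishes in half the time.
scaled_cost = workload_cost(2 * base_nodes, price_per_node_hour, base_runtime_hours / 2)

print(f"baseline cost: {base_cost:.2f}")
print(f"doubled-cluster cost: {scaled_cost:.2f}")
```

In practice scaling is rarely perfectly linear, which is why the total cost is described as only roughly the same.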
With public-cloud deployments becoming increasingly popular, Cloudera is continuing to build out the capabilities of its platform to take full advantage of the cost-effective and flexible nature of the cloud. The current release of Cloudera’s platform (5.8) includes a major step forward in that area: Impala 2.6 can now store data in the Amazon S3 object store and query it there directly.