In this guide, learn how to use Cloudera Search with Basis Technology’s Rosette® to perform fuzzy name searches in multiple languages and scripts.
Our thanks to the Basis Technology team (Jeanne Le Garrec, Hannah MacKenzie-Margulies, and Brian Sawyer) for their support in writing this how-to blog.
Cloudera Search, powered by Apache Solr, brings full-text, interactive search, and scalable indexing to Apache Hadoop by marrying SolrCloud with HDFS, Apache HBase,
This new release adds support for Amazon EBS volumes and the ability to diagnose cluster bootstrap errors quickly.
Cloudera Director provides a simple, reliable, enterprise-grade way to deploy, scale, and manage Apache Hadoop in the cloud of your choice. Cloudera Director enables you to deploy production-ready clusters for big data applications and successfully run workloads in the cloud.
Cloudera Director makes it easier for customers to:
- Deploy clusters in line with patterns native to cloud infrastructure
- Use a single interface to define the desired cluster specification in one place, all the way down to the operating system
- Repeatedly and programmatically instantiate these cluster definitions
- Adapt to the dynamic nature of cloud infrastructure
Cloudera Director 2.2 provides additional mechanisms to get that initial cluster definition right and the ability to diagnose errors and iterate quickly.
Contributors from Intel, Cloudera, and the rest of the community have been making strong progress on the Hive-on-Spark initiative. This post provides an update.
[Editor’s note (April 20, 2016): Hive-on-Spark is now GA/shipping starting in CDH 5.7.]
Since its inception about one year ago, the community initiative to make Apache Spark a data processing engine for Apache Hive (HIVE-7292) has attracted widespread interest from developers around the world and gone through phases of rapid development,
Now there’s an even quicker “QuickStart” option for getting hands-on with the Apache Hadoop ecosystem and Cloudera’s platform: a new Docker image.
You might already be familiar with Cloudera’s popular QuickStart VM, a virtual image containing our distributed data processing platform. Originally intended as a demo environment, the QuickStart VM evolved over time into quite a useful general-purpose environment for developers, customers,
Cloudera has announced support for Spark SQL/DataFrame API and MLlib. This post explains their benefits for app developers, data analysts, data engineers, and data scientists.
In July 2015, Cloudera re-affirmed its position since 2013: that Apache Spark is on course to replace MapReduce as the default general-purpose data processing engine for Apache Hadoop. Thanks to initiatives like the One Platform Initiative,