Category Archives: Search

How-to: Install Cloudera Manager and Cloudera Search with Ansible

Categories: Cloudera Manager Guest Ops and DevOps Search

The following guest post is re-published here courtesy of Gerd König, a System Engineer with YMC AG. Thanks, Gerd!

Cloudera Manager is a great tool to orchestrate your CDH-based Apache Hadoop cluster. You can use it from cluster installation, deploying configurations, restarting daemons to monitoring each cluster component. Starting with version 4.6, the manager supports the integration of Cloudera Search, which is currently in Beta state.

Read more

Introducing Morphlines: The Easy Way to Build and Integrate ETL Apps for Hadoop

Categories: Hadoop Kite SDK Search

This post is the first in a series of blog posts about Cloudera Morphlines, a new command-based framework that simplifies data preparation for Apache Hadoop workloads. To check it out or help contribute, you can find the code here.

Cloudera Morphlines is a new open source framework that reduces the time and effort necessary to integrate, build, and change Hadoop processing applications that extract, transform, and load data into Apache Solr,

Read more

The Blur Project: Marrying Hadoop with Lucene

Categories: Hadoop Search

Doug Cutting’s recent post about Cloudera Search included a hat-tip to Aaron McCurry, founder of the Blur project, for inspiring some of its design principles. We thought you would be interested in hearing more about Blur (which is mentored by Doug and Cloudera’s Patrick Hunt) from Aaron himself – thanks, Aaron, for the guest post below!

Blur is an Apache Incubator project that provides distributed search functionality on top of Apache Hadoop,

Read more

Hadoop for Everyone: Inside Cloudera Search

Categories: Hadoop Search

CDH, Cloudera’s 100% open source distribution of Apache Hadoop and related projects, has successfully enabled Big Data processing for many years. The typical approach is to ingest a large set of a wide variety of data into HDFS or Apache HBase for cost-efficient storage and flexible, scalable processing. Over time, various tools to allow for easier access have emerged — so you can now interact with Hadoop through various programming methods and the very familiar structured query capabilities of SQL.

Read more

QuickStart VM: Now with Real-Time Big Data

Categories: Hadoop Hue Impala QuickStart VM Search Training

For years, Cloudera has provided virtual machines that give you a working Apache Hadoop environment out-of-the-box. It’s the quickest way to learn and experiment with Hadoop right from your desktop.

We’re constantly updating and improving the QuickStart VM, and in the latest release there are two of Cloudera’s new products that give you easier and faster access to your data: Cloudera Search and Cloudera Impala.

Read more