Search Archives - Page 3 of 4

May 11, 2017 | Technical

How-to: Backup and disaster recovery for Apache Solr (part I)

Cloudera Search (that is, Apache Solr integrated with the Apache Hadoop ecosystem) now supports (as of C5.9) a backup and disaster recovery capability for Solr collections. In this post we will cover the basics of the backup and disaster recovery capability in Solr and hence in Cloudera Search. In the next post we will cover […]

by Hrishikesh Gadre 5 min read

Apache Hadoop Search

March 28, 2017 | Technical

How-to: Log Analytics with Solr, Spark, OpenTSDB and Grafana

Organizations analyze logs for a variety of reasons. Some typical use cases include predicting server failures, analyzing customer behavior, and fighting cybercrime. However, one of the most overlooked use cases is to help companies write better software. In this digital age, most companies write applications, be it for their employees or external users. The cost […]

by Michael Sun, Jeffrey Shmain 9 min read

January 10, 2017 | Technical

How-to: Fuzzy Name Indexing in Apache Hadoop with Rosette and Cloudera Search

In this guide, learn how to use Cloudera Search with Basis Technology’s Rosette® to perform fuzzy name searches in multiple languages and scripts. Our thanks to the Basis Technology team (Jeanne Le Garrec, Hannah MacKenzie-Margulies and Brian Sawyer) for supporting the writing of this how-to blog. Cloudera Search, powered by Apache Solr, brings full-text, interactive search, and scalable […]

by Cloudera 4 min read

Cloudera Enterprise Search

October 11, 2016 | Business

How-to: Secure Apache Solr Collections and Access Them Programmatically

Learn how to secure your Solr data in a policy-based, fine-grained way. Data security is more important than ever before. At the same time, risk is increasing due to the relentlessly growing number of device endpoints, the continual emergence of new types of threats, and the commercialization of cybercrime. And with Apache Hadoop already instrumental […]

by Jan Kunigk, Paul Wilkinson 6 min read

Apache Sentry Search Security, Risk, & Compliance

July 29, 2016 | Technical

How-to: Ingest Email into Apache Hadoop in Real Time for Analysis

Apache Hadoop is a proven platform for long-term storage and archiving of structured and unstructured data. Related ecosystem tools, such as Apache Flume and Apache Sqoop, allow users to easily ingest structured and semi-structured data without requiring the creation of custom code. Unstructured data, however, is a more challenging subset of data that typically lends […]

by Jordan Volz, Stefan Salandy 10 min read

Apache Flume Apache Hadoop Apache Kafka Apache Spark Data Ingestion Search

May 19, 2016 | Technical

How-to: Process and Index Medical Images with Apache Hadoop and Apache Solr

Thanks to Karthik Vadla, Abhi Basu, and Monica Martinez-Canales of Intel Corp. for the following guest post about using CDH for cost-effective processing/indexing of DICOM (medical) images. Medical imaging has rapidly become the best non-invasive method to evaluate a patient and determine whether a medical condition exists. Imaging is used to assist in the diagnosis […]

by Cloudera 7 min read

Apache ZooKeeper Cloudera Enterprise Search

October 15, 2015 | Technical

How-to: Index Scanned PDFs at Scale Using Fewer Than 50 Lines of Code

Learn how to use OCR tools, Apache Spark, and other Apache Hadoop components to process PDF images at scale. Optical character recognition (OCR) technologies have advanced significantly over the last 20 years. However, during that time, there has been little or no effort to marry OCR with distributed architectures such as Apache Hadoop to process […]

by Jeffrey Shmain 13 min read

Apache HBase Apache Spark Search

February 27, 2015 | Technical

How-to: Do Real-Time Log Analytics with Apache Kafka, Cloudera Search, and Hue

Cloudera recently announced formal support for Apache Kafka. This simple use case illustrates how to make web log analysis, powered in part by Kafka, one of your first steps in a pervasive analytics journey. If you are not looking at your company’s operational logs, then you are at a competitive disadvantage in your industry. Web […]

by Gwen Shapira, Jeffrey Shmain 9 min read

Apache Kafka Hue Search

July 23, 2014 | Technical

New in CDH 5.1: Document-level Security for Cloudera Search

Cloudera Search now supports fine-grained access control via document-level security provided by Apache Sentry. In my previous blog post, you learned about index-level security in Apache Sentry (incubating) and Cloudera Search. Although index-level security is effective when the access control requirements for documents in a collection are homogeneous, often administrators want to restrict access to […]

by Gregory Chanan 4 min read

Apache Hadoop Apache Sentry Apache Solr Hue Cloudera Enterprise Search

November 19, 2013 | Technical

How-to: Add Cloudera Search to Your Cluster using Cloudera Manager

Cloudera Manager 4.7 added support for managing Cloudera Search 1.0. Thus, Cloudera Manager users can easily deploy all components of Cloudera Search (including Apache Solr) and manage all related services, just like every other service included in CDH (Cloudera’s distribution of Apache Hadoop and related projects). In this how-to, you will learn the steps involved […]

by Vikram Srivastava 4 min read

Apache HBase Cloudera Manager Search
