Configuring Apache Solr memory properly is critical for production system stability and performance. It can be hard to find the right balance between competing goals. There are also multiple factors, implicit or explicit, that need to be taken into consideration. This blog talks about some common tasks in memory tuning and guides you through the process to help you understand how to configure Solr memory for a production system.
For simplicity, this blog applies to Solr in Cloudera CDH 5.11 running on top of HDFS.
Cloudera Search (that is, Apache Solr integrated with the Apache Hadoop ecosystem) now supports, as of C5.9, a backup and disaster recovery capability for Solr collections.
In this post we will cover the basics of the backup and disaster recovery capability in Solr and hence in Cloudera Search. In the next post we will cover the design of the Solr snapshots functionality and its integration with the Hadoop ecosystem as well as public cloud platforms (e.g.
In this guide, learn how to use Cloudera Search with Basis Technology’s Rosette® to perform fuzzy name searches in multiple languages and scripts.
Our thanks to the Basis Technology team (Jeanne Le Garrec, Hannah MacKenzie-Margulies, and Brian Sawyer) for their support in writing this how-to blog.
Cloudera Search, powered by Apache Solr, brings full-text, interactive search, and scalable indexing to Apache Hadoop by marrying SolrCloud with HDFS, Apache HBase,
Solr 5 includes a completely rewritten faceted search and analytics module with a structured JSON API to control the faceting and analytics commands. Here’s how it works.
Since I joined Cloudera a few years ago to help bring search-powered analytics to Cloudera’s platform, I’ve been working actively upstream alongside the rest of the Solr community to develop new functionality that will drive more interesting applications on Cloudera Search (which is based on an integration of Solr with the Apache Hadoop ecosystem).
Learn how to secure your Solr data in a policy-based, fine-grained way.
Data security is more important than ever before. At the same time, risk is increasing due to the relentlessly growing number of device endpoints, the continual emergence of new types of threats, and the commercialization of cybercrime. And with Apache Hadoop already instrumental in supporting the growth of data volumes that fuel mission-critical enterprise workloads, mastering the available security mechanisms is vital for organizations participating in that paradigm shift.