This blog post presents a simple “hello world” example of how to get data stored in S3 indexed and served by an Apache Solr service hosted in a Data Discovery and Exploration (DDE) cluster in CDP. For the curious: DDE is a pre-templated, Solr-optimized cluster deployment option in CDP, and recently […]
From A to Z in 10 minutes! If you have previous experience setting up, sizing, and deploying a distributed search engine service, it is hard to believe that this is possible. Imagine how many times IT has lost valuable time spending hours trying to understand Apache Solr application requirements and map them into how to […]
Intro: The Cloudera Support Organization has always strived not only to provide solutions to our customers but also to deliver helpful knowledge. One of the primary sources of that knowledge is our Knowledge Articles. This content is created and curated by our knowledgeable Support Staff based on real-world experience coming from support cases. These […]
Recently, Cloudera Fast Forward held a webinar on automated question answering. What is automated question answering, you ask? In its simplest form, it’s a human-machine interaction to extract information from data using human language. This is a pretty broad definition that encapsulates the idea that machines don’t inherently understand human language any more than humans […]
Supply Chain Whiplash: Much has been written about today’s unprecedented manufacturing business environment, and it’s timely and warranted. Even before the late January news out of China, uncertainty was compounded by the direction of the Brexit spin-off, near trade wars with major trading partners piling tariffs on top of tariffs, and an oil war that […]
People intuitively know that self-driving or autonomous cars present complex engineering challenges. Vehicle assembly is the easy part – we’ve been doing that for 100 years. The real challenge is a data challenge, acquiring and managing the data needed to run the vehicles’ brain, eyes, and ears. Autonomous driving technology complexity lies in the ability […]
Cloudera Search is a highly scalable and flexible search solution based on Apache Solr that enables exploration, discovery, and analytics over massive unstructured and semi-structured datasets (for example, logs, emails, DNA strings, claims forms, JPEGs, XLS sheets, etc.). It has been adopted by a large number of Cloudera customers across a wide range of industries for […]
It has been a long and patient wait for Apache Hadoop 3.0 to mature. A major new version of the storage layer obviously impacts all our integrated components, including Apache Solr and all our integrations with the rest of the platform, commonly referred to as Cloudera Search. Since our customers’ Search deployments are so often […]
Configuring Apache Solr memory properly is critical for production system stability and performance. It can be hard to find the right balance between competing goals. There are also multiple factors, implicit or explicit, that need to be taken into consideration. This blog talks about some common tasks in memory tuning and guides you through the […]