Apache Solr Archives - Cloudera Blog

November 25, 2020 | Technical

How a Discovery Data Warehouse, the next evolution of augmented analytics, accelerates treatments and delivers medicines safely to patients in need

I met Matthew in New York City about a year ago. We sat in a private conference room and he told me the story of his pharma startup. A small group of researchers set out to solve the black-box enigma of certain kinds of vicious cancers. There are so many cancers, so their vision was […]

by Cloudera 7 min read

October 28, 2020 | Technical

Log Reduction Techniques with CFM

Cloudera service logs offer a breadth of information to assist in cluster maintenance, from security checks and auditing tasks to validation for performance tuning and testing, to name a few. However, log records generated by these services do not hold the same value for every organisation. For example, Cyber teams may find […]

by Cloudera, Pierre Villard 7 min read

Apache NiFi Apache Solr Cloudera Data Platform Cloudera Enterprise DataFlow

October 15, 2020 | Technical

How-to: Index Data from S3 via NiFi Using CDP Data Hubs

About this Blog: Data Discovery and Exploration (DDE) was recently released in tech preview in Cloudera Data Platform in public cloud. In this blog we will go through the process of indexing data from S3 into Solr in DDE with the help of NiFi in Data Flow. The scenario is the same as it was […]

by Cloudera, Geza Nagy, Miklos Kertesz 8 min read

September 15, 2020 | Technical

Access control for Azure ADLS cloud object storage

CDP for Azure introduces fine-grained authorization for access to Azure Data Lake Storage using Apache Ranger policies. Cloudera and Microsoft have been working together closely on this integration, which greatly simplifies the security administration of access to ADLS-Gen2 cloud storage. Apache Ranger provides a centralized console to manage authorization and view audits of access to […]

by Madhan Neethiraj 5 min read

Apache Atlas Apache HDFS Apache Hive Apache Kafka Apache Ranger Apache Solr Cloud SDX Technologies Security, Risk, & Compliance

September 9, 2020 | Technical

How-to: Index Data from S3 Using CDP Data Hub

This blog post presents a simple “hello world” example of how to get data stored in S3 indexed and served by an Apache Solr service hosted in a Data Discovery and Exploration cluster in CDP. For the curious: DDE is a pre-templated Solr-optimized cluster deployment option in CDP, and recently […]

by Cloudera, Geza Nagy, Miklos Kertesz 7 min read

September 1, 2020 | Technical

Discover and Explore Data Faster with the CDP DDE Template

From A to Z in 10 minutes! If you have previous experience with setting up, sizing, and deploying a distributed search engine service, it is hard to believe that this is possible. Imagine how many times IT has lost valuable time spending hours trying to understand Apache Solr application requirements and map them into how to […]

by Cloudera 7 min read

February 20, 2020 | Technical

Real-time log aggregation with Apache Flink Part 2

Introduction: We are continuing our blog series about implementing real-time log aggregation with the help of Flink. In the first part of the series we reviewed why it is important to gather and analyze logs from long-running distributed jobs in real-time. We also looked at a fairly simple solution for storing logs in Kafka using […]

by Cloudera, Matyas Orhidi, Simon Elliston Ball 8 min read

Apache Flink Apache Kafka Apache Solr Cloudera Data Platform Cloudera Data Science Workbench Data Engineering DataFlow Modernize Architecture Security, Risk, & Compliance Streaming

January 22, 2020 | Technical

Real-time log aggregation with Flink Part 1

Introduction: Many of us have experienced the feeling of hopelessly digging through log files on multiple servers to fix a critical production issue. We can probably all agree that this is far from ideal. Locating and searching log files is even more challenging when dealing with real-time processing applications where the debugging process itself can […]

by Cloudera, Matyas Orhidi, Simon Elliston Ball 8 min read

Apache Flink Apache Kafka Apache Solr Hue DataFlow Modernize Architecture Security, Risk, & Compliance Streaming

October 22, 2018 | Technical

Enterprise Search with HDP Search

This blog post was published on Hortonworks.com before the merger with Cloudera. Some links, resources, or references may no longer be accurate. We are excited to announce the immediate availability of HDP Search 4.0. As you are aware, HDP Search offers a performant, scalable, and fault-tolerant enterprise search solution. With HDP Search 4.0, we have added […]

by Cloudera 3 min read

Apache Solr

July 23, 2014 | Technical

New in CDH 5.1: Document-level Security for Cloudera Search

Cloudera Search now supports fine-grained access control via document-level security provided by Apache Sentry. In my previous blog post, you learned about index-level security in Apache Sentry (incubating) and Cloudera Search. Although index-level security is effective when the access control requirements for documents in a collection are homogeneous, often administrators want to restrict access to […]

by Cloudera 4 min read

Apache Hadoop Apache Sentry Apache Solr Hue Cloudera Enterprise Search
