After the GA of Apache Kudu in Cloudera CDH 5.10, we take a look at the Apache Spark on Kudu integration, share code snippets, and explain how to get up and running quickly, as Kudu is already a first-class citizen in Spark’s ecosystem.
As the Apache Kudu development team celebrates the initial 1.0 release, launched on September 19, the most recent 1.2.0 version is now GA as part of Cloudera’s CDH 5.10 release.
A common design pattern emerges when teams begin to stitch together existing systems and an EDH cluster: file dumps, typically in a format like CSV, are regularly uploaded to the EDH, where they are unpacked, transformed into an optimal query format, and tucked away in HDFS for various EDH components to use. When these file dumps are large or arrive frequently, these simple steps can significantly slow down an ingest pipeline. Part of this delay is inevitable;
Cloudera considers the handling and reporting of security vulnerabilities a very serious matter. In this post, learn the processes involved.
In addition to expecting enterprise-class standards for stability and reliability, Cloudera’s customers also have expectations for industry-standard processes around the discovery, fix, and reporting of security issues. In this post, I will describe how Cloudera addresses such issues in our software.
The flowchart below gives an overview of the process:
The first step in the life cycle of a security vulnerability is that it is discovered and reported to Cloudera.
Cloudera has given its documentation set a facelift, and we think you’ll like the new look. We use more whitespace and a font that is easier to read and skim, and pages load much faster. But the improvements go beyond the merely aesthetic.
While electronic documentation has been around for decades, most online documentation is still presented as if it were printed in books. There is a table of contents that assumes you will read the content from start to finish.
Today, Cloudera is excited to introduce the industry’s first Big Data Encabulator implementation. Those very, very few of you who don’t know what that means can learn more by viewing our technical webinar series; Part 1 featuring our brightest data science minds is provided below!