A Guide to Learning with Limited Labeled Data

Categories: AI and Machine Learning Fast Forward Labs

This was originally published on the Fast Forward Labs blog.

We are excited to release Learning with Limited Labeled Data, the latest report and prototype from Cloudera Fast Forward Labs.

Learning with limited labeled data relaxes the stringent labeled-data requirement of supervised machine learning. Our report focuses on active learning, a technique in which machines and humans collaborate to choose which examples are worth labeling.
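To make the idea concrete, here is a minimal sketch of one common active learning strategy, uncertainty sampling: the model scores the unlabeled examples, and a human is asked to label the ones it is least sure about. This is an illustration only, not necessarily the approach used in the report; the function names, example ids, and probabilities are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a class-probability list; higher means less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(predictions, batch_size):
    """Return the ids of the `batch_size` examples the model is least sure about.

    `predictions` maps an example id to the model's predicted class
    probabilities for that (still unlabeled) example.
    """
    ranked = sorted(predictions, key=lambda ex: entropy(predictions[ex]), reverse=True)
    return ranked[:batch_size]


# Hypothetical model outputs for four unlabeled documents (binary classifier).
predictions = {
    "doc_a": [0.98, 0.02],  # model is confident
    "doc_b": [0.55, 0.45],  # model is unsure
    "doc_c": [0.80, 0.20],
    "doc_d": [0.50, 0.50],  # model is maximally unsure
}

# These are the examples a human would be asked to label next.
to_label = select_for_labeling(predictions, batch_size=2)
print(to_label)  # ['doc_d', 'doc_b']
```

In a full loop, the newly labeled examples would be added to the training set, the model retrained, and the selection repeated until the labeling budget runs out.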


What’s New in Cloudera Altus Director 6.2?

Categories: CDH Cloud Cloudera Director

Cloudera Altus Director helps you deploy, scale, and manage Cloudera clusters on AWS, Microsoft Azure, or Google Cloud Platform. Altus Director both enables and enforces the best practices of big data deployments and cloud infrastructure. Altus Director’s enterprise-grade features deliver a mechanism for establishing production-ready clusters in the cloud for big data workloads and applications in a simple, reliable, automated fashion. In this post, you will learn about new functionality and changes in release 6.2.


What’s new in the Hue Data Warehouse Editor in Cloudera 6.2

Categories: Analytic Database Hue

Self-service exploratory analytics is one of the most common use cases we see among customers running Cloudera’s Data Warehouse solution.

With the recent release of Cloudera 6.2, we continue to improve the end-user query experience with Hue, focusing on easier SQL query troubleshooting and increased compatibility with Hive. Read on to learn more and try it out in one click at demo.gethue.com.

Easier Self-Service Query Troubleshooting

Hue has great assistance for finding tables in the Data Catalog and getting recommendations on how to write (better) queries with the smart autocomplete,


Testing Apache Kudu Applications on the JVM

Categories: Kudu Testing

Although the Kudu server is written in C++ for performance and efficiency, developers can write client applications in C++, Java, or Python. To make it easier for Java developers to create reliable client applications, we’ve added new utilities in Kudu 1.9.0 that allow you to write tests using a Kudu cluster without needing to build Kudu yourself, without any knowledge of C++, and without any complicated coordination around starting and stopping Kudu clusters for each test.


Transparent Hierarchical Storage Management with Apache Kudu and Impala

Categories: CDH Impala Kudu Parquet

When picking a storage option for an application, it is common to pick the single option whose features best fit your use case. For mutability and real-time analytics workloads, you may want to use Apache Kudu, but for massive scalability at a low cost, you may want to use HDFS. For that reason, there is a need for a solution that allows you to leverage the best features of multiple storage options.
