Tag Archives: Big Data

New in Cloudera Data Science Workbench 1.2: Usage Monitoring for Administrators

Categories: CDH Cloudera Data Science Workbench Data Science Performance

Cloudera Data Science Workbench (CDSW) provides data science teams with a self-service platform for quickly developing machine learning workloads in their preferred language, with secure access to enterprise data and simple provisioning of compute. Individuals can request schedulable resources (e.g. compute, memory, GPUs) on a shared cluster that is managed centrally.

While self-service provisioning of resources is critical to the rapid interaction cycle of data scientists, it can pose a challenge to administrators.

Read more

Informatica Big Data Management on Cloudera Altus

Categories: CDH Cloud

Today, we’re really excited to announce the latest innovation from Cloudera and Informatica’s partnership. Companies are increasingly moving their data operations into the cloud. With both companies focusing on helping customers derive business insights out of vast amounts of data, our new joint offering will dramatically simplify leveraging cloud-native infrastructures for big data analytics.

Last May, Cloudera announced Cloudera Altus, a new platform-as-a-service (PaaS) offering in the cloud for big data analytics,

Read more

Big Data Architecture Workshop

Categories: Training

Since the birth of big data, Cloudera University has been teaching developers, administrators, analysts, and data scientists how to use big data technologies. We have taught over 50,000 folks all of the details of using technologies from Apache such as HDFS, MapReduce, Hive, Impala, Sqoop, Flume, Kafka, Core Spark, Spark SQL, Spark Streaming, and Spark MLlib.

For administrators we’ve taught them how to plan, install, monitor, and troubleshoot clusters. For analysts we have shown them the power of SQL over large, diverse data sets.

Read more

Latest Impala Cookbook

Categories: Impala

Over the past year (and through several releases), Apache Impala (incubating) has added numerous new features and performance enhancements better enabling high-performance SQL analytics over big data.  Thus, it is time again for an update to the Impala cookbook, which contains best practices for these new features, updated guidelines, and more detailed examples.

Note: This cookbook does not yet capture best practices for the major new advancements available with the recent GA of Kudu.

Read more

Impala’s Next Step: Proposal to Join the Apache Software Foundation

Categories: Impala Kudu

The Impala project has already passed several important milestones on the way to its status as the leader and open standard for BI and SQL analytics on modern big data architecture. Today’s milestone is the submission of proposals for Impala and Kudu to join the Apache Software Foundation (ASF) Incubator.

[Update: Read the text of the Impala and Kudu proposals here and here, respectively.]

Since its initial release nearly five years ago,

Read more