Tag Archives: Big Data

Informatica Big Data Management on Cloudera Altus

Categories: CDH Cloud

Today, we’re really excited to announce the latest innovation from Cloudera and Informatica’s partnership. Companies are increasingly moving their data operations into the cloud. With both companies focusing on helping customers derive business insights out of vast amounts of data, our new joint offering will dramatically simplify leveraging cloud-native infrastructures for big data analytics.

Last May, Cloudera announced Cloudera Altus, a new platform-as-a-service (PaaS) offering in the cloud for big data analytics,

Read more

Big Data Architecture Workshop

Categories: Training

Since the birth of big data, Cloudera University has been teaching developers, administrators, analysts, and data scientists how to use big data technologies. We have taught over 50,000 folks all of the details of using technologies from Apache such as HDFS, MapReduce, Hive, Impala, Sqoop, Flume, Kafka, Core Spark, Spark SQL, Spark Streaming, and Spark MLlib.

For administrators we’ve taught them how to plan, install, monitor, and troubleshoot clusters. For analysts we have shown them the power of SQL over large, diverse data sets.

Read more

Latest Impala Cookbook

Categories: Impala

Over the past year (and through several releases), Apache Impala (incubating) has added numerous new features and performance enhancements better enabling high-performance SQL analytics over big data.  Thus, it is time again for an update to the Impala cookbook, which contains best practices for these new features, updated guidelines, and more detailed examples.

Note: This cookbook does not yet capture best practices for the major new advancements available with the recent GA of Kudu.

Read more

Impala’s Next Step: Proposal to Join the Apache Software Foundation

Categories: Impala Kudu

The Impala project has already passed several important milestones on the way to its status as the leader and open standard for BI and SQL analytics on modern big data architecture. Today’s milestone is the submission of proposals for Impala and Kudu to join the Apache Software Foundation (ASF) Incubator.

[Update: Read the text of the Impala and Kudu proposals here and here, respectively.]

Since its initial release nearly five years ago,

Read more

How-to: Build a Complex Event Processing App on Apache Spark and Drools

Categories: HBase How-to Kafka Spark Use Case

Combining CDH with a business execution engine can serve as a solid foundation for complex event processing on big data.

Event processing involves tracking and analyzing streams of data from events to support better insight and decision making. With the recent explosion in data volume and diversity of data sources, this goal can be quite challenging for architects to achieve.

Complex event processing (CEP) is a type of event processing that combines data from multiple sources to identify patterns and complex relationships across various events.

Read more