Category Archives: Impala

Text Mining with Impala

Categories: Guest Impala Use Case

Thanks to Torsten Kilias and Alexander Löser of the Beuth University of Applied Sciences in Berlin for the following guest post about their INDREX project and its integration with Impala for integrated management of textual and relational data.

Textual data is a core source of information in the enterprise. Example demands arise from sales departments (monitor and identify leads), human resources (identify professionals with capabilities in ‘xyz’), market research (campaign monitoring from the social web),

Read More

Using Apache Parquet at AppNexus

Categories: Guest Impala Parquet Performance

Thanks to Chen Song, Data Team Lead at AppNexus, for allowing us to republish the following post about his company’s use case for Apache Parquet (incubating at this writing), the open standard for columnar storage across the Apache Hadoop ecosystem.

At AppNexus, over 2MM log events are ingested into our data pipeline every second. Log records are sent from upstream systems in the form of Protobuf messages. Raw logs are compressed in Snappy when stored on HDFS.

Read More

How-to: Install and Use Cask Data Application Platform Alongside Impala

Categories: How-to Impala Tools

Cloudera customers can now install, launch, and monitor CDAP directly from Cloudera Manager. This post from Nitin Motgi, Cask CTO, explains how.

Today, Cloudera and Cask are very happy to introduce the integration of Cloudera’s enterprise data hub (EDH) with the Cask Data Application Platform (CDAP). CDAP is an integrated platform for developers and organizations to build, deploy, and manage data applications on Apache Hadoop. This initial integration will enable CDAP to be installed,

Read More

How-to: Do Real-time Big Data Discovery using Cloudera Enterprise and Qlik Sense

Categories: Guest How-to Impala

Thanks to Jesus Centeno of Qlik for the post below about using Impala alongside Qlik Sense.

Cloudera and Qlik (which is part of the Impala Accelerator Program) have revolutionized the delivery of insights and value to every business stakeholder for “small data,” to something more powerful in the Big Data world—enabling users to combine Big Data and “small data” to yield actionable business insights.

In this post,

Read More

How-to: Use BIRT with Impala for Interactive Big Data Reporting

Categories: Guest How-to Impala

Thanks to Michael Williams, BIRT Product Evangelist & Forums Manager at analytics software specialist Actuate Corp. (now OpenText), for the guest post below. Actuate is the primary builder and supporter of BIRT, a top-level project of the Eclipse Foundation.

The Actuate (now OpenText) products BIRT Designer Professional and BIRT iHub allow you to connect to multiple data sources to create and deliver meaningful visualizations securely,

Read More