Tag Archives: security

New in Cloudera Enterprise 5.10: Hue SQL Editor and Security Improvements

Categories: Hadoop Hue Oozie

Cloudera Enterprise 5.10 includes the latest updates of Hue, the intelligent editor for SQL developers and analysts.

As part of Cloudera’s continuing investments in user experience and productivity, Cloudera Enterprise 5.10 includes an updated version of Hue. The rest of this post summarizes the main enhancements. (Hue from C5.10 is also available to try in one click at demo.gethue.com.)

SQL Improvements

The Hue editor keeps getting better with these major improvements:

Row Count

The number of rows returned is displayed so you can quickly see the size of the dataset.

Read More

How to secure ‘Internet exposed’ Apache Hadoop

Categories: Hadoop How-to Platform Security & Cybersecurity

You may have heard of the recent (and ongoing) hacks targeting open-source database solutions like MongoDB and Apache Hadoop. From what we know, an unknown number of hackers scanned for internet-accessible installations that had been set up using the default, non-secure configuration. Having found exposed systems, these hackers then accessed them and in some cases deleted the files or held them for ransom.

These attacks were not technologically sophisticated,

Read More

New in Cloudera Enterprise 5.9: S3 Integration and SQL Editor Improvements

Categories: Hadoop Hue

Cloudera Enterprise 5.9 includes the latest release of Hue (3.11), the web UI that makes Apache Hadoop easier to use.

As part of Cloudera’s continuing investments in user experience and productivity, Cloudera Enterprise 5.9 includes a new release of Hue. Hue continues its focus on SQL and now also makes it easier to work with the cloud (specifically Amazon S3 in this first version). The rest of this post summarizes the main improvements.

Read More

Impala’s Next Step: Proposal to Join the Apache Software Foundation

Categories: Impala Kudu

The Impala project has already passed several important milestones on the way to its status as the leader and open standard for BI and SQL analytics on modern big data architecture. Today’s milestone is the submission of proposals for Impala and Kudu to join the Apache Software Foundation (ASF) Incubator.

[Update: Read the text of the Impala and Kudu proposals here and here, respectively.]

Since its initial release nearly five years ago,

Read More

How-to: Index Scanned PDFs at Scale Using Fewer Than 50 Lines of Code

Categories: HBase How-to Search Spark

Learn how to use OCR tools, Apache Spark, and other Apache Hadoop components to process PDF images at scale.

Optical character recognition (OCR) technologies have advanced significantly over the last 20 years. However, during that time, there has been little or no effort to marry OCR with distributed architectures such as Apache Hadoop to process large numbers of images in near-real time.

In this post, you will learn how to use standard open source tools along with Hadoop components such as Apache Spark,

Read More