Category Archives: Platform Security & Cybersecurity

With Sentry, Cloudera Fills Hadoop’s Enterprise Security Gap

Categories: Hadoop Hive Impala Platform Security & Cybersecurity

Every day, more data, users, and applications are accessing ever-larger Apache Hadoop clusters. Although this is good news for data driven organizations overall, for security administrators and compliance officers, there are still lingering questions about how to enable end-users under existing Hadoop infrastructure without compromising security or compliance requirements.

While Hadoop has strong security at the filesystem level, it lacks the granular support needed to adequately secure access to data by users and BI applications.

Read more

How HiveServer2 Brings Security and Concurrency to Apache Hive

Categories: Hadoop Hive Platform Security & Cybersecurity

Apache Hive was one of the first projects to bring higher-level languages to Apache Hadoop. Specifically, Hive enables the legions of trained SQL users to use industry-standard SQL to process their Hadoop data.

However, as you probably have gathered from all the recent community activity in the SQL-over-Hadoop area, Hive has a few limitations for users in the enterprise space. Until recently, two in particular – concurrency and security – were largely unaddressed.

Read more

Cloudera Search over Apache HBase: A Story of Collaboration

Categories: HBase Platform Security & Cybersecurity

Thanks to Steven Noels, SVP of Products for NGDATA, for the guest post below.

NGDATA builds and sells Lily, the next-generation Customer Intelligence Platform that helps enterprise marketing teams collect and store customer interaction data in order to profile, segment, and present better offers. We designed Lily from the ground up to run on Apache HBase and Apache Solr. Combining these technologies with our deep marketing segmentation expertise and unique machine learning techniques we’re able to deliver interactive data management,

Read more

How-to: Set Up a Hadoop Cluster with Network Encryption

Categories: CDH Hadoop How-to Platform Security & Cybersecurity

Hadoop network encryption is a feature introduced in Apache Hadoop 2.0.2-alpha and in CDH4.1.

In this blog post, we’ll first cover Hadoop’s pre-existing security capabilities. Then, we’ll explain why network encryption may be required. We’ll also provide some details on how it has been implemented. At the end of this blog post, you’ll get step-by-step instructions to help you set up a Hadoop cluster with network encryption.

A Bit of History on Hadoop Security

Starting with Apache Hadoop 0.20.20x and available in Hadoop 1 and Hadoop 2 releases (as well as CDH3 and CDH4 releases),

Read more

How-to: Enable User Authentication and Authorization in Apache HBase

Categories: HBase How-to Platform Security & Cybersecurity

With the default Apache HBase configuration, everyone is allowed to read from and write to all tables available in the system. For many enterprise setups, this kind of policy is unacceptable. 

Administrators can set up firewalls that decide which machines are allowed to communicate with HBase. However, machines that can pass the firewall are still allowed to read from and write to all tables.  This kind of mechanism is effective but insufficient because HBase still cannot differentiate between multiple users that use the same client machines,

Read more