Tag Archives: developers

How-to: Use HUE’s Notebook App with SQL and Apache Spark for Analytics

Categories: How-to Hue Spark

This post from the HUE team about using HUE (the open source web GUI for Apache Hadoop), Apache Spark, and SQL for analytics was initially published in the HUE project’s blog.

Apache Spark is getting popular and HUE contributors are working on making it accessible to even more users. Specifically, by creating a Web interface that allows anyone with a browser to type some Spark code and execute it.

Read More

RecordService: For Fine-Grained Security Enforcement Across the Hadoop Ecosystem

Categories: Hadoop Impala Platform Security & Cybersecurity Sentry

This new core security layer provides a unified data access path for all Hadoop ecosystem components, while improving performance.

We’re thrilled to announce the beta availability of RecordService, a distributed, scalable, data access service for unified access control and enforcement in Apache Hadoop. RecordService is Apache Licensed open source that we intend to transition to the Apache Software Foundation. In this post, we’ll explain the motivation, system architecture,

Read More

Making Apache Spark Testing Easy with Spark Testing Base

Categories: Guest Spark

Thanks to Holden Karau (@holdenkarau), Software Engineer at Alpine Data Labs (also a Spark contributor and book author), for providing the following post about her work on new base classes for testing Apache Spark programs.

Testing in the world of Apache Spark has often involved a lot of hand-rolled artisanal code, which frankly is a good way to ensure that developers write as few tests as possible. I’ve been doing some work with Spark Testing Base (also available on Spark Packages) to try and make testing Spark jobs as easy as “normal”

Read More

Meet Cloudera’s Apache Spark Committers

Categories: Community General Meet the Engineer Spark

The super-active Apache Spark community is exerting a strong gravitational pull within the Apache Hadoop ecosystem. I recently had that opportunity to ask Cloudera’s Apache Spark committers (Sean Owen, Imran Rashid [PMC], Sandy Ryza, and Marcelo Vanzin) for their perspectives about how the Spark community has worked and is working together, and the work to be done via the One Platform initiative to make the Spark stack enterprise-ready.

Recently,

Read More

What’s New in Cloudera Director 1.5?

Categories: Cloud

Cloudera Director 1.5 is now available; this post describes what’s inside, including a new open source plugin interface.

Cloudera Director is the manifestation of Cloudera’s commitment to providing a simple and reliable way to deploy, scale, and manage Apache Hadoop in the cloud of your choice. With Cloudera Director 1.5, we continue the story of enabling production-ready clusters and big data applications by focusing on the following themes.

Read More