Tag Archives: devops

How-to: Prepare Your Apache Hadoop Cluster for PySpark Jobs

Categories: CDH Hadoop How-to Spark

Proper configuration of your Python environment is a critical pre-condition for using Apache Spark’s Python API.

One of the most enticing aspects of Apache Spark for data scientists is its API support for non-JVM languages: Python (via PySpark) and R (via SparkR). There are a few reasons these language bindings have generated a lot of excitement: most data scientists think writing Java or Scala is a drag,

HBaseCon 2015: Call for Papers and Early Bird Registration

Categories: Community Events HBase

HBaseCon 2015 is ON, people! Book Thursday, May 7, in your calendars.

If you’re a developer in Silicon Valley, you probably already know that since its debut in 2012, HBaseCon has been one of the best developer community conferences out there. If you’re not, this is a great opportunity to find that out for yourself: HBaseCon 2015 takes place on Thursday, May 7, 2015, at the Westin St. Francis on Union Square in San Francisco.

Doing DevOps with Cloudera Manager

Categories: Cloudera Manager General Ops and DevOps

More and more customers are using automation/configuration management frameworks alongside Cloudera Manager.

As Apache Hadoop clusters continue to grow in size, complexity, and business importance as the foundational infrastructure for an Enterprise Data Hub, the use cases for a robust and mature management console expand.

As those clusters become larger and more complex, many operators are looking to use configuration management/automation frameworks like Ansible,

HBase Training: Demystifying Real-Time Big Data Storage

Categories: HBase Training

We at Cloudera University have been busy lately, building and expanding our courses to help data professionals succeed. We’ve expanded the Hadoop Administrator course and created a new Data Analyst course. Now we’ve updated and relaunched our course on Apache HBase to help more organizations adopt Hadoop’s real-time Big Data store as a competitive advantage.

The course is designed to make sure developers and administrators with an HBase use case can start realizing value from day one.

How-to: Install Cloudera Manager and Cloudera Search with Ansible

Categories: Cloudera Manager Guest Ops and DevOps Search

The following guest post is re-published here courtesy of Gerd König, a System Engineer with YMC AG. Thanks, Gerd!

Cloudera Manager is a great tool for orchestrating your CDH-based Apache Hadoop cluster. You can use it for everything from cluster installation, configuration deployment, and daemon restarts to monitoring each cluster component. Starting with version 4.6, Cloudera Manager supports integration with Cloudera Search, which is currently in beta.
