Tag Archives: secondary sort

How-to: Tune Your Apache Spark Jobs (Part 1)

Categories: How-to Spark

Learn techniques for tuning your Apache Spark jobs for optimal efficiency.

When you write Apache Spark code and page through the public APIs, you come across words like transformation, action, and RDD. Understanding Spark at this level is vital for writing Spark programs. Similarly, when things start to fail, or when you venture into the web UI to try to understand why your application is taking so long,

Read more

Simple Moving Average, Secondary Sort, and MapReduce (Part 3)

Categories: General Hadoop

This is the final post in a three-part blog series. If you would like to read the previous parts, please use the following links:

Part 1 – A Simple Moving Average in Excel

Part 2 – A Simple Moving Average in R

Previously, I explained how to use Excel and R as analysis tools to calculate the simple moving average of a small set of stock closing prices.

Read more

Simple Moving Average, Secondary Sort, and MapReduce (Part 2)

Categories: General Hadoop MapReduce

This is the second post in a three-part blog series. If you would like to read “Part 1,” please follow this link. In this post, we review the simple moving average in contexts that should be familiar to analysts not well versed in Hadoop, to establish common ground with the reader from which we can move forward.

A Quick Primer on Simple Moving Average in Excel

Let’s quickly review how a simple moving average is defined in an Excel spreadsheet.
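For readers who think in code rather than spreadsheets, the same calculation can be sketched in a few lines of Java. This is only an illustrative sketch, not the code from the series: the class name `SimpleMovingAverage`, the method `compute`, the window size of 3, and the sample prices are all assumptions made up for this example. It averages each run of `window` consecutive values, much as an `AVERAGE` formula dragged down a column would.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not taken from the blog series.
public class SimpleMovingAverage {

    // Returns one average per full window, so the result has
    // prices.length - window + 1 entries.
    public static double[] compute(double[] prices, int window) {
        if (window <= 0 || window > prices.length) {
            throw new IllegalArgumentException("window must be between 1 and prices.length");
        }
        double[] averages = new double[prices.length - window + 1];
        double sum = 0.0;
        for (int i = 0; i < prices.length; i++) {
            sum += prices[i];                 // value entering the window
            if (i >= window) {
                sum -= prices[i - window];    // value leaving the window
            }
            if (i >= window - 1) {
                averages[i - window + 1] = sum / window;
            }
        }
        return averages;
    }

    public static void main(String[] args) {
        // Made-up closing prices, used only to exercise the method.
        double[] closes = {10.0, 11.0, 12.0, 13.0, 14.0};
        System.out.println(Arrays.toString(compute(closes, 3)));
        // prints [11.0, 12.0, 13.0]
    }
}
```

The running sum avoids re-adding every value in each window, which matters little for a handful of prices but keeps the loop linear for longer series.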

Read more

Simple Moving Average, Secondary Sort, and MapReduce (Part 1)

Categories: General Hadoop MapReduce

Intro

In this three-part blog series, I want to look at how we would compute a simple moving average with MapReduce and Apache Hadoop. This series is meant to show how to translate a common Excel or R function into MapReduce Java code, with accompanying working code and data to play with. Most analysts can take a few months of stock data and produce an Excel spreadsheet that shows a moving average,

Read more