Since the launch of sparklyr, working with Apache Spark in Apache Hadoop has become much easier for R users. sparklyr contains a dplyr interface into Spark and allows users to leverage crucial machine learning algorithms from Spark MLlib and H2O Sparkling Water. This greatly reduces the barrier of entry for R users in adopting Spark as a tool for big data and should go a long way in enabling R workloads to migrate to Hadoop.
Get started with scalable graph analysis via simple examples that utilize GraphFrames and Spark SQL on HDFS.
Graphs—also known as “networks”—are ubiquitous across web applications. As a refresher, a graph consists of nodes and edges. A node can be any object, such as a person or an airport, and an edge is a relation between two nodes, such as a friendship or an airline connection between two cities.
Earlier this week, RStudio announced sparklyr, a new package that provides an interface between R and Apache Spark. We republish RStudio’s blog post below (see original) for your convenience.
Over the past couple of years we’ve heard time and time again that people want a native dplyr interface to Spark, so we built one! sparklyr also provides interfaces to Spark’s distributed machine learning algorithms and much more.
Can using simple statistical techniques in combination with big data help solve the Tamam Shud mystery?
Everyone loves a good real-life mystery. That’s why the three most popular TV shows of the 80s and 90s were Jack Palance’s reboot of Ripley’s Believe It or Not!, Unsolved Mysteries with Robert Stack, and Beyond Belief: Fact or Fiction hosted by Commander Riker.
The following post (Part 2 of two parts) by Vik Paruchuri, founder of data science learning platform Dataquest, offers some detailed and instructive insight about data science workflow (regardless of the tech stack involved, but in this case, using Python). We re-publish it here for your convenience.
Before we dive into exploring the data [see Part 1 for steps relating to data preparation], we’ll want to set the context,