Author Archives: Sean Owen

Why Apache Spark is a Crossover Hit for Data Scientists

Categories: Data Science, Spark, Use Case

Spark is a compelling multi-purpose platform for use cases that span both investigative and operational analytics.

Data science is a broad church. I am a data scientist — or so I’ve been told — but what I do is actually quite different from what other “data scientists” do. For example, there are those practicing “investigative analytics” and those implementing “operational analytics.” (I’m in the second camp.)

Data scientists performing investigative analytics use interactive statistical environments like R to perform ad-hoc,

Read more

Myrrix Joins Cloudera to Bring "Big Learning" to Hadoop

Categories: Data Science, Hadoop, Mahout

What a short, strange trip it’s been. Just a year ago, I founded Myrrix in London’s Silicon Roundabout to commercialize large-scale machine learning based on Apache Hadoop and Apache Mahout. It’s been a busy scramble, building software and proudly watching early customers get real, big data-sized machine learning into production.

And now another beginning: Myrrix has a new home in Cloudera. I’m excited to join as Director of Data Science in London,

Read more