Cloudera Developer Blog · Use Case Posts
Spark is a compelling multi-purpose platform for use cases that span investigative, as well as operational, analytics.
Data science is a broad church. I am a data scientist — or so I’ve been told — but what I do is actually quite different from what other “data scientists” do. For example, there are those practicing “investigative analytics” and those implementing “operational analytics.” (I’m in the second camp.)
Cloudera’s own enterprise data hub is yielding great results for providing world-class customer support.
Here at Cloudera, we are constantly pushing the envelope to give our customers world-class support. One of the cornerstones of this effort is the Cloudera Support Interface (CSI), which we’ve described in prior blog posts (here and here). Through CSI, our support team is able to quickly reason about a customer’s environment, search for information related to a case currently being worked, and much more.
Thanks to Xavier Clements of Wajam for allowing us to re-publish his blog post about Wajam’s Hadoop experiences below!
Wajam is a social search engine that gives you access to the knowledge of your friends. We gather your friends’ recommendations from Facebook, Twitter, and other social platforms and serve these back to you on supported sites like Google, eBay, TripAdvisor, and Wikipedia.
Our thanks to Telvis Calhoun, Zach Hanif, and Jason Trost of Endgame for the guest post below about their BinaryPig application for large-scale malware analysis on Apache Hadoop. Endgame uses data science to bring clarity to the digital domain, allowing its federal and commercial partners to sense, discover, and act in real time.
In my previous post you learned how to index email messages in batch mode, and in near real time, using Apache Flume with MorphlineSolrSink. In this post, you will learn how to index emails using Cloudera Search with Apache HBase and Lily HBase Indexer, maintained by NGDATA and Cloudera. (If you have not read the previous post, I recommend you do so for background before reading on.)
Which near-real-time method to choose, HBase Indexer or Flume MorphlineSolrSink, will depend entirely on your use case, but below are some things to consider when making that decision:
Customer Spotlight: Learn How Edo Closes the Advertising Loop with Hadoop at Cloudera Sessions Milwaukee
The Cloudera Sessions fall series is coming to a close next week, but first we’ll make a final stop in Milwaukee, Wisconsin (on Oct. 17), where attendees will hear about edo — a company that is revolutionizing the advertising space by closing the loop between promotions and point-of-sale transactions.
In Milwaukee, edo CTO Jeff Sippel will engage in a fireside chat with Cloudera’s VP of marketing, Alan Saldich. At edo, Jeff is responsible for the strategy, planning, and execution for the systems — including Apache Hadoop — that power the edo offer platforms.
It’s common to hear people describe themselves as being “left-brained” or “right-brained” based on their tendency to be more logical and mathematically driven (left-brained), or, conversely, to be intuitive and creatively driven (right-brained). For example, people who prefer math over art are often considered left-brained. People who get a higher verbal score on their SATs than for math are often considered right-brained.
In general, language and creative writing are considered right-brained exercises. Many people also associate marketing and advertising as a right-brained function, whereas engineering is considered very left-brained.
In December 2012, we described how an internal application built on CDH called Cloudera Support Interface (CSI), which drastically improves Cloudera’s ability to optimally support our customers, is a unique and instructive use case for Apache Hadoop. In this post, we’ll follow up by describing two new differentiating CSI capabilities that have made Cloudera Support yet more responsive for customers:
Why would any company be interested in searching through its vast trove of email? A better question is: Why wouldn’t everybody be interested?
Email has become the most widespread method of communication we have, so there is much value to be extracted by making all emails searchable and readily available for further analysis. Some common use cases that involve email analysis are fraud detection, customer sentiment and churn, lawsuit prevention, and that’s just the tip of the iceberg. Each and every company can extract tremendous value based on its own business needs.
This week’s Cloudera Sessions roadshow will make it to Denver, Colo., on Thursday, where the customer Fireside Chat will feature Intelligent Software Solutions (ISS) Chief Architect of Global Enterprise Solutions, Wes Caldwell. ISS helps many government organizations – including several within the U.S. Department of Defense — deploy next-generation data management and analytic solutions using a combination of systems integration expertise and custom-built software.
During the Fireside Chat, Cloudera’s COO Kirk Dunn will engage Wes in a conversation to discuss the business use cases for Hadoop that ISS sees most often in the field, primarily within two buckets: batch analytics and real-time applications. Wes will also share his thoughts on some of the more recent innovations within the Apache Hadoop ecosystem, such as Cloudera Impala and Solr integrations.