Tag Archives: Big Data

Gratuitous Hadoop: Stress Testing on the Cheap with Hadoop Streaming and EC2

Categories: General Hadoop

This post was contributed by Boris Shimanovsky, the Director of Engineering at Factual. Boris is responsible for managing all engineering functions and various data infrastructures at Factual, including the internal Cloudera’s Distribution for Apache Hadoop stack. He has been at Factual for over two years; before that he was the CTO of XAP, where he managed a team of 40+ across multiple environments. He has an MS in Computer Science from UCLA.

Read more

Infochimps’ President, Philip Kromer, Interviewed Regarding Hadoop and Hadoop World

Categories: General

Excitement is building as Hadoop World nears, and we are sitting down with some of our presenters to ask them a few questions about their presentations and how they are using Hadoop within their organizations. Here we speak with Philip Kromer, President of Infochimps, who answers questions about his presentation, how Hadoop is used in his business, and what he aims to get out of Hadoop World. Philip’s presentation at Hadoop World is about the development of a data marketplace and commoditization,

Read more

Improving Hotel Search: Apache Hadoop @ Orbitz Worldwide

Categories: General Guest Hadoop Hive

This post was contributed by Jonathan Seidman from Orbitz. Jonathan is a Lead Engineer on the Intelligent Marketplace/Machine Learning team at Orbitz Worldwide. You can hear more from Jonathan at Hadoop World on October 12th in NYC.

Orbitz Worldwide (NYSE:OWW) is composed of a global portfolio of online consumer travel brands including Orbitz, Cheaptickets, The Away Network, ebookers and HotelClub. Additionally, the company operates a business-to-business service: Orbitz Worldwide Distribution provides third parties such as Amtrak,

Read more

CDH3b2 Release Recap

Categories: General

Just over a month ago, our CEO, Mike Olson, announced the availability of Cloudera’s Distribution for Hadoop (beta 2), or CDH3b2. As Charles, our head of Product Management, explained in a subsequent blog post, this release of CDH removes a lot of the complexity we’ve seen organizations encounter when deploying Hadoop within an existing data management infrastructure.

By packaging Hadoop core together with a suite of additional projects for data collection,

Read more

Upcoming webinar: Tackling Big Data Challenges with Vertica and Hadoop

Categories: General

Are your systems struggling to absorb the ever-increasing amounts of data generated daily? Are you mired in lengthy ETL processes preparing data for analysis? Are you forced to summarize information, losing important details along the way?

We’ve been working with Vertica at several large enterprise customers to solve these very problems. We’d like to invite you to attend this free and exclusive webinar to hear our CTO,

Read more