Tag Archives: Big Data

Improving Hotel Search: Apache Hadoop @ Orbitz Worldwide

Categories: General Guest Hadoop Hive

This post was contributed by Jonathan Seidman from Orbitz. Jonathan is a Lead Engineer on the Intelligent Marketplace/Machine Learning team at Orbitz Worldwide. You can hear more from Jonathan at Hadoop World on October 12th in NYC.

Orbitz Worldwide (NYSE:OWW) is composed of a global portfolio of online consumer travel brands including Orbitz, Cheaptickets, The Away Network, ebookers and HotelClub. Additionally, the company operates a business-to-business service: Orbitz Worldwide Distribution provides third parties such as Amtrak,

Read more

CDH3b2 Release Recap

Categories: General

Just over a month ago, our CEO, Mike Olson, announced the availability of Cloudera’s Distribution for Hadoop (beta 2), or CDH3b2. As Charles, our head of Product Management, explained in a subsequent blog post, this release of CDH removes a lot of the complexity we’ve seen organizations encounter when deploying Hadoop within an existing data management infrastructure.

By packaging Hadoop core together with a suite of additional projects for data collection,

Read more

Upcoming webinar: Tackling Big Data Challenges with Vertica and Hadoop

Categories: General

Are your systems struggling to absorb the ever-increasing amounts of data generated daily? Are you mired in lengthy ETL processes preparing data for analysis? Are you forced to summarize information, losing important details along the way?

We’ve been working with Vertica at several large enterprise customers to solve these very problems. We’d like to invite you to attend this free and exclusive webinar to hear our CTO,

Read more

Integrating Apache Hive and Apache HBase

Categories: Guest HBase Hive

This post was contributed by John Sichi, a committer on the Apache Hive project and a member of the Data Infrastructure team at Facebook.

As many readers may already know, Hive was initially developed at Facebook for dealing with explosive growth in our multi-petabyte data warehouse. Since its release as an Apache project, it has been put into use at a number of other companies for solving big data problems.

Read more