Tuesday, November 8th, 2011
Jive is using Flume to deliver the content of a social web (250M messages/day) to HDFS and HBase. Flume’s flexible architecture allows us to stream data to our production data center as well as to Amazon Web Services. We periodically build and merge Lucene indices with Hadoop jobs and deploy them to Katta to provide near-real-time search results. This talk will explore our infrastructure and the decisions we’ve made to handle a fast-growing set of real-time data feeds. We will also explore other uses for Flume throughout Jive, including log collection and our distributed event bus.