The second Apache Hadoop HDFS and MapReduce contributors meeting was held last Friday, May 28, at Cloudera’s offices in Palo Alto. Apache projects attract contributors from across the globe, and Hadoop is no exception, so the idea of holding face-to-face meetings may seem to run counter to the existence of such a highly decentralized organization. However, the point of in-person meetings is not to make project decisions, but rather to start discussions that spur more in-depth,
We recently met with a customer at Cloudera’s new offices and asked if he had any specific use cases in mind for the Apache Hadoop cluster that we are helping him to roll out. He replied, quite honestly, that he didn’t know. His baseline understanding is that there is value in the data that his organization is collecting today, but he’s not sure where it is. He said, “I would like to have all of this data stored forever”
Last week, several Cloudera employees attended the Bay Area HBase User Group #9, kindly hosted by Mozilla at their headquarters in Mountain View. About 80 people attended, and it was a great chance to get together with the whole HBase community. I got a chance to chat with some community members who have been running HBase in their organizations for quite some time, and also with several who are just beginning to investigate HBase for new and exciting projects within their businesses.
At Cloudera, we’re always working to make it easier for you to work with Hadoop and integrate Hadoop-based systems with your existing data sources. One example of how we accomplish this is Sqoop, a database import tool developed at Cloudera that allows you to easily copy data between databases and HDFS. We originally announced this tool in June, but we’ve been steadily improving it since then. It can now talk with several more databases than before,