Hadoop Summit 2012 | A New Generation of Data Transfer Tools for Hadoop: Sqoop 2


Monday, June 18th, 2012


Apache Sqoop (incubating) was created to efficiently transfer big data between Hadoop-related systems (such as HDFS, Hive, and HBase) and structured data stores (such as relational databases, data warehouses, and NoSQL systems). Its popularity in enterprise systems confirms that Sqoop handles bulk transfer admirably. At the same time, we have encountered many new challenges that the current infrastructure cannot meet. To cover more data integration use cases and to become easier to manage and operate, a new generation of Sqoop, known as Sqoop 2, is under development. It addresses several key areas, including ease of use, ease of extension, and security. This session will discuss Sqoop 2 from both the development and operations perspectives.
