Tag Archives: dbinputformat

Introducing Sqoop

Categories: Data Ingestion, General, Hadoop, Hive

At Cloudera, in addition to providing a dependable, easy-to-configure release of Hadoop, we also focus on developing tools that extend Hadoop’s usability and make Hadoop a more central component of your data infrastructure. In this vein, we’re proud to announce the availability of Sqoop, a tool designed to easily import information from SQL databases into your Hadoop cluster.

Sqoop (“SQL-to-Hadoop”) is a straightforward command-line tool with the following capabilities:

  • Imports individual tables or entire databases to files in HDFS
  • Generates Java classes to allow you to interact with your imported data
  • Provides the ability to import from SQL databases straight into your Hive data warehouse
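
To make the capabilities above more concrete, here is a minimal Python sketch that assembles a Sqoop command line. It assumes the Sqoop 1.x CLI (the `import` and `import-all-tables` tools with the `--connect`, `--table`, `--username`, and `--hive-import` options), which may differ from the syntax of the release announced here. The helper name `build_sqoop_import`, the JDBC URL, and the table name are hypothetical and exist only for illustration.

```python
import shlex


def build_sqoop_import(jdbc_url, table=None, username=None, hive_import=False):
    """Assemble an argument list for a Sqoop import (assumes Sqoop 1.x syntax)."""
    # With no table named, fall back to importing every table in the database.
    tool = "import" if table else "import-all-tables"
    args = ["sqoop", tool, "--connect", jdbc_url]
    if table:
        args += ["--table", table]
    if username:
        args += ["--username", username]
    if hive_import:
        # Load the imported data into Hive instead of leaving plain HDFS files.
        args.append("--hive-import")
    return args


# Hypothetical database and table, for illustration only.
cmd = build_sqoop_import(
    "jdbc:mysql://db.example.com/corp", table="employees", hive_import=True
)
print(" ".join(shlex.quote(part) for part in cmd))
```

Building the command as a list rather than a single string keeps each argument separate, so it could be handed to `subprocess.run` without shell quoting problems on a machine where Sqoop is installed.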

After setting up an import job in Sqoop,

Read more

Database Access with Apache Hadoop

Categories: General, Hadoop, MapReduce

Editor’s note (added Nov. 9, 2013): Valuable data in an organization is often stored in relational database systems. To access that data, you could use external APIs as detailed in the blog post below, or you could use Apache Sqoop, an open source tool (packaged inside CDH) that allows users to import data from a relational database into Apache Hadoop for further processing. Sqoop can also export those results back to the database for consumption by other clients.

Read more