Category Archives: Sqoop

Cloudera Connector for Teradata 1.0.0

Categories: Sqoop

Apache Sqoop (incubating) provides an efficient way to transfer large volumes of data between Hadoop-related systems (such as HDFS, Hive, and HBase) and structured data stores (such as relational databases, data warehouses, and NoSQL systems). Sqoop's extensible architecture allows support for a new data store to be added as a so-called connector. By default, Sqoop comes with connectors for a variety of databases, such as MySQL, PostgreSQL, Oracle, SQL Server, and DB2.
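
As a rough illustration of what such a transfer looks like in practice, the sketch below assembles the argument list for a basic `sqoop import` of one relational table into HDFS. It only builds the command and does not run it. The helper name `build_sqoop_import_args`, the host, the user, the table, and the paths are all hypothetical, and the post itself does not prescribe this invocation.

```python
from typing import List


def build_sqoop_import_args(jdbc_url: str, username: str, table: str,
                            target_dir: str, num_mappers: int = 4) -> List[str]:
    """Assemble (but do not execute) the arguments for a `sqoop import` run.

    Hypothetical helper for illustration; all values are supplied by the caller.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC URL of the source database
        "--username", username,
        "-P",                               # prompt for the password on the console
        "--table", table,                   # source table to import
        "--target-dir", target_dir,         # HDFS directory that receives the data
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


if __name__ == "__main__":
    # Hypothetical connection details, chosen only for this example.
    args = build_sqoop_import_args(
        "jdbc:mysql://db.example.com/sales",
        "etl_user",
        "orders",
        "/data/sales/orders",
    )
    print(" ".join(args))
```

The resulting list could be passed to `subprocess.run` on a machine where Sqoop is installed. Building the arguments as a list rather than a single string avoids shell-quoting problems with the JDBC URL.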


Biodiversity Indexing: Migration from MySQL to Apache Hadoop

Categories: Community Guest Hadoop Oozie Sqoop


This post was contributed by the Global Biodiversity Information Facility development team.

The Global Biodiversity Information Facility (GBIF) is an international organization whose mission is to promote and enable free and open access to biodiversity data worldwide. Part of this mission involves operating a search, discovery, and access system known as the Data Portal, a sophisticated index to the content shared through GBIF. This content includes both complex taxonomies and occurrence data, such as records of specimen collection events or species observations.
