Hadoop World 2011: Data Ingestion, Egression, and Preparation for Hadoop


Wednesday, November 9th, 2011


One of the first challenges Hadoop developers face is accessing all the data they need and getting it into Hadoop for analysis. Informatica PowerExchange accesses a variety of data types and structures at different latencies (e.g., batch, real-time, or near real-time) and ingests the data directly into Hadoop. The next step is to parse the data in preparation for analysis in Hadoop. Informatica provides a visual IDE for deploying pre-built parsers, or for designing custom parsers for complex data formats, and deploying them on Hadoop. Once the analysis is complete, Informatica PowerExchange delivers the resulting output to other information management systems, such as a data warehouse. In this session, learn from Informatica and one of its customers how to get all the data you need into Hadoop, parse a variety of data formats and structures, and egress the resulting output to other systems.
