Wednesday, May 30th, 2012
Intuit uses HBase to store comprehensive, de-duplicated, canonical merchant information that powers the backend of its Merchant Lookup Service. This service enables users and products to look up business details by parameters such as merchant name, location, and business type. It aims to provide a more complete, canonical business profile by bringing together data from multiple information providers, including Intuit’s small business customer base. In this talk, we will describe the Hadoop de-duplication pipeline, our HBase data model, the challenges we faced along the way, and our plans for upcoming projects to leverage this data in HBase.