HBaseCon 2013: "Ecosystem" Track Preview
Unbelievably, HBaseCon 2013 is only one week away (June 13 in San Francisco)!
Today we bring you a preview of the Ecosystem track, a grand tour (in 20-minute increments) of the fascinating work being done across the community to extend or build on top of Apache HBase.
- Impala: Using SQL to Extract Value from Apache HBase
Elliott Clark, Cloudera
Cloudera Impala is an open source project that allows low-latency analytical queries over big data in Apache Hadoop; with Impala it is now possible to use SQL in conjunction with HBase.
- HBase SEP: Reliable Maintenance of Auxiliary Index Structures
Steven Noels, NGDATA
In this talk, we will present HBase SEP (Side-Effects Processor) and Indexer, two new open source projects that provide a reliable bridge between HBase and index systems, and that also cater to the needs of anyone who wants to keep auxiliary data in lockstep with HBase updates.
- SQL Over HBase: A Case for Apache Hive
Enis Söztutar & Ashutosh Chauhan, Hortonworks
In this talk we will look at the current status of using Hive for querying your data stored in HBase. The talk will include a running example of a web table storing web crawl data in HBase, and Hive queries to that table for analysis.
- How (and Why) Phoenix Puts the SQL Back into NoSQL
James Taylor, Salesforce.com
Phoenix is an open source project from Salesforce.com that puts a SQL skin on top of HBase. This talk will focus on answering: 1) why put a SQL skin on top of HBase? and 2) how does Phoenix marry the SQL paradigm with NoSQL?
- Apache Drill: A Community-driven Initiative to Deliver ANSI SQL Capabilities for Apache HBase
Jacques Nadeau, MapR
This session provides an overview of Apache Drill, which will deliver full ANSI SQL capability for HBase users.
- Honeycomb: MySQL Backed by Apache HBase
Dan Burkert, Near Infinity
Honeycomb is an exciting new open source storage engine plugin for MySQL that enables MySQL to store and query tables directly in HBase.
- Using Coprocessors to Index Columns in an Elasticsearch Cluster
Dibyendu Bhattacharya, HappiestMinds
This presentation explores the design and the challenges HappiestMinds faced while implementing a storage and search infrastructure for a large publisher, where records related to books, documents, and artifacts are stored in Apache HBase.
- Full-Text Indexing for Apache HBase
Maryann Xue, Intel
Intel has extended HBase with a general full-text indexing framework based on Apache Lucene, which supports distributed search for any combination of words or phrases of interest in data stored within HBase.
- Streaming Data into Apache HBase using Apache Flume: Experience with High-Speed Writes
Hari Shreedharan, Cloudera
In this talk, we discuss the lessons we learned while using the standard and async APIs, retrying puts and increments, and fine-tuning batches to get optimum performance with a minimal number of duplicates.
- Using Metrics to Monitor and Debug Apache HBase
Elliott Clark, Cloudera
In this session we will talk about the metrics exposed by HBase. We’ll cover which metrics are available, what they mean, and how to access them.
- High-Throughput, Transactional Stream Processing on Apache HBase
Andreas Neumann & Alex Baranau, Continuuity
We have developed the Continuuity Data Fabric as a unified, transactional queuing and storage engine. This talk will discuss its implementation on top of HBase, evaluate performance, scalability and reliability, and share experiences, best practices, and lessons learned.
- Real-Time Model Scoring in Recommender Systems
Jon Natkins & Juliet Hougland, WibiData
In this presentation, we’ll discuss how developers can use Apache HBase and Kiji to develop low-latency predictive models, using algorithms like clustering or collaborative filtering, and how to leverage those models in the context of a full application.
- Using Apache HBase for Large Matrices
Gokhan Capan, Dilisim
In this talk, we describe HBase-backed versions of Mahout matrices that allow us to access and manipulate matrix elements easily, perform common matrix operations, and feed persistent matrices into existing machine learning algorithms.
- Project Valta: A Resource Management Layer over Apache HBase
Lars George & Andrew Wang, Cloudera
Valta is an open source project that acts as a layer between the user and the HBase API, employing client- and server-side mechanisms to guard precious resources.
Interested yet? If not, next week we’ll offer a sneak peek of the Case Studies track.