HBaseCon 2014 “Operations” track reveals best practices used by some of the world’s largest production-cluster operators.
- “From MongoDB to HBase in Six Easy Months”
Shreeganesh Ramanan and Mike Davis (Optimizely)
Pushing well past MongoDB’s limits (2TB of data every week) is an interesting exercise in operational frustration. It also severely hampers flexibility of design for new use cases. This talk covers the architectural journey from MongoDB/Redis to HBase at Optimizely, including the performance, design flexibility, speed of implementation, and other gains made.
- “Harmonizing Multi-tenant HBase Clusters for Managing Workload Diversity”
Dheeraj Kapur, Rajiv Chittajallu & Anish Mathew (Yahoo!)
In early 2013, Yahoo! introduced multi-tenancy to HBase. A certain degree of customization per tenant (a user or a project) was achieved through RegionServer groups, namespaces, and customized configs for each tenant. This talk covers how to accommodate the diverse needs of individual tenants on the cluster, as well as operational tips and techniques that allow Yahoo! to automate the management of multi-tenant clusters at petabyte scale without errors.
- “HBase Backups”
Jesse Yates (Salesforce.com), Demai Ni, Richard Ding & Jing Chen He (IBM)
This talk provides an overview of enterprise-scale backup strategies for HBase: Jesse Yates will describe how Salesforce.com runs backup and recovery on its multi-tenant, enterprise-scale HBase deployments; Demai Ni, Richard Ding, and Jing Chen He of the IBM InfoSphere BigInsights development team will then follow with a description of IBM’s recently open-sourced disaster/recovery solution based on HBase snapshots and replication.
- “Real-time HBase: Lessons from the Cloud”
Bryan Beaudreault (HubSpot)
Running HBase in real time in the cloud provides an interesting and ever-changing set of challenges — instance types are not ideal, neighbors can degrade your performance, and instances can randomly die in unanticipated ways. This talk will cover what HubSpot has learned about running in production on Amazon EC2, how it handles DR and redundancy, and the tooling the team has found to be the most helpful.
- “The State of HBase Replication”
Jean-Daniel Cryans (Cloudera)
HBase Replication has come a long way since its inception in HBase 0.89. Today, master-master and cyclic replication setups are supported; many bug fixes and new features like log compression, per-family peers configuration, and throttling have been added; and a major refactoring has been done. This presentation will recap the work done during the past four years, present a few use cases that are currently in production, and take a look at the roadmap.
- “Tales from the Cloudera Field”
Kevin O’Dell, Aleksandr Shulman & Kathleen Ting (Cloudera)
From supporting the 0.90.x, 0.92, 0.94, and 0.96 HBase installations on clusters ranging from tens to hundreds of nodes, Cloudera has seen it all. Having automated the upgrade paths from the different Apache releases, we have developed a smooth path that can help the community with upcoming upgrades. In addition to automation best practices, in this talk you’ll also learn proactive configuration tweaks and operational best practices to keep your HBase cluster always up and running.
- Smooth Operators Panel
Moderated by Eric Sammer (Cloudera)
Includes Jeremy Carroll (Pinterest), Adam Frank (Flurry), and Paul Tuckfield (Facebook).
Interested yet? If not, next week, we’ll offer a preview of the Features & Internals track.
Thank you to our sponsors, without whom HBaseCon would be impossible: Continuuity, Hortonworks, Intel, LSI, MapR, Salesforce.com, Splice Machine, WibiData (Gold); BrightRoll, Facebook, Pepperdata (Silver); ASF (Community); O’Reilly Media, The Hive, NoSQL Weekly (Media).