HBaseCon 2016 will occur on May 24, 2016, at The Village in San Francisco.
HBaseCon is back, and CfP and Early Bird registration are both open for business.
Now in its fifth year, HBaseCon is the premier community event for Apache HBase contributors, developers, admins, and users of all skill levels. The event is hosted and organized by Cloudera, with a Program Committee reflecting a cross-section of the HBase community (including employees of Bloomberg LP,
Learn how improve Apache HBase usability by creating a custom formatter for viewing binary data types in the HBase shell.
Cloudera customers are looking to store complex data types in Apache HBase to provide fast retrieval of complex information such as banking transactions, web analytics records, and related metadata associated with those records. Serialization formats such as Apache Avro, Thrift, and Protocol Buffers greatly assist in meeting this goal,
(Ed. Note [Dec. 4 2015]: The post below has been edited to reflect the new release of Phoenix 4.5.2 for CDH 5.5.x.)
New Cloudera Labs packages for Apache Phoenix 4.5.2 (which includes Apache Spark integration) is now available for CDH 5.4.x and CDH 5.5.x.
Earlier this year, Cloudera announced the inclusion of Apache Phoenix in Cloudera Labs.
To recap: Phoenix adds SQL to Apache HBase,
Combining CDH with a business execution engine can serve as a solid foundation for complex event processing on big data.
Event processing involves tracking and analyzing streams of data from events to support better insight and decision making. With the recent explosion in data volume and diversity of data sources, this goal can be quite challenging for architects to achieve.
Complex event processing (CEP) is a type of event processing that combines data from multiple sources to identify patterns and complex relationships across various events.
Learn how to use OCR tools, Apache Spark, and other Apache Hadoop components to process PDF images at scale.
Optical character recognition (OCR) technologies have advanced significantly over the last 20 years. However, during that time, there has been little or no effort to marry OCR with distributed architectures such as Apache Hadoop to process large numbers of images in near-real time.
In this post, you will learn how to use standard open source tools along with Hadoop components such as Apache Spark,