Apache Hive Archives | Page 5 of 7

December 27, 2018 | Technical

Apache Hive Warehouse Connector Use-Cases

This blog post was published on Hortonworks.com before the merger with Cloudera. Some links, resources, or references may no longer be accurate. 1. Motivation: The HiveWarehouseConnector (HWC) is an open-source library that provides new interoperability capabilities between Hive and Spark. In practice, Hive and Spark are often leveraged together by companies to provide a scalable […]

by Eric Wohlstadter 5 min read

Apache Hive

December 21, 2018 | Technical

Faster Swarms of Data: Accelerating Hive Queries with Parquet Vectorization

Background: Apache Hive is a widely adopted data warehouse engine that runs on Apache Hadoop. Features that improve Hive performance can significantly improve the overall utilization of resources on the cluster. Hive processes data using a chain of operators within the Hive execution engine. These operators are scheduled in the various tasks (for example, MapTask, […]

by Cloudera, Santosh Kumar, Haifeng Chen, Cheng Xu, Wang Lifeng 5 min read

December 20, 2018 | Technical

Query Federation with Apache Hive

This blog post was published on Hortonworks.com before the merger with Cloudera. Some links, resources, or references may no longer be accurate. Organizations commonly use a plethora of data storage and processing systems today. These different systems offer cost-effective performance for their respective use cases. Besides traditional RDBMSs such as Oracle DB, Teradata, or PostgreSQL, […]

by Cloudera 5 min read

Apache Hive Data Warehouse

December 19, 2018 | Technical

Introducing Hive-Kafka integration for real-time Kafka SQL queries

This blog post was published on Hortonworks.com before the merger with Cloudera. Some links, resources, or references may no longer be accurate. Our last few blogs as part of the Kafka Analytics blog series focused on the addition of Kafka Streams to HDP and HDF and how to build, secure, monitor Kafka Streams apps / […]

by Cloudera, Slim Bougerra 5 min read

Apache Hive Apache Kafka

December 18, 2018 | Technical

Big Data Processing Engines – Which one do I use?: Part 1

This blog post was published on Hortonworks.com before the merger with Cloudera. Some links, resources, or references may no longer be accurate. Special thanks to Bill Preachuk and Brandon Wilson for reviewing and providing their expertise. Introduction: Columnar storage is an often-discussed topic in the big data processing and storage world today – there are […]

by Cloudera 9 min read

Apache Druid Apache Hadoop Apache HBase Apache Hive Apache Phoenix Hortonworks Data Platform

October 1, 2018 | Technical

Benchmark Update: Apache Hive and Druid Integration in HDP 3.0

This blog post was published on Hortonworks.com before the merger with Cloudera. Some links, resources, or references may no longer be accurate. Earlier we talked about reasons for integrating Druid and Hive in a three-part series (Part 1, Part 2, Part 3) on doing ultra-fast OLAP analytics with Apache Hive and Druid. Since […]

by Cloudera 3 min read

Apache Druid Apache Hadoop Apache Hive

June 12, 2018 | Technical

Accelerating Apache Hadoop 3.1 based distribution: Analyzing the Right Data at the Right Time

This blog post was published on Hortonworks.com before the merger with Cloudera. Some links, resources, or references may no longer be accurate. Guest Author: Greg Kincade, MBA, is an electrical engineer and Sr. Ecosystem Enablement Program Manager for the Micron Storage Solutions Center. We used to build data lakes. Now we fill data oceans. As […]

by Cloudera 5 min read

Apache Hadoop Apache Hive Hortonworks Data Platform

April 5, 2018 | Technical

Cloud Architectures for Interactive Analytics with Apache Hive

Enterprises are increasingly moving portions or entire datacenters to the cloud in order to minimize their physical footprint, minimize operational overhead, and shorten their infrastructure acquisition cycles. An incidental benefit is that cloud services, like cloud-based object storage, bring a new set of tools to a Hadoop architect. At Hortonworks, our customers use a number […]

by Brandon Wilson, Gopal Vijayaraghavan 3 min read

Apache HDFS Apache Hive Cloud Customer Analytics

February 20, 2018 | Technical

Hortonworks DataFlow (HDF) 3.1 blog series part 5: Introducing Apache NiFi-Atlas integration

This blog post was published on Hortonworks.com before the merger with Cloudera. Some links, resources, or references may no longer be accurate. Two weeks ago, we announced the GA of HDF 3.1, and to share more details about this milestone release we started the HDF 3.1 Blog Series. In this installment of the series, we’ll […]

by Cloudera Community 5 min read

Apache Atlas Apache Hive Apache NiFi

September 14, 2017 | Technical

Benchmark Apache HBase vs Apache Cassandra on SSD in a Cloud Environment

This blog post was published on Hortonworks.com before the merger with Cloudera. Some links, resources, or references may no longer be accurate. Overview: As more and more workloads are being brought onto modern hardware in the cloud, it’s important for us to understand how to pick the best databases that can leverage the best hardware. […]

by Cloudera 3 min read

Apache Hadoop Apache HBase Apache Hive
