When running any performance benchmarking tool on your cluster, a critical decision is what data set size to use for the test. Here we demonstrate why it is important to select a “good fit” data set size when running an HBase performance test on your cluster. The HBase cluster configurations and […]
The Cloudera Data Platform (CDP) is the latest Big Data offering from Cloudera. It includes Apache HBase and Phoenix as part of the platform. These two components are provided in 3 form-factors: For on-prem deployments, they are available in a manner similar to CDH & HDP (within the CDP Private Cloud offering). For customers that […]
We’re excited to share that after adding ANSI SQL, secondary indices, star schema, and view capabilities to Cloudera’s Operational Database, we will be introducing distributed transaction support in the coming months. What is ACID? The ACID model of database design is one of the most important concepts in databases. ACID stands for atomicity, consistency, isolation, […]
Replication (covered in this previous blog article) has been available for a while and is among the most-used features of Apache HBase. Having clusters replicating data with different peers is a very common deployment, whether as a DR strategy or simply as a seamless way of replicating data between production/staging/development environments. Although it is […]
The Cloudera Operational Database (COD) is a managed dbPaaS solution available as an experience in Cloudera Data Platform (CDP). It offers multi-modal client access with NoSQL key-value using Apache HBase APIs and relational SQL with JDBC (via Apache Phoenix). The latter makes COD accessible to developers who are used to building applications that use MySQL, […]
In this blog post, we are going to take a look at some of the OpDB-related security features of a CDP Private Cloud Base deployment. We are going to talk about encryption, authentication and authorization. Data-at-rest encryption Transparent data-at-rest encryption is available through the Transparent Data Encryption (TDE) feature in HDFS. TDE provides the […]
This blog post will present a simple “hello world” kind of example on how to get data that is stored in S3 indexed and served by an Apache Solr service hosted in a Data Discovery and Exploration cluster in CDP. For the curious: DDE is a pre-templated Solr-optimized cluster deployment option in CDP, and recently […]
The Paycheck Protection Program (PPP) is implemented by the US federal government to provide a direct incentive for businesses to keep their employees on the payroll, particularly during the COVID-19 pandemic. PPP helps qualified businesses retain their workforce and pay for related business expenses. Data from the US Treasury website show which […]
Cloudera Data Platform (CDP) Private Cloud is the most comprehensive on-premises platform for integrated analytics and data management. It combines the best of Cloudera Enterprise Data Hub and Hortonworks Data Platform Enterprise Plus, and brings the latest and greatest open source technologies for data management and analytics to the data center. With the latest version (7) […]
Navistar is a leading global manufacturer of commercial trucks. Across its fleet of 350,000 vehicles, unscheduled maintenance and vehicle breakdowns created ongoing disruption to its business. Navistar required a diagnostics platform that would help it predict when a vehicle needed maintenance to minimize downtime. This platform needed to be able to collect, analyze and serve […]