Cloudera Operational Database is an operational database-as-a-service that brings ease of use and flexibility to Apache HBase. Cloudera Operational Database enables developers to quickly build future-proof applications that are architected to handle data evolution. In the previous blog posts, we looked at application development concepts and how Cloudera Operational Database (COD) interacts with other CDP […]
In the previous blog post, we looked at some of the application development concepts for the Cloudera Operational Database (COD). In this blog post, we’ll see how you can use other CDP services with COD. COD is an operational database-as-a-service that brings ease of use and flexibility to Apache HBase. Cloudera Operational Database enables developers […]
Cloudera Operational Database is now available in three different form-factors in Cloudera Data Platform (CDP). If you are new to Cloudera Operational Database, see this blog post. And, check out the documentation here. In this blog post, we’ll look at Apache HBase and Apache Phoenix concepts relevant to developing applications for Cloudera Operational Database. But […]
No, not really. You probably won’t be rich unless you work really hard… As nice as it would be, you can’t really predict a stock price based on ML solely, but now I have your attention! Continuing from my previous blog post about how awesome and easy it is to develop web-based applications backed by […]
In this last installment, we’ll discuss a demo application that uses PySpark.ML to make a classification model based off of training data stored in both Cloudera’s Operational Database (powered by Apache HBase) and Apache HDFS. Afterwards, this model is then scored and served through a simple Web Application. For more context, this demo is based […]
In this installment, we’ll discuss how to do Get/Scan Operations and utilize PySpark SQL. Afterward, we’ll talk about Bulk Operations and then some troubleshooting errors you may come across while trying this yourself. Read the first blog here. Get/Scan Operations Using Catalogs In this example, let’s load the table ‘tblEmployee’ that we made in the […]
When running any performance benchmarking tool on your cluster, a critical decision is always what data set size should be used for a performance test, and here we demonstrate why it is important to select a “good fit” data set size when running a HBase performance test on your cluster. The HBase cluster configurations and […]
The Cloudera Data Platform (CDP) is the latest Big Data offering from Cloudera. It includes Apache HBase and Phoenix as part of the platform. These two components are provided in 3 form-factors: For on-prem deployments, they are available in a manner similar to CDH & HDP (within the CDP Private Cloud offering) For customers that […]
We’re excited to share that after adding ANSI SQL, secondary indices, star schema, and view capabilities to Cloudera’s Operational Database, we will be introducing distributed transaction support in the coming months. What is ACID? The ACID model of database design is one of the most important concepts in databases. ACID stands for atomicity, consistency, isolation, […]
On September 24, 2019, Cloudera launched CDP Public Cloud (CDP-PC) as the first step in delivering the industry’s first Enterprise Data Cloud. That Was Then In the beginning, CDP ran only on AWS with a set of services that supported a handful of use cases and workload types: CDP Data Warehouse: a kubernetes-based service that […]