Performance is a key criterion, if not the most important one, in choosing a Cloud Data Warehouse service. In today’s fast-changing world, enterprises have to make data-driven decisions quickly, and for that they rely heavily on their data warehouse service. In this blog post, we compare Cloudera Data Warehouse (CDW) on […]
Aren’t two superheroes better than one? Some of the most powerful results come from combining complementary superpowers, and the “dynamic duo” of Apache Hive LLAP and Apache Impala, both included in Cloudera Data Warehouse, is further evidence of this. Both Impala and Hive can operate at an unprecedented and massive scale, with many petabytes of […]
CDP for Azure introduces fine-grained authorization for access to Azure Data Lake Storage using Apache Ranger policies. Cloudera and Microsoft have been working together closely on this integration, which greatly simplifies the security administration of access to ADLS-Gen2 cloud storage. Apache Ranger provides a centralized console to manage authorization and view audits of access to […]
Service Management Group (SMG) offers an easy-to-use experience management (XM) platform that combines end-to-end customer and employee experience management software with hands-on professional services to deliver actionable insights and help brands get smarter about their customers. The XM platform, smg360, helps customers across verticals, including restaurants, retail, and healthcare, drive changes that boost loyalty and […]
The Paycheck Protection Program (PPP) is implemented by the US federal government to provide a direct incentive for businesses to keep their employees on the payroll, particularly during the COVID-19 pandemic. PPP helps qualified businesses retain their workforce and pay for related business expenses. Data from the US Treasury website show which […]
Apache Hive supports transactional tables, which provide ACID guarantees. A significant amount of work has gone into Hive to make these transactional tables highly performant. Apache Spark provides some capabilities to access Hive external tables, but it cannot access Hive managed tables. To access Hive managed tables from Spark, Hive Warehouse […]
In the lifecycle of a data warehouse in production, there are a variety of tasks that need to be executed on a recurring basis. To name a few concrete examples, scheduled tasks can be related to data ingestion (inserting data from a stream into a transactional table every 10 minutes), query performance (refreshing a materialized […]
Editor’s Note, August 2020: CDP Data Center is now called CDP Private Cloud Base. You can learn more about it here. Cloudera Data Platform (CDP) Data Center (DC) is the on-premises release of Cloudera Data Platform. CDP DC combines the best services and components from Cloudera Enterprise Data Hub and Hortonworks Data Platform Enterprise along with […]
Apache HBase became a top-level Apache project 10 years ago, and Cloudera began contributing to it at the same time (2010). Since then, it has become one of the largest and most popular open-source tools in big data and one of the most popular NoSQL databases. The Apache Software Foundation Announces the 10th […]
When speaking with customers, I often hear that they are committed to digital transformation and to being a data-driven enterprise. Those may seem like abstract, lofty words to aspire to, but the reality is much more practical. We have major banks needing to ensure that they have a complete view of their customers, and can […]