Cloudera’s open source licensing policies have evolved with the changing dynamics in open source innovation. For more information on Cloudera’s current policy, please contact OSSQuestions@cloudera.com. We are now well into 2022 and the megatrends that drove the last decade in data—The Apache Software Foundation as a primary innovation vehicle for big data, the arrival of […]
Many customers looking at modernizing their pipeline orchestration have turned to Apache Airflow, a flexible and scalable workflow manager for data engineers. With 100s of open source operators, Airflow makes it easy to deploy pipelines in the cloud and interact with a multitude of services on premise, in the cloud, and across cloud providers for […]
Across the federal government, agencies are struggling to identify, organize, analyze, and act on troves of data. It’s a problem that leaders are working actively to tackle, but they’re in a race against immeasurable volumes of data that is continuously being generated in perpetuity in stores known and unknown. At the Internal Revenue Service, decades’ […]
Apache Ozone is a scalable distributed object store that can efficiently manage billions of small and large files. Ozone natively provides Amazon S3 and Hadoop Filesystem compatible endpoints in addition to its own native object store API endpoint and is designed to work seamlessly with enterprise scale data warehousing, machine learning and streaming workloads. The […]
When we announced the GA of Cloudera Data Engineering back in September of last year, a key vision we had was to simplify the automation of data transformation pipelines at scale. By leveraging Spark on Kubernetes as the foundation along with a first class job management API many of our customers have been able to […]
Follow your data in object storage on-premises As businesses look to scale-out storage, they need a storage layer that is performant, reliable and scalable. With Apache Ozone on the Cloudera Data Platform (CDP), they can implement a scale-out model and build out their next generation storage architecture without sacrificing security, governance and lineage. CDP integrates […]
After the launch of CDP Data Engineering (CDE) on AWS a few months ago, we are thrilled to announce that CDE, the only cloud-native service purpose built for enterprise data engineers, is now available on Microsoft Azure. CDP Data Engineering offers an all-inclusive toolset that enables data pipeline orchestration, automation, advanced monitoring, visual profiling, and […]
CDP Operational Database Experience (COD) is a PaaS offering on the Cloudera Data Platform (CDP). COD enables you to create a new operational database with a few clicks and auto-scales based on your workload. Behind the scenes, COD automatically manages cluster deployment and configuration, reducing overheads related to setting up new database instances. Additionally, auto-scaling […]