The following article by Ciaran Dynes was reposted from the Talend blog with their permission.
As you may have read, Talend recently announced its support for Cloudera Altus, a newly released Platform-as-a-Service (PaaS) offering that simplifies running large-scale data processing applications in the public cloud. For us, supporting Altus at launch was the absolute easiest decision given that so many of our customers are looking to realize the cost,
Learn how to use Cloudera to spin up Apache Hadoop clusters across multiple cloud providers to take advantage of competing prices and avoid infrastructure lock-in.
Why is a multi-cloud strategy important?
In the early days of Cloudera, it was a fair assumption that our software would be running on industry-standard servers that were purchased, owned, and operated by the client in their own data center. In the last few years,
As modern businesses deal with an ever-increasing volume of data and an expanding set of data sources, the data engineering process that enables analysis, visualization, and reporting only becomes more important.
When running data engineering workloads in the public cloud, some capabilities enable operational models that differ from on-premises deployments. The key factors are the presence of a distinct storage layer within the cloud environment and the ability to provision compute resources on demand (e.g., with Amazon S3 and EC2, respectively).
Cloudera Director 2.4 improves support for long-running clusters by syncing with upgrades and topology changes made via Cloudera Manager, and adds support for Spark 2 and Kudu. Cloudera Director, along with Cloudera Manager and CDH 5.11, adds support for Microsoft Azure Data Lake Store (ADLS) and for pausing clusters with Amazon EBS volumes.
Cloudera Director helps you deploy, scale, and manage Apache Hadoop clusters in the cloud of your choice.
Cloudera Enterprise 5.11 is Now Available
Cloudera is pleased to announce that Cloudera Enterprise 5.11 is now generally available (GA). The highlights of this release include lineage support for Apache Spark, Apache Kudu security integration, embedded data discovery for self-service BI, and new cloud capabilities for Microsoft ADLS and Amazon S3.
As usual, there are also a number of quality enhancements, bug fixes, and other improvements across the stack. Here is a partial list of what’s included (see the Release Notes for a full list):
- Core Platform and Cloud
  - Amazon S3 Consistency: S3Guard ensures that operations on Amazon S3 are immediately visible to other clients,