Choosing a cloud data warehouse is a big decision. There are a number of options, each with very different strengths, and the parameters for the decision have also changed. Modernizing your data warehousing experience with the cloud means moving from dedicated, on-premises hardware focused on traditional relational analytics over structured data to a modern platform. To be successful, you need to consider these additional requirements:
- Your modern data warehouse should take all your data into account, structured and unstructured.
- You need to take advantage of the promise of the cloud, greater flexibility and lower costs, without losing the governance and security you’ve built over the years.
- When you migrate to the cloud, you want to gain agility through real, measurable improvements in your analytics development projects.
- You need the best price/performance to optimize your cost management.
- You want to partner with a rising star in the market that will continue to invest and innovate in ways that drive towards an even better future.
With so many considerations feeding into the choice of a cloud data warehouse, it can be difficult to find the right sources of guidance. To help, we’ve compiled three great analyst reports that highlight some of these core concepts:
- The Forrester Wave™: Cloud Data Warehouse, Q1 2021, which ranks the providers that matter most and shows how they measure up against 26 criteria.
- Cloud Data Warehouse Performance Testing by GigaOM, which tests five cloud data warehouses against an industry-standard benchmark to see which offers the best price/performance.
- The Value Of Data Modernization With Cloudera by Nucleus Research, which shares the results of interviews with some Cloudera customers and the increased agility their development teams experienced.
Taken together, these three reports showcase the multifaceted world of cloud data warehousing and some very important considerations for selecting a vendor to partner with. If you want the value promised by the cloud, you must make sure the solution can deliver agility and value in real-world scenarios. If you want to optimize your cost management, take into account not just the price/hour of a cloud solution but also how well it runs your workloads, looking for the price/performance lead as reported by GigaOM. You also want to be sure your data warehouse isn’t locked to today’s needs, but is part of a growing enterprise data cloud.
In the report from Nucleus, The Value Of Data Modernization With Cloudera, specific customers using the Cloudera Data Warehouse saw significant returns on investment. One customer reported an 88 percent reduction in script runtime, another reported a 75 percent reduction in time spent coding, and a third reported a 47 percent reduction in the development lifecycle. Agility figures like these are among the biggest reasons we turn to the promise of the cloud: we can get more analytics into more decision makers’ hands in less time. This in turn creates more opportunities to explore data, leading to faster, better, and more plentiful business insights.
In the report from GigaOM, William McKnight used the latest industry-standard TPC Benchmark™ DS (TPC-DS). This benchmark consists of 99 different queries that represent the variety of complex and challenging workloads in a modern decision support system. This matters because when we move from fixed-price, on-premises cluster cost evaluations to a consumption-based cloud model, the equations for total cost of ownership and cost management fundamentally change.
In a cloud world, you can pay more per hour for a service yet still save money if that more expensive service performs the work at a much faster rate. You would pay an expert more per hour if they could get the work done in less time overall, so why not look at your cloud data warehouse through the same lens?
The GigaOM report shows that Cloudera’s performance, together with its competitive price/hour, makes it the cheapest cloud data warehouse when running the benchmark queries against a 30 TB data set. While this doesn’t guarantee the same results for your workloads, it does make the point that to have the best chance of optimizing cost management, you should consider how well a cloud data warehouse executes the workloads you need from it, and choose the one with excellent price/performance on your data, not just the best price/hour.
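The price/performance reasoning above comes down to simple arithmetic: in a consumption-based model, what you pay for a workload is roughly the hourly price multiplied by how long the workload runs. The short Python sketch below illustrates that idea only. The warehouse labels, prices, and runtimes are made-up numbers chosen for illustration; they are not figures from the GigaOM report or from any vendor's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Warehouse:
    """A hypothetical cloud data warehouse option (all values are invented)."""
    name: str
    price_per_hour: float   # assumed hourly price in dollars
    runtime_hours: float    # assumed time to finish the same workload

    @property
    def workload_cost(self) -> float:
        # Consumption-based cost: hourly price times hours actually used.
        return self.price_per_hour * self.runtime_hours


def cheapest_for_workload(options):
    """Return the option with the lowest total cost for the workload."""
    return min(options, key=lambda w: w.workload_cost)


# Illustrative numbers only: B costs more per hour but finishes sooner.
options = [
    Warehouse("A", price_per_hour=16.0, runtime_hours=5.0),
    Warehouse("B", price_per_hour=24.0, runtime_hours=2.5),
]

for w in options:
    print(f"{w.name}: ${w.price_per_hour:.2f}/hour x {w.runtime_hours} h "
          f"= ${w.workload_cost:.2f}")

best = cheapest_for_workload(options)
print(f"Cheapest for this workload: {best.name}")
```

With these invented inputs, the option with the higher hourly price turns out to be the cheaper one for the workload, which is the expert-per-hour analogy in numeric form. Substituting runtimes measured on your own data is what makes the comparison meaningful.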
In The Forrester Wave™: Cloud Data Warehouse, Q1 2021, Cloudera was ranked as a Strong Performer, and Forrester writes that “Cloudera Data Platform (CDP) supports a full data lifecycle ecosystem across hybrid and multiple clouds, delivering a shared data experience.” A move to the cloud should not create additional silos of data that force you to maintain separate security profiles, governance, metadata management services, and more for on-premises and for each cloud. Cloudera’s Shared Data Experience (SDX) seamlessly spans on-premises, private, and multiple public cloud data warehouses, using a single security and governance framework across all of them. Add to this the ability to run the same workload on premises and in the cloud, using the same technology in all deployment forms, and you can take your journey to the cloud at your own pace.
Forrester also writes, “Cloudera Data Warehouse is fully integrated with CDP to provide easy-to-use self-service and advanced analytics use cases at scale. It supports auto provisioning, cloud optimization, self-service workload management, and auto scaling capabilities.” These values are equally present in private cloud, based on OpenShift, and in public cloud, using Kubernetes. Making data warehouse resources easy to access, deploy, use, and scale gives you the flexibility to handle the everyday needs of the enterprise while also tackling the urgent demands of VIP analytics without impacting existing KPIs.
These values extend beyond just data warehousing. As Forrester writes, “Cloudera’s shared data experience provides consistent data security, governance, and control across all multifunction analytics and data discovery.” Cloudera fully integrates data engineering, machine learning, data flow and streaming, and data science in the same platform as data warehousing. Teams collaborate on the same data at the same time, in the same secure and governed environment, which increases agility even further.
We believe that Forrester’s positioning of Cloudera as a Strong Performer in its first participation in the cloud data warehouse Wave makes us well positioned to take on the broader challenges of hybrid cloud and the full data lifecycle.