Cloudera Data Science Workbench Archives - Page 3 of 10

April 19, 2021 | Technical

Deep Learning with Nvidia GPUs in Cloudera Machine Learning

Introduction: In our previous blog post in this series, we explored the benefits of using GPUs for data science workflows, and demonstrated how to set up sessions in Cloudera Machine Learning (CML) to access NVIDIA GPUs for accelerating machine learning projects. While the time-saving potential of using GPUs for complex and large tasks is massive, […]

by Brian Law 5 min read

April 10, 2021 | Business

Enabling NVIDIA GPUs to accelerate model development in Cloudera Machine Learning

When working on complex or rigorous enterprise machine learning projects, Data Scientists and Machine Learning Engineers experience various degrees of processing lag when training models at scale. While model training on small data can typically take minutes, doing the same on large volumes of data can take hours or even weeks. To overcome this, practitioners often […]

by Peter Ableda 4 min read

Data Science Machine Learning Cloudera Data Platform (CDP) Cloudera Data Science Workbench Machine Learning Data Science Governance Machine Learning Modernize Architecture Performance

February 25, 2021 | Business

Change The Way You Do ML With Applied ML Prototypes

Today’s enterprise data science teams have one of the most challenging, yet most important, roles to play in your business’s ML strategy. In our current landscape, businesses that have adopted a successful ML strategy are outperforming their competitors by over 9%. The implications of ML for the future of business are clear. However, only 4% […]

by Cloudera, Santiago Giraldo 6 min read

Business Analytics Data Science Machine Learning Cloudera Data Platform (CDP) Cloudera Data Science Workbench Fast Forward Labs Research Machine Learning Data Science Machine Learning Modernize Architecture

January 20, 2021 | Technical

Building a Machine Learning Application With Cloudera Data Science Workbench And Operational Database, Part 3: Productionization of ML models

In this last installment, we’ll discuss a demo application that uses PySpark.ML to build a classification model from training data stored in both Cloudera’s Operational Database (powered by Apache HBase) and Apache HDFS. This model is then scored and served through a simple Web Application. For more context, this demo is based […]

by Manas Chakka 5 min read

Machine Learning Cloudera Data Platform (CDP) Cloudera Data Science Workbench Machine Learning Operational DB Machine Learning Ops and DevOps

January 13, 2021 | Business

2020 Data Impact Award Winner Spotlight: United Overseas Bank

2020 was a year of immense change and disruption. Despite the challenges, 2020 also provided opportunities for forward leaps in digital transformation. At Cloudera, an example of this leap is our first virtual Data Impact Awards, which was held in November last year. One of our standout […]

by Arielle Diamond 2 min read

Apache Hive Apache Impala Business Analytics Cloudera Data Platform (CDP) Cloudera Data Science Workbench Data Warehouse Financial Services Customer Analytics

January 13, 2021 | Technical

Building a Machine Learning Application With Cloudera Data Science Workbench And Operational Database, Part 2: Querying/ Loading Data

In this installment, we’ll discuss how to do Get/Scan Operations and utilize PySpark SQL. Afterward, we’ll talk about Bulk Operations and then troubleshoot some errors you may come across while trying this yourself. Read the first blog here. Get/Scan Operations Using Catalogs: In this example, let’s load the table ‘tblEmployee’ that we made in the […]

by Manas Chakka 5 min read

Apache Spark Machine Learning Cloudera Data Platform (CDP) Cloudera Data Science Workbench Machine Learning Operational DB Machine Learning Modernize Architecture

January 6, 2021 | Business

Building a Machine Learning Application With Cloudera Data Science Workbench And Operational Database, Part 1: The Set-Up & Basics

Introduction: Python is used extensively among Data Engineers and Data Scientists to solve all sorts of problems, from ETL/ELT pipelines to building machine learning models. Apache HBase is an effective data storage system for many workflows, but accessing this data specifically through Python can be a struggle. For data professionals who want to make use […]

by Manas Chakka 5 min read

Apache HBase Data Science Machine Learning Cloudera Data Platform (CDP) Cloudera Data Science Workbench Machine Learning Data Science Machine Learning

December 21, 2020 | Technical

An A-Z Data Adventure on Cloudera’s Data Platform

In this blog we will take you through a persona-based data adventure, with short demos attached, to show you the A-Z data worker workflow expedited and made easier through self-service, seamless integration, and cloud-native technologies. You will learn all the parts of Cloudera’s Data Platform that together will accelerate your everyday Data Worker tasks. This […]

by Eva Nahari, Balazs Gaspar, Jon Ingalls, Karthik Krishnamoorthy, Shaun Ahmadian 8 min read

Business Analytics Cloud Data Science Machine Learning Cloudera Data Platform (CDP) Cloudera Data Science Workbench Data Engineering Data Hub Data Warehouse Fast Forward Labs Research Machine Learning SDX Technologies Education Energy & Utilities Financial Services Healthcare & Life Sciences Insurance Manufacturing & Automotive Public Sector Retail, Ecommerce & Consumer Products Technology Telecommunications Customer Analytics Data Science Governance Machine Learning Modernize Architecture

December 11, 2020 | Business

Covid Data: An anomalous blip, or the new normal?

COVID-19 has forced virtually every industry to accelerate its digital capabilities. While it can be argued that digital transformation was already underway, it’s hard to dispute that it has accelerated in recent months. A recent McKinsey survey, cited in CRN, shows that worldwide, 58 percent of customer interactions were digital as of July […]

by Sandra Horn 3 min read

Data 360 Machine Learning Cloudera Data Platform (CDP) Cloudera Data Science Workbench Data Engineering Financial Services Insurance Data Ingestion Machine Learning Security, Risk, & Compliance

December 7, 2020 | Technical

Global View Distributed File System with Mount Points

Apache Hadoop Distributed File System (HDFS) is the most popular file system in the big data world. The Apache Hadoop File System interface has provided integration with many other popular storage systems, such as Apache Ozone, S3, and Azure Data Lake Storage. Some HDFS users want to extend the HDFS Namenode capacity by configuring Federation of […]

by Uma Maheswara Rao Gangumalla 9 min read

Apache Hadoop Apache Ozone S3Guard Cloudera Data Platform (CDP) Cloudera Data Science Workbench Data Engineering Data Warehouse Modernize Architecture
