AML: Past, Present and Future – Part III

This is the third installment in a three-part series. The first installment provides a short background on anti-money laundering (AML). The second installment examines common AML problems faced by financial institutions today. In this installment, we introduce an approach that carries AML well into the future.

Part III: The future is now

Given what we know about current anti-money laundering systems, if we wanted to build one from scratch today, we might come up with the following requirements. The system must:

  • Ingest, process, analyze, store, and serve all types of AML data, be it structured (database tables), unstructured (contracts, e-mails, etc.), or digitized (scans)
  • Handle increases in data volume gracefully
  • Represent entity relationships, to help determine ultimate beneficial owner, contribute to risk scoring, and facilitate investigations
  • Support machine learning (ML) algorithms and data science activities, to help with name matching, risk scoring, link analysis, anomaly detection, and transaction monitoring
  • Provide audit and data lineage information to facilitate regulatory reviews
  • Adapt to changes in the business and regulatory requirements by facilitating the provisioning of new data and modeling of new scenarios
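
To make the entity-relationship requirement concrete, here is a minimal sketch of how ownership relationships could be traversed to estimate ultimate beneficial owners. It is an illustration only, not part of the Cloudera or riskCanvas products. The ownership data, the function names, the 25% threshold, and the rule that any party with no recorded owners counts as a natural person are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical ownership edges: (owner, owned entity, direct ownership fraction).
OWNERSHIP = [
    ("Alice", "HoldCo A", 0.60),
    ("Bob", "HoldCo A", 0.40),
    ("HoldCo A", "Target Ltd", 0.50),
    ("Carol", "Target Ltd", 0.30),
    ("Bob", "Target Ltd", 0.20),
]


def effective_ownership(edges, target):
    """Return {party: effective stake in target}, multiplying stakes along each chain.

    Assumption: a party with no recorded owners is treated as an ultimate owner.
    """
    owners_of = defaultdict(list)
    for owner, owned, share in edges:
        owners_of[owned].append((owner, share))

    totals = defaultdict(float)

    def walk(entity, stake, path):
        for owner, share in owners_of.get(entity, []):
            if owner in path:
                continue  # ignore circular holdings in this simple sketch
            if owner in owners_of:
                # The owner is itself owned by someone: keep walking up the chain.
                walk(owner, stake * share, path | {owner})
            else:
                totals[owner] += stake * share

    walk(target, 1.0, {target})
    return dict(totals)


def beneficial_owners(edges, target, threshold=0.25):
    """Keep only parties whose effective stake meets the (assumed) threshold."""
    stakes = effective_ownership(edges, target)
    return {party: stake for party, stake in stakes.items() if stake >= threshold}


if __name__ == "__main__":
    for party, stake in sorted(effective_ownership(OWNERSHIP, "Target Ltd").items()):
        print(f"{party}: {stake:.0%}")
```

In this toy data set, Bob holds a direct stake in Target Ltd plus an indirect stake through HoldCo A, which is the kind of layered ownership a flat table lookup tends to miss. At production scale the same traversal would more likely run on a distributed graph framework than in a single Python process.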

With these in mind, Cloudera and Booz Allen Hamilton have joined forces to offer a next generation solution for AML. The solution combines Cloudera Enterprise, the scalable distributed platform for big data, machine learning, and analytics, with riskCanvas, the financial crime software suite from Booz Allen Hamilton.

[Figure: riskCanvas (rC) and SDX]

Cloudera Enterprise

The foundation of this end-to-end AML solution is Cloudera Enterprise. It supports a variety of storage engines that can handle raw files, structured data (tables), and unstructured data. It also supports a number of frameworks that can process data in parallel, in batch or in streams, in a variety of languages: SQL, Python, R, Java, and Scala are all widely used on the platform. Storage and processing can scale to petabytes, which eliminates the need to offload data to a slower storage medium. Having fast online access to years of AML data helps with investigations and data science activities.

Cloudera Enterprise comes with a search engine to support indexing for fast access and entity resolution. Graph data can be stored in Apache HBase or Apache Kudu and processed with Apache Spark. Together, they enable analysis of entity relationships and networks. Spark also enables data science at scale. It can work in combination with Cloudera Data Science Workbench (CDSW), a tool for data scientists, to run experiments, train ML models, enable collaboration, and deploy models to production. Cloudera Enterprise also ships with SDX, a unified metadata layer that provides a data catalog, security policies, access logs, and data lineage. SDX facilitates governance over enterprise data, which helps institutions respond to regulatory inquiries.

Cloudera Enterprise has been deployed to great success in a number of financial crime prevention and digital surveillance applications, including AML, fraud, trader surveillance, and cybersecurity. With Cloudera Enterprise, our customers have been able to:

  • Integrate AML data from dozens of bank branches globally, amounting to tens of millions of transactions per day, while saving millions of dollars per year compared to scaling the legacy implementation
  • Reduce transaction monitoring false positive rates by half using machine learning models against big data containing tens of thousands of features
  • Uncover millions of dollars worth of previously undetected instances of fraud and financial crime, by analyzing petabytes of transaction history

In years past, financial institutions would incrementally integrate the Cloudera Enterprise data platform into various aspects of their existing legacy AML or fraud prevention systems. These tended to be large banks with complex requirements and a technology staff capable of assembling a big data AML system tailored to their specific needs. They brought in Cloudera Enterprise to address an immediate problem: for example, to handle an uptick in transaction volume, or to reduce the cost of archiving historical data. They would also bring in complementary software for data integration, visualization, and analytics that ran on top of the Cloudera stack. All the AML-specific logic to support KYC, transaction monitoring, investigations, case management, and analytics was developed in-house.


The riskCanvas application suite presents an alternative to in-house developed big data AML solutions. It is an end-to-end AML and financial crime prevention solution built for big data and machine learning that leverages the full capabilities of the Cloudera Enterprise data platform. It features modules for analytics, customer due diligence, transaction monitoring, and investigation management, all woven together by Tesseract, a data integration engine built into the riskCanvas solution.

riskCanvas follows a modular architecture. Modules may be deployed independently, but they work best when deployed together. riskCanvas consists of the following modules:

riskCanvas Analytics Suite

  • Suite of open source analytics tools integrated with the riskCanvas platform to provide advanced analytics, machine learning, prototyping, dashboarding, automated risk assessments, and more

riskCanvas Entity Analytics

  • Entity Resolution and Data Enrichment
  • Entity Risk Scoring
  • Network Analysis
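
As a rough illustration of the name matching that underlies entity resolution, the sketch below normalizes names and scores their similarity with Python's standard `difflib` module. This is not how riskCanvas resolves entities. The function names, the sample watchlist, and the 0.85 threshold are hypothetical, and a production system would likely combine more attributes (dates of birth, addresses, identifiers) with trained models.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(name):
    """Strip accents, lowercase, drop punctuation, and sort tokens."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    tokens = re.findall(r"[a-z0-9]+", without_accents.lower())
    # Sorting makes "Garcia, Jose" and "Jose Garcia" compare as equal.
    return " ".join(sorted(tokens))


def similarity(a, b):
    """Similarity in [0, 1] between two names after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def screen(name, watchlist, threshold=0.85):
    """Return watchlist entries scoring at or above the (assumed) threshold, best first."""
    scored = [(entry, similarity(name, entry)) for entry in watchlist]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical watchlist entries, for illustration only.
    watchlist = ["Garcia, José", "Acme Trading LLC"]
    for entry, score in screen("Jose Garcia", watchlist):
        print(f"{entry}: {score:.2f}")
```

Lowering the threshold catches more spelling variants at the cost of more false positives, which is the trade-off that machine-learned matchers aim to improve.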

riskCanvas Transaction Surveillance

  • Real-Time Detection and Rules-Based Monitoring
  • Behavioral Anomaly Detection
  • Supervised and Unsupervised Machine Learning
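
The difference between rules-based monitoring and behavioral anomaly detection can be sketched in a few lines. The example below is a simplified illustration and not the riskCanvas implementation: the transaction fields, the rule, the 10,000 limit, and the z-score cutoff of 3.0 are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def rule_large_cash(txn, limit=10_000):
    """Static rule: flag cash transactions at or above a fixed (assumed) limit."""
    return txn["type"] == "cash" and txn["amount"] >= limit


def zscore_anomaly(amount, history, cutoff=3.0):
    """Behavioral check: flag amounts far from this customer's own past amounts."""
    if len(history) < 2:
        return False  # not enough history to estimate typical behavior
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return amount != mu
    return abs(amount - mu) / sigma > cutoff


def monitor(txn, history):
    """Return the list of alert labels raised for one transaction."""
    alerts = []
    if rule_large_cash(txn):
        alerts.append("large_cash")
    if zscore_anomaly(txn["amount"], history):
        alerts.append("behavioral_anomaly")
    return alerts


if __name__ == "__main__":
    # Hypothetical past transaction amounts for one customer.
    history = [120, 95, 110, 130, 105]
    print(monitor({"type": "wire", "amount": 9_500}, history))
    print(monitor({"type": "cash", "amount": 12_000}, history))
    print(monitor({"type": "cash", "amount": 115}, history))
```

The 9,500 wire passes the static rule but stands out against the customer's own history, which is the kind of activity a purely rules-based system can miss.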

riskCanvas Investigation Management

  • Case Management and Workflow
  • Investigation Acceleration
  • Single View of the Customer

riskCanvas Tesseract

  • Dynamic data ingest and processing system for AML data

Booz Allen Hamilton riskCanvas is a built-for-purpose financial crime software suite that harnesses the power of big data and machine learning provided by Cloudera Enterprise. With riskCanvas, regulated institutions can deploy a next-generation AML system, integrated with source data systems, in a matter of a few months. Within a relatively short span of time, you can start to realize benefits in the form of improved false positive rates, accelerated investigations, and lower regulatory risk.

To learn more about Next-Gen AML, please visit the Cloudera Solutions Gallery.
Come visit the Cloudera booth at Strata Data NY, where we will feature a demo of the solution and a presentation by Booz Allen Hamilton experts.

Patrick Angeles