Using Big Data for Financial Fraud Prevention

by Jonathan Hassell

Posted in Business | August 28, 2018 | 3 min read

This post was published on Hortonworks.com before the merger with Cloudera. Some links, resources, or references may no longer be valid.

Fraud methods are becoming more sophisticated, forcing banks and other financial institutions to invest in financial fraud prevention solutions. Some of the latest techniques involve behavioral detection based on big data, and—if wielded skillfully—they can be a most effective weapon.

Fighting Credit Card Fraud

The most common use of big data to combat financial fraud is in the credit card industry. And it’s an interesting problem, because it requires balancing two conflicting priorities. On the one hand, companies want to catch as many instances of fraud as possible; on the other, they want their customers to have a hassle-free experience.

To catch every instance of fraud, a credit card company would have to stop each transaction and put it through a detailed examination before approving it. This would, of course, create a terrible experience for cardholders. On the other hand, if the company simply approved all transactions without checking for fraud, steady losses would likely drive it out of business, and cardholders would be forced to look elsewhere. The goal is to balance these competing demands and find a happy medium.

Companies that utilize big data can dramatically enhance their fraud detection. For example, credit card companies can use data analytics to compare the geographical locations of in-person card swipes with the amount of time elapsed between them, a method called geotiming. If two in-person card payments occur in different locations without enough time having elapsed for the customer to travel between them, the credit card company can automatically flag the activity as indicative of fraud. By including additional types of data in the model, companies can further improve fraud detection accuracy.
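As a rough illustration of the geotiming idea, here is a minimal Python sketch. It is not any card issuer's actual method: the `Swipe` record, the function names, the 900 km/h speed ceiling (roughly airliner speed), and the 1 km tolerance for nearby terminals are all assumptions chosen for the example. It computes the great-circle distance between two in-person swipes and flags the pair when the implied travel speed is implausible.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass(frozen=True)
class Swipe:
    """Hypothetical record of one in-person card payment."""
    when: datetime
    lat: float  # degrees
    lon: float  # degrees


def haversine_km(a: Swipe, b: Swipe) -> float:
    """Great-circle distance between two swipes, in kilometers."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = (sin(dlat / 2) ** 2
         + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def is_geotiming_violation(first: Swipe, second: Swipe,
                           max_speed_kmh: float = 900.0,
                           min_distance_km: float = 1.0) -> bool:
    """Return True when the cardholder could not plausibly have traveled
    between the two swipes in the time elapsed.

    max_speed_kmh and min_distance_km are illustrative assumptions,
    not industry values.
    """
    distance = haversine_km(first, second)
    if distance <= min_distance_km:
        return False  # effectively the same place
    hours = abs((second.when - first.when).total_seconds()) / 3600.0
    if hours == 0:
        return True  # two distant places at the same instant
    return distance / hours > max_speed_kmh
```

With these defaults, a swipe in New York followed one hour later by a swipe in London would be flagged, while the same pair ten hours apart would not. A real system would treat this as one signal among many rather than as proof of fraud.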

Detecting fraud involves understanding that every cardholder has a different pattern of usage, so you need to build fraud detection models specific to each cardholder. For instance, a frequent flyer will need a looser geographic “fence” for fraud detection than a cardholder who travels infrequently. Other data points—the types of businesses a person frequents, the amounts of monthly bills over time (accounting for seasonal bumps), financial resources that the model can verify—can all be useful in building a more intelligent model.
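To make the per-cardholder idea concrete, here is a deliberately simple Python sketch. It swaps in a basic statistical baseline (mean and standard deviation of past purchase amounts) for the richer models the article has in mind, and the function names and the threshold of three standard deviations are assumptions for illustration only.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple

Profile = Tuple[float, float]  # (mean amount, standard deviation)


def build_profile(past_amounts: Sequence[float]) -> Profile:
    """Summarize one cardholder's purchase history (hypothetical helper)."""
    if len(past_amounts) < 2:
        raise ValueError("need at least two past transactions")
    return mean(past_amounts), stdev(past_amounts)


def is_unusual(amount: float, profile: Profile,
               z_threshold: float = 3.0) -> bool:
    """Flag an amount that deviates from this cardholder's own baseline.

    z_threshold is an illustrative assumption, not a recommended setting.
    """
    mu, sigma = profile
    if sigma == 0:
        return amount != mu  # history had no variation at all
    return abs(amount - mu) / sigma > z_threshold
```

Because each profile is built from one cardholder's own history, the same purchase can be judged differently for different people: a 500 dollar charge would be flagged for someone whose purchases usually run between 18 and 30 dollars, but not for someone who routinely spends 400 to 600. The same pattern extends to the other data points mentioned above, such as a wider geographic fence for frequent flyers.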

Detecting Internal Fraud

The second most common implementation of big data in financial fraud prevention is in detecting fraud within a company, usually initiated and monitored by a company’s compliance department. Typically, we see this in regulated financial entities that make investment decisions. The mechanism for detecting this kind of fraud is similar to that of credit card fraud models, but it is focused on the actions of internal employees. It involves gathering data on employee transactions, phone conversations, website visits, and other relevant work-related activities. Models are then built to define an acceptable pattern of behavior for a particular role. A key difference between this scenario and the payment card example is that these reviews generally have less of a time restriction: internal fraud detection often occurs within minutes or hours, whereas payment card fraud detection typically must occur in a matter of seconds.

Implementing a Big Data Strategy

Big data is a powerful tool for financial fraud prevention, but a data-based approach comes with obstacles of its own. Removing people from the process allows you to scale more effectively over a broader base, but it also removes the value of human judgment from the equation. Models need to be carefully reviewed to ensure you have appropriate checks and balances built in. One of the biggest risks is a faulty model that does things it isn’t supposed to do, because it will do them more quickly and on a much broader scale than a human being would.

In this environment, there are two very important things to keep in mind. First, more data will almost always equal better models: when you have more variables to work with, you can develop better algorithms to map the data. Second, you need enough computational power, because if you can run models more frequently, you can tweak them more frequently. Massive amounts of data and processing power allow you to sift through data very quickly, get to better model results sooner, and apply those learnings to further model refinement.

To learn more about using big data in the financial services industry to minimize risk, see predictive analytics and solutions for financial services.
