Big Data Analytics and Better Modeling Are Changing the Mortgage Industry

by Jonathan Hassell

Posted in Business | January 03, 2018 | 4 min read

This post was published on Hortonworks.com before the merger with Cloudera. Some links, resources, or references may no longer be valid.

Anyone who’s bought a house without paying the full price upfront knows that applying for a mortgage is anything but simple and straightforward. Even the customers with the most solid financial footings go through lengthy and inconvenient processes to borrow for a home. However, financial institutions and other money services businesses are starting to investigate using big data analytics to cut down on the red tape in mortgage applications and make it easier for their customers to apply for loans.

How Big Data Changes Mortgage Applications

Mortgage banking has long been about modeling loan performance appropriately, based on a financial institution’s business goals, appetite for and ability to take risks, and the overall economy. Risk departments have used models to drive decisions about how many loans should be issued, what types of pricing and mitigation should be applied to individual loans, and what demographics and segments of the economy represent good loans for a bank or credit union.

More Data, Greater Assessments

The key difference with the emergence of big data analytics is the amount of data at companies’ disposal. Models trained on more data have become much better at assessing risk and predicting how a portfolio of mortgages will behave. They succeed because they draw on a much broader base of data types, including a prospective customer’s geolocation and transaction history, and can compare an applicant with similar people who have behaved in a particular way.

Data is also more available. An organization can collect far more data on one person than used to be possible and, once assembled, that data can paint a dramatically different picture. Data sets covering anything from geolocation to credit card transactions and store sales are starting to be offered for sale to lending institutions and other businesses, and that data can be combined with existing models to generate better decisions.

The mathematical models in use today have improved in recent years because more input data allows finer tuning and adjustment, and far better training over much larger data sets. Simply put, there are more types of input, and that leads to better decision-making.

Data Leads to Faster Decision-Making

The speed of mortgage decisions is also improving, thanks to several factors. Today’s big data infrastructure is more robust and generally runs in near real time. Most big data implementations can store large volumes of data, of different types and for different functions, effectively and cheaply. This infrastructure also gives easy access to historical data for tuning and validating models, so analysts can run them quickly.

Key to these new capabilities is a computational engine that runs models in parallel at high speed, along with a streaming capability that runs a model on the fly as new data comes in.
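As a rough illustration of these two capabilities, the Python sketch below scores a batch of applications in parallel and then scores a stream one record at a time. The scoring function, its field names (monthly_debt, monthly_income, loan_amount, home_value), and its equal weights are hypothetical placeholders, not a real underwriting model, and a production system would more likely use a distributed engine than a local thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, Iterator, List

Application = Dict[str, float]


def score_application(app: Application) -> float:
    """Hypothetical risk score in [0, 1]; higher means riskier."""
    debt_to_income = app["monthly_debt"] / app["monthly_income"]
    loan_to_value = app["loan_amount"] / app["home_value"]
    # Equal weights are an assumption made purely for illustration.
    return round(min(1.0, 0.5 * debt_to_income + 0.5 * loan_to_value), 4)


def score_batch(apps: List[Application], workers: int = 4) -> List[float]:
    """Score historical applications in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score_application, apps))


def score_stream(apps: Iterable[Application]) -> Iterator[float]:
    """Score applications one at a time as they arrive."""
    for app in apps:
        yield score_application(app)
```

The batch function stands in for rerunning a model over historical data, while the generator stands in for scoring each new application as it comes in.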

Potential Pitfalls to Be Aware Of

Ultimately, the idea behind drawing conclusions from big data is to build models and automation that drive better business decisions. But there are pitfalls to watch out for.

Losing the Human Touch

From a lending standpoint, as these models get better, the decision on whether someone should be approved can be driven more by the computer and less by a human decision. The removal of the human element in decision-making has advantages and disadvantages.

If a model is making a choice, it’s typically programmed to examine only risk factors, which leaves it blind to other factors a human might consider. On the other hand, a human’s decision can be influenced by biases that have nothing to do with risk. Removing bias is a clear benefit, but it also removes the ability to make judgment calls, which can mean that a borderline good loan is declined or a borderline bad one is approved.

Thus, a good model will always have two quality assurance steps: additional safety checks to ensure you don’t come up with the wrong answer because of a faulty model, and a review of decisions so that a human can assess the overall picture and either sustain or override the decision.
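The two quality assurance steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the score range, the approve_below and decline_above thresholds, and the Decision type are all hypothetical, and a real lender would set them through its own risk policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    approved: bool
    needs_human_review: bool
    reason: str


def decide(score: float,
           approve_below: float = 0.4,
           decline_above: float = 0.7) -> Decision:
    """Turn a model risk score into a decision, with a safety check first."""
    # Step 1: safety check. A score outside [0, 1] (or NaN) suggests a
    # faulty model or bad input, so it is never auto-decided.
    if not (0.0 <= score <= 1.0):
        return Decision(False, True, "score out of range; possible model fault")
    if score < approve_below:
        return Decision(True, False, "low risk")
    if score > decline_above:
        return Decision(False, False, "high risk")
    # Step 2: borderline cases go to a human underwriter.
    return Decision(False, True, "borderline; route to underwriter")


def final_decision(model: Decision,
                   human_override: Optional[bool] = None) -> bool:
    """A reviewer can sustain the model (None) or override it (True/False)."""
    if human_override is not None:
        return human_override
    return model.approved
```

For example, decide(0.5) is flagged for review, and final_decision(decide(0.5), human_override=True) records an underwriter approving the borderline loan.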

Getting Overwhelmed by Increased Loan Volumes

Another risk of using data analytics in your mortgage and lending efforts is volume. With better models and decision-making, an organization can develop an unwieldy appetite for loans. As volume increases, however, risk can grow quickly, and any negative consequences spread faster across a larger base of borrowers.
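One simple way to see how volume magnifies risk is the standard expected-loss calculation: exposure multiplied by probability of default and by loss given default. The Python sketch below applies it to a portfolio of identical loans; the function name and all figures are illustrative assumptions, not data from any lender.

```python
def expected_loss(n_loans: int,
                  avg_balance: float,
                  default_prob: float,
                  loss_given_default: float) -> float:
    """Expected loss for a portfolio of identical loans.

    Simplifying assumption: every loan has the same balance, the same
    probability of default, and the same loss given default.
    """
    exposure = n_loans * avg_balance
    return exposure * default_prob * loss_given_default
```

Under these assumptions, 1,000 loans of $250,000 with a 2% default probability and a 40% loss given default carry an expected loss of $2,000,000. If volume triples to 3,000 loans and looser standards push the default probability to 3%, the expected loss rises to $9,000,000, more than four times the original figure.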

Being Outsmarted by Fraud

Finally, there’s a risk in how intelligent and tolerant models can be in driving lending decisions. In particular, fraud prevention needs to be factored in early on, and models need to be smart enough to prevent gaming. If bad actors know that a model relies on, say, three factors, they may try to manipulate those factors to produce the result they want. Modelers and analysts must understand this issue and ensure that their models can detect and resist it.
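One possible defensive check, sketched below in Python, flags applications that clear every cutoff by only a sliver, a pattern that may suggest the inputs were tuned to the model. This is a hypothetical heuristic rather than an established fraud control: the factor names, cutoffs, and 2% margin are assumptions, and it treats every factor as "higher is better" with a minimum cutoff.

```python
from typing import Dict


def near_threshold_count(values: Dict[str, float],
                         cutoffs: Dict[str, float],
                         margin: float = 0.02) -> int:
    """Count factors that pass their cutoff by no more than `margin` (relative)."""
    count = 0
    for name, cutoff in cutoffs.items():
        value = values[name]
        if cutoff <= value <= cutoff * (1 + margin):
            count += 1
    return count


def looks_gamed(values: Dict[str, float],
                cutoffs: Dict[str, float],
                margin: float = 0.02) -> bool:
    """Flag an application when every factor barely clears its cutoff."""
    return near_threshold_count(values, cutoffs, margin) == len(cutoffs)
```

With cutoffs of 680 for credit_score, 24 for months_employed, and 10 for down_payment_pct, an application reporting 681, 24, and 10.1 would be flagged for a closer look, while one reporting 750, 60, and 20 would not. A flag like this is a prompt for review, not proof of fraud.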

Ultimately, the use of big data analytics for mortgages drives better institutional profitability: more loans close in less time, perform better overall, and carry a reduced risk of loss. As long as you’re mindful of the common trouble areas, you can capitalize on data’s value and see the results you’re looking for.

For more examples of how modeling data allows businesses in the financial sector to increase their profitability, check out this piece on data insights and the mortgage industry.
