When you hear that large tech enterprises like PayPal have discovered millions of fraudulent accounts within their customer base, you might think that it’s their problem to deal with. Or maybe even that it’s their fault for not preventing it.
But the truth is that fraud affects everybody. Not only do companies have to foot the bill for fraud losses, but that cost is often passed on to consumers of their product or service (all of us).
And fraudsters are always evolving, which leads to an increase in both the quantity and sophistication of attacks: According to SIFT, payment fraud attacks against fintech companies increased by 70% in 2021 alone. Beyond the cost that is passed on to consumers, fraud is also a tax on economic progress. It’s money that is diverted away from good or productive causes and instead put towards criminal uses.
So this is not any single company’s problem, but an industry-wide issue.
Given the gravity of this problem, my co-founder and I were genuinely surprised to learn that even the most innovative fintechs still relied on manual document reviews to assess fraud and credit risk. But it didn’t take us long to figure out why: document fraud is difficult to solve.
The best risk and operations teams are armed with experience and intuition. They’ve seen thousands of documents and can identify when something isn’t right. But in an increasingly digital, geographically diverse, and faster world, it’s becoming more difficult for those experts to maintain a working knowledge of all the various documents coming into their queues.
And even the most dedicated risk management professionals can’t identify fraud that’s invisible to the human eye (thanks to the widespread adoption of photo-editing software such as Photoshop by fraudsters).
These frontline fraud fighters must be better equipped to face the challenges of today.
Though my co-founder — who is also my twin brother — and I hadn’t spent our careers in the fintech industry, it was perhaps our engineering mindsets that set us to the task: We see a problem and can’t help but immediately wonder if it’s possible to build something that will help.
But first, we needed to gain a deeper understanding.
When you're trying to create detection capabilities for a new fraud vector, you have to spend a lot of time talking to people on the ground to understand what's actually fraudulent or not.
Other fraud detection companies often have the luxury of close parallels and comparables that they can use to generate labeled data. In our case, however, we needed to understand what heuristics our customers were currently using to detect document fraud.
We’d seen fraud detection and document automation companies in our space try to build a perfect solution right out of the gate without talking to customers — but they had since shut down. They weren’t able to get over the cold start problem; they weren’t able to build a product from the ground up because they didn’t have access to the data their customers were using.
This comes back to the first rule of machine learning: Start with data, not machine learning. If you don't have a good dataset, you're wasting your time. You'll end up either choosing the wrong model or training a model that won’t perform the way you expect.
So we needed access to real-world data. But how? We couldn’t purchase it online or come up with it ourselves, so we decided to take a partnership approach with our customers. Not by saying that we solved the problem — but that we thought we could solve the problem. And in this way, we built trust. That allowed us to access their data and build a solution together.
It’s also how we learned that most document fraud is difficult to identify or even invisible to the human eye. Take a fraudster who edits their pay stub to inflate their salary in hopes of getting a larger line of credit: They might use a sans-serif font for the edited text instead of the Times New Roman font used in the rest of the document. But even a highly trained human reviewer could miss this if they’re distracted, tired, or haven’t seen this particular document before.
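To make the pay stub example concrete, here is a minimal sketch of a font-mismatch heuristic. It is an illustration only, not a description of any production detector: the `TextSpan` type, the `find_font_outliers` function, the `min_share` threshold, and the sample data are all hypothetical, and the sketch assumes some document parser has already extracted each text span along with its font name.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class TextSpan:
    text: str
    font: str  # font name as reported by a hypothetical document parser


def find_font_outliers(spans, min_share=0.8):
    """Return spans whose font differs from the document's dominant font.

    Nothing is flagged unless a single font covers at least `min_share`
    of the spans, since a document with many fonts has no clear baseline.
    """
    if not spans:
        return []
    counts = Counter(span.font for span in spans)
    dominant_font, dominant_count = counts.most_common(1)[0]
    if dominant_count / len(spans) < min_share:
        return []
    return [span for span in spans if span.font != dominant_font]


# Made-up pay stub: one line was edited in a different font.
pay_stub = [
    TextSpan("Employee: Jane Doe", "Times New Roman"),
    TextSpan("Employer: Example Corp", "Times New Roman"),
    TextSpan("Pay period: 01/01 - 01/15", "Times New Roman"),
    TextSpan("Hours: 80", "Times New Roman"),
    TextSpan("Rate: 25.00", "Times New Roman"),
    TextSpan("Gross pay: 9,500.00", "Arial"),
]

for span in find_font_outliers(pay_stub):
    print(f"Suspicious span: {span.text!r} uses {span.font}")
```

A rule like this never gets tired, but it is also easy to fool, which is part of why simple heuristics alone are not enough.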
Machines don’t get tired or distracted, so we knew they’d be able to help. We started labeling the data from our customers and training our machine learning models to mimic the heuristics of a human review team.
But getting access to our customers' data and learning what heuristics their manual review teams were already using wasn’t enough. Simple heuristics weren’t enough.
We had built strong relationships with these customers, and we were committed to continually improving the product so that it best served their needs. We knew that we could use AI and machine learning to identify more fraud, automate more parts of the manual review process, and truly delight our customers.
Many of our customers told us they wanted to expand their own customer base, but that would mean accepting more new types of documents that their teams weren’t familiar with. And by now, we knew that unique or unfamiliar document types are one way that fraudsters can successfully slip through the cracks.
But risk and operations teams don’t need to rely on their team members to memorize every type of document in the world, not when they have the help of machines, which have an almost superhuman ability to remember everything they have seen.
Machine learning models can very effectively codify what looks suspicious, what looks legitimate, and what needs further review. As we’ve gotten more data and our customers have gotten more diverse (with more document types), we’ve already seen a massive uptick in terms of what a machine can do versus a human reviewer alone.
And we’re really only just getting started: We’re applying increasingly sophisticated models to larger datasets. We’re making it easier to collect verified data and helping fintechs analyze the credit patterns of their customers. We’re discovering the latest trends in fraud. And we’re even helping our customers create better experiences for their own customers with more efficient KYC/KYB and underwriting workflows.
But we’re only doing those things because we’ve continued to stay close to our customers and really focus on empathizing with their pain points. It’s very easy to get caught up in your own assumptions and what you think is right when building a product, especially when you have strong convictions (as you’ve likely noticed that we do). However, one thing I believe strongly, and we believe internally at Inscribe, is that we can have our own opinions and thoughts about something — as long as we always go back to our customers as our source of truth.
One of the biggest challenges with AI and machine learning is explaining the results to humans. So we’ve rigorously kept feedback loops open to understand which fraud results and other aspects of our solution make sense to our customers and which need more explanation.
Explainability is essential, especially in financial services. Not only do we want to protect the rights of consumers and businesses, but we also want to offer a product that our customers can easily understand and use to provide great experiences for their own customers.
So understanding why exactly we flagged a document as fraudulent is important. In the case where an application is rejected, our customer can go back to their end user and say, “X, Y, and Z happened.”
But we also prioritize explainability for less specialized reasons. Our customers need to be able to stand behind each decision they make based on the data we give them. They need to be able to understand that the data that we trained our models on is representative of their data; they need to be able to understand what features we are using to detect fraud; and they need to be able to audit it.
Because we started with a very heuristic approach to machine learning modeling, we've been able to always maintain a human level of understanding.
Anytime we talk to a customer and they say, “I used to do X-Y-Z,” we try to keep a lot of those building blocks in our eventual products and not hide behind just one big black box.
That's really helped us maintain that level of explainability for our customers. And as we've grown, we've been able to avoid the case where it's just one big black box: you give us a customer application and we say yes or no at the end. Instead, we try to keep as many of our models separate as possible and give our end users an individual output from each.
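As a rough sketch of what "separate models with individual outputs" can look like in code, the example below runs independent checks and reports each result with its own reason, with no single combined yes-or-no verdict. Everything here is hypothetical and simplified for illustration: the `Signal` type, the two checks, and the dictionary fields (`fonts`, `line_items`, `stated_total`) are assumptions, not an actual product schema.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    detector: str  # which check produced this result
    flagged: bool  # True if the check found something suspicious
    reason: str    # human-readable explanation for the reviewer


def check_fonts(doc):
    """Flag documents that mix more than one font."""
    fonts = sorted(set(doc["fonts"]))
    if len(fonts) > 1:
        return Signal("fonts", True, "multiple fonts found: " + ", ".join(fonts))
    return Signal("fonts", False, "one font used throughout")


def check_totals(doc):
    """Flag documents whose line items do not add up to the stated total."""
    computed = round(sum(doc["line_items"]), 2)
    stated = round(doc["stated_total"], 2)
    if computed != stated:
        return Signal(
            "totals",
            True,
            f"line items sum to {computed:.2f} but stated total is {stated:.2f}",
        )
    return Signal("totals", False, "line items match stated total")


# Each detector stays separate, so each output can be audited on its own.
DETECTORS = [check_fonts, check_totals]


def review(doc):
    """Run every detector and return all signals, flagged or not."""
    return [detector(doc) for detector in DETECTORS]


# Made-up application data for illustration.
application = {
    "fonts": ["Times New Roman", "Arial"],
    "line_items": [2000.0, 500.0],
    "stated_total": 9500.0,
}

for signal in review(application):
    status = "FLAG" if signal.flagged else "ok"
    print(f"[{status}] {signal.detector}: {signal.reason}")
```

Because each signal carries its own reason, a reviewer can point to the specific finding behind a rejection, which is the "X, Y, and Z happened" conversation described earlier.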
As engineers, we tend to think that if we build a good product, customers will come to us. But that’s not the case.
Talking to customers has always been a part of our DNA, even from our early days at Y Combinator. That program really instilled the idea of focusing on what you can control: talking to customers, building product.
So we’ll continue to put customer empathy at the heart of everything we do. We know that manual reviewers have some interesting insights that our machines might be missing, and we want to hear about them. We want to get that feedback into our system.
I’ve already had a few customer calls today, and then I’m having pizza with a few customers later. We find that building relationships with our customers is an extremely useful business activity not only because it helps us build the best AI-powered fraud detection on the market but because we can ensure that we’re solving their problems and providing value.