Every fraud team has a version of this story: a spike in suspicious traffic, a model threshold nudged down, and a flurry of declines. The dashboard goes green. Leadership nods approvingly. The fraud numbers look clean.
What doesn’t show up in that report is what happened next:
- The marketing VP for a regional software company, logging in from her company VPN, is locked out of her account during a vendor renewal.
- The small business owner, completing a payment at checkout, is declined with no explanation and decides to buy from a competitor instead.
- The enterprise customer doesn’t complain. They just leave.
Call this decision friction: the compounding cost of false positives on the customer experience and the revenue line.
Unlike fraud losses, it doesn’t surface in incident reports. It shows up in conversion data, churn metrics, and support queues. It’s attributed to a dozen causes but rarely traced back to the security layer that caused it.
The instinct when false-positive rates climb is to recalibrate the model.
The real problem is what the model is working with.
The Environment Has Changed. The Detection Stack Hasn’t.
Three structural dynamics are compressing the signal quality that automated systems depend on and making the same traffic look very different from how it looked five years ago.
- Shared infrastructure is now the norm. Enterprise networks often route tens of thousands of employees through a small pool of public IP addresses. Carrier-grade NAT (CGNAT) extends this model to ISPs, where large subscriber bases are multiplexed over limited IPv4 space. As a result, a single IP address may correspond to one user at a given moment or represent tens of thousands of distinct users over time. Without additional context, a detection system cannot reliably distinguish between these cases.
- Residential proxy networks have evolved into sophisticated fraud infrastructure. Unlike datacenter proxies, residential IPs appear legitimate because they originate from real consumer devices connected through bona fide ISP subscriptions. As Google’s disruption of the IPIDEA network illustrated, these networks can reach massive scale, often by enrolling consumer devices without their owners’ meaningful awareness. The result is a fraud infrastructure that, at the IP level, looks indistinguishable from legitimate household traffic.
- Detection stacks are still treating the presence of a proxy as a verdict. A proxy flag is context, not a conclusion. Blanket blocking of VPN or proxy traffic (without understanding what that traffic represents) alienates legitimate users while sophisticated attackers pivot to harder-to-detect infrastructure.
This last point adds a dimension that most fraud teams don’t explicitly discuss: the false positive problem is partly manufactured by adversaries. Sophisticated attackers deliberately route through residential proxies and shared infrastructure precisely because of the effect it produces: detection systems either fail to catch the attack, or they catch it by blocking thousands of legitimate users alongside it. The malicious traffic hides inside legitimate-looking signals by design, so blocking it requires collateral damage. That asymmetry is not incidental, and it is not a flaw in the attacker’s approach. It is the strategy.
The Automation Amplification Problem
Here’s the argument that most analyses of false positives miss: in automated systems, a bad signal doesn’t produce just one bad decision. It can produce millions of them, at machine speed, before anyone notices.
Consider this example. An automated decisioning system processing 50,000 transactions per hour (not unusual at enterprise volume) with a 3% false positive rate is not making 1,500 mistakes. It is making 1,500 mistakes per hour, continuously, against customers who have done nothing wrong. That’s 36,000 legitimate transactions incorrectly blocked every 24 hours. The numbers are illustrative. The dynamic is not.
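The arithmetic behind that example can be checked in a few lines. This is a minimal sketch that assumes, as the example does, that the 3% rate applies to every transaction in the hour; the function name `false_declines` and its parameters are illustrative and not taken from any particular system.

```python
def false_declines(tx_per_hour: int, false_positive_rate: float, hours: int = 1) -> int:
    """Legitimate transactions wrongly blocked over a number of hours.

    Assumes the rate applies to every transaction, as in the example above.
    """
    return round(tx_per_hour * false_positive_rate * hours)


per_hour = false_declines(50_000, 0.03)
per_day = false_declines(50_000, 0.03, hours=24)
print(per_hour, per_day)  # 1500 36000
```

Swapping in a real hourly volume and a measured false positive rate gives a rough sense of how quickly the count grows before anyone reviews it.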
The scale changes the nature of the problem. A 3% false-positive rate is not a tuning problem to iterate on. At those volumes, it is a structural failure running in production. And the customers on the receiving end don’t know that. They see a decline, a lockout, or a friction event. And then they decide whether to try again or take their business elsewhere.
What would have changed that outcome is not a better-calibrated model. It’s a better signal feeding the model in the first place.
The Same Traffic, Read Differently
Consider two scenarios.
The first: a corporate employee connects to their enterprise VPN gateway in Chicago, authenticates, and initiates a software purchase. The IP is flagged — high-activity, shared infrastructure, VPN detected. Risk score elevated. The transaction is stepped up or declined.
The second: a fraud ring testing stolen credentials rotates through residential IPs across the same Chicago metro area, each appearing to originate from a different household. The IPs are clean. No proxy checks in place. Risk score remains within normal thresholds. Transactions proceed.
The detection layer’s failure in both cases is the same: it read the surface signal without reading the infrastructure context.
This is how account takeover campaigns operate in practice. Attackers using residential proxy networks don’t look like attackers at the IP level — they look like normal residential traffic, distributed across geographies, with no obvious clustering. The signal that distinguishes them from legitimate users isn’t the IP itself. It’s the behavioral and infrastructure context underneath it: persistence patterns, IP stability, device density, and the range of locations tied to a single session sequence.
Without that context, the detection layer is left making a surface-level call. The attacker’s session and the legitimate user’s next login can look nearly identical. One proceeds. One gets stepped up or blocked. The wrong one, often enough to matter.
The Business Impact Is Hiding in Plain Sight
Industry data on false-positive costs are striking and largely absent from conversations in security operations centers.
- False declines cost retailers an estimated $443 billion globally per year — roughly nine times more than actual fraud losses. (Aite-Novarica, via Riskified)
- In 2023, false declines in the United States alone jeopardized a staggering $157 billion, leading to an estimated ultimate loss of $81 billion. (PYMNTS Intelligence)
- Most merchants report false positive rates of between 2% and 10% of total eCommerce orders. (Merchant Risk Council, 2024 Global eCommerce Payments & Fraud Report, via Fiserv)
- 41% of consumers globally say they’ll never shop with a brand after a false decline. (ClearSale, State of Consumer Attitudes on Ecommerce, Fraud & CX 2023–2024)
Four cost categories compound that headline number:
Customer friction. Legitimate users locked out, stepped up, or declined. They don’t file support tickets at any meaningful rate. They leave.
Conversion drag. Every friction event at checkout introduces abandonment risk. The transaction cost is immediate and visible. The relationship cost (the customer who decides not to come back) takes months to appear in retention data and is rarely attributed to the fraud layer.
Analyst load. Inconclusive automated decisions get routed to human review teams. At enterprise volume, manual review teams handle 1,000–5,000 orders per day. That is security talent doing low-value triage instead of higher-order threat analysis.
The attribution gap. The downstream revenue loss from false positives rarely surfaces in fraud reporting. When a customer doesn’t return, that churn registers in retention dashboards or product analytics — not in fraud operations. No one connects the revenue leak to the detection decision that caused it, which means no one is accountable for fixing it.
Taken together, this is the cumulative tax the security infrastructure levies on the organization’s customers, largely without anyone’s knowledge and with no one formally responsible for stopping it.
The Organizational Accountability Gap
In most enterprises, fraud teams are measured on fraud loss: detected incidents, chargeback rates, and dollar amounts prevented. They are not measured on approval rates, conversion rates, or customer friction caused.
The organizations that experience the cost of false positives are not the organizations that control the detection signals. The team generating the friction is not the team being measured on it. This is not a competence problem. It is a structural one.
The result is that the false positive problem remains invisible at the leadership level until it’s large enough to show up in revenue figures, at which point the conversation is often no longer an analytical one. Fraud and growth teams fighting over approval-rate thresholds are a symptom of this misalignment, not the cause. The cause is that no one formally owns the friction cost, and there is no organizational incentive to reduce it.
The Feedback Loop Problem
There is a longer-term consequence that technical executives will recognize but most business-level analysis ignores: fraud models that run on imprecise signals don’t just produce false positives. They degrade over time.
When legitimate users are incorrectly blocked or escalated, they don’t always retry through the same channel. They call support. They use a different device. They abandon and don’t return. This means the model never receives a corrected signal on what a good outcome looked like for that session. The feedback loop that should improve detection accuracy over time is broken at the source.
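One way to picture the broken loop is as a labelling gap. The sketch below assumes a deliberately simplified setup in which an outcome is only ever recorded for sessions that were approved; the `Session` record, its fields, and the sample data are hypothetical.

```python
from typing import NamedTuple, Optional


class Session(NamedTuple):
    session_id: str
    decision: str            # "approve" or "block"
    outcome: Optional[str]   # "good" or "fraud" once observed; None if never observed


def labelled_for_training(sessions: list[Session]) -> list[Session]:
    """Keep only sessions whose outcome was actually observed."""
    return [s for s in sessions if s.outcome is not None]


sessions = [
    Session("a", "approve", "good"),
    Session("b", "approve", "fraud"),
    Session("c", "block", None),  # legitimate user who gave up: no outcome recorded
    Session("d", "block", None),  # real fraud attempt: also no outcome recorded
]

labelled = labelled_for_training(sessions)
blocked_with_label = [s for s in labelled if s.decision == "block"]
print(len(labelled), len(blocked_with_label))  # 2 0
```

In this toy data, nothing the system blocked ever comes back with a label, so a wrongly blocked customer and a correctly blocked attacker are indistinguishable to whatever is retrained on it.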
Meanwhile, the fraud patterns the model was trained on evolve. When disrupted, attacker infrastructure doesn’t just disappear — it mutates, reappearing through new IP addresses, devices, and networks. The FBI’s takedown of the Volt Typhoon botnet illustrated this directly: the network rebuilt itself after the disruption rather than dissolving.
Legitimate traffic patterns shift as well. Without infrastructure-level context to anchor the signal, the model becomes progressively less able to distinguish good traffic from bad, not because the attackers got smarter, but because the training signal was compromised from the beginning.
This is a well-understood failure mode in automated fraud detection — and it applies whether the decisioning layer is model-driven, rules-based, or a hybrid of both. Its absence from most business-level discussions of false positives is a gap that any technical executive will notice.
Reducing Friction Without Reducing Protection
The answer is not less automation. It is better inputs fed into that automation.
The distinction between surface IP address signals and infrastructure-level intelligence matters here. Basic signals, such as location, a proxy flag, and a generic risk score, tell the detection layer what an IP address is. Infrastructure signals, like IP stability, device density, behavioral persistence, proxy architecture type, and provider intent signals, tell it what the traffic represents.
That distinction produces different decisions. Infrastructure context enables confident approvals on ambiguous-but-legitimate traffic: the corporate VPN user, the privacy-conscious consumer, the remote worker on a shared gateway. It also enables targeted, proportionate scrutiny on activity that actually warrants it: the credential-stuffing ring cycling through residential proxies, the account takeover campaign hiding behind clean-looking consumer IPs, and the bot network rotating identities at scale.
Digital Element’s approach to this problem is built around a specific data set: IP Characteristics (IPC), which maps the infrastructure context around an IP address rather than treating the address itself as the signal. Instead of asking ‘where is this IP?’ it asks ‘what does this IP’s behavior tell us about the traffic behind it?’ That produces four measurable dimensions:
- Activity (device density per IP)
- Location (geolocation consistency)
- Range (distance between observed locations over time)
- Persistence (how long an IP remains tied to a location)
Together, these dimensions can distinguish the Chicago corporate VPN from the residential proxy attack, even when both originate from the same metro area.
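To make the idea concrete, here is a minimal sketch of how four such dimensions might be read together. It is not Digital Element’s schema or API: the `IpContext` fields, the `read_context` function, the thresholds, and the two sample records are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class IpContext:
    """Hypothetical per-IP context; field names are illustrative only."""
    devices_seen: int          # Activity: distinct devices observed on the IP
    location_consistent: bool  # Location: geolocation agrees across observations
    range_km: float            # Range: distance between observed locations over time
    persistence_days: int      # Persistence: days the IP has stayed tied to one location


def read_context(ctx: IpContext) -> str:
    """Return a coarse reading of the traffic; thresholds are made up."""
    stable = ctx.location_consistent and ctx.range_km < 50 and ctx.persistence_days >= 90
    if stable and ctx.devices_seen >= 100:
        return "shared-gateway"  # many devices, one stable place, e.g. a corporate VPN
    if not stable and ctx.persistence_days < 7:
        return "rotating"        # short-lived and wide-ranging: warrants closer scrutiny
    return "inconclusive"        # leave the call to other signals


corporate_vpn = IpContext(devices_seen=4_000, location_consistent=True,
                          range_km=5.0, persistence_days=400)
proxy_exit = IpContext(devices_seen=3, location_consistent=False,
                       range_km=900.0, persistence_days=1)
print(read_context(corporate_vpn), read_context(proxy_exit))  # shared-gateway rotating
```

The point of the sketch is the shape of the decision rather than the numbers: a busy but stable IP and a quiet but restless one get different readings, where a proxy flag alone would treat them the same way or get them backwards.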
The business outcome of better inputs is not just fewer bad decisions. It is fewer manual reviews, lower analyst load, and a fraud layer that stops levying an invisible tax on the customers it is supposed to help protect.
The Takeaway
The false positive problem is a data problem. As attacker infrastructure grows indistinguishable from legitimate consumer traffic, and automated systems scale decisions to machine speed, organizations running on basic IP signals are not just accepting higher false positive rates. They are systematically handing their own customers, and the revenue that comes with them, to competitors, one friction event at a time, at volumes that don’t appear in any incident report.
The path forward is not recalibrating the model. It is re-examining what the model is working with.
The most effective fraud defenses don’t just detect risk. They understand it.