Episode 12

AI-generated documents: should we be concerned?

A podcast about fake receipts, autoregressive image models, and just how much of a threat AI-generated documents are.

In this episode of Good Question, we explore the growing risk of AI-generated documents with Ronan Burke, CEO and co-founder of Inscribe, and Martina Pugliese, Senior Research Scientist at Inscribe. 

Hosted by Brianna Valleskey, the episode breaks down the technical mechanics of how generative image models create convincing fake documents, as well as what risk teams can do about it.

From synthetic receipts to near-realistic invoices, the group explores what’s real, what’s hype, and what’s coming next in the arms race between generative AI and fraud detection.

A new class of document fraud

Recent updates to ChatGPT’s image generation capabilities have introduced powerful new tools for fraudsters. Using an autoregressive image model (as opposed to diffusion-based approaches), these systems can now generate entire documents from scratch (pixel by pixel) with highly realistic formatting and texture.

Martina and Ronan unpack how these models work, what they’re good at (like restaurant receipts and invoices), and where they still fall short (like IDs and complex bank statements). While today’s fakes are still flawed, they’re improving fast enough to pass through automated workflows without triggering suspicion.

“The assumptions that a photo equals proof are no longer valid. That’s why this new wave of synthetic media is so important to address,” Ronan said.

What makes these fakes detectable?

Despite their visual appeal, AI-generated documents often contain subtle but revealing flaws:

  • Incorrect payment methods (e.g., “VSA” instead of “VISA”)

  • Mismatched regions or currency formatting (e.g., U.S.-style decimal points in euro pricing)

  • Nonexistent store locations or business addresses

  • Unnatural lighting, text rendering, or background hues

  • Inconsistent date ranges, semantic errors, or fake URLs

Martina explains how these anomalies leave behind a forensic trail, one that Inscribe’s detectors can catch. She also shares how the team builds synthetic datasets to stress-test detection models and train new classifiers that specifically target AI-generated content.
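To make two of the flaws above concrete, here is a minimal Python sketch of rule-based checks for a misspelled payment method and for U.S.-style decimals in a euro price. It is an illustration only, not Inscribe's detector: the function names, the short list of payment methods, and the similarity cutoff are all assumptions. The currency check also assumes the document claims a locale that writes euro amounts with a decimal comma (Germany, for example). Ireland writes them with a decimal point, so a real check would need locale context.

```python
import difflib
import re

# Hypothetical reference list; a real system would use a far larger vocabulary.
KNOWN_PAYMENT_METHODS = ["VISA", "MASTERCARD", "AMEX", "DISCOVER", "CASH"]

# Assumed number formats: "1,234.56" (U.S. style) and "1.234,56" (decimal comma).
US_STYLE = re.compile(r"^\d{1,3}(,\d{3})*\.\d{2}$")
COMMA_STYLE = re.compile(r"^\d{1,3}(\.\d{3})*,\d{2}$")


def check_payment_method(label):
    """Return a warning if the label is a near miss of a known method, else None."""
    token = label.strip().upper()
    if token in KNOWN_PAYMENT_METHODS:
        return None
    # The 0.8 cutoff is an arbitrary illustrative threshold.
    close = difflib.get_close_matches(token, KNOWN_PAYMENT_METHODS, n=1, cutoff=0.8)
    if close:
        return f"'{label}' looks like a misspelling of '{close[0]}'"
    return None


def check_euro_amount(amount):
    """Return a warning if a euro amount does not use a decimal comma, else None."""
    text = amount.replace("€", "").replace("EUR", "").strip()
    if COMMA_STYLE.match(text):
        return None
    if US_STYLE.match(text):
        return f"'{amount}' uses U.S.-style decimals in a euro price"
    return f"'{amount}' has an unrecognized number format"


if __name__ == "__main__":
    print(check_payment_method("VSA"))
    print(check_euro_amount("€12.50"))
```

Rules like these are brittle on their own. They are best read as examples of the kind of signal a detector might combine with the visual and semantic checks described in the episode.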

Forensic systems meet generative AI

At Inscribe, AI-generated documents aren’t just a new threat. They’re a new opportunity for innovation in fraud detection. Here’s how our systems are evolving to meet the moment:

  • Forensic detectors analyze pixel-level artifacts, document layering, and signature image attributes

  • Semantic LLMs validate logical consistency, detect out-of-place words, and assess vocabulary plausibility

  • Network-level analysis cross-references document traits against millions of records in our ecosystem

  • Metadata validation confirms timestamps, file origins, and expected structure

  • Synthetic dataset training helps continuously improve our classifiers against evolving generative techniques

What’s next for document trust and fraud detection?

The episode concludes with a forward-looking discussion on where the industry is headed:

  • Cryptographic proof of authenticity may become standard (e.g., digital signatures or certificates from trusted issuers)

  • AI agents will play a growing role in verifying trust, especially as documents become harder to assess with the naked eye

  • Human fraud analysts will partner with AI systems to focus on edge cases, while agents handle routine detection and research

Ronan shares why he’s optimistic: not because the risk isn’t real, but because the tools to address it are evolving just as fast.

“This is a new class of risk, but it’s a solvable one,” he said. “With the right systems, AI-generated fraud can be detected, flagged, and stopped before it causes harm.”
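As a rough illustration of the cryptographic proof of authenticity mentioned in the list above, the Python sketch below tags a document's bytes and later verifies that they have not changed. The episode does not name a specific scheme, so everything here is an assumption. In particular, this uses an HMAC with a shared secret key from the Python standard library as a stand-in. A real issuer would more likely use an asymmetric digital signature with a certificate, so that verifiers never hold the signing key. The function names and the example key are hypothetical.

```python
import hashlib
import hmac


def issue_tag(document_bytes, issuer_key):
    """Compute an authenticity tag over the exact bytes of a document."""
    return hmac.new(issuer_key, document_bytes, hashlib.sha256).hexdigest()


def verify_tag(document_bytes, issuer_key, tag):
    """Return True only if the tag matches the document bytes under this key."""
    expected = issue_tag(document_bytes, issuer_key)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    key = b"hypothetical-issuer-secret"
    original = b"Invoice 0001: total 1127.74"
    tag = issue_tag(original, key)

    print(verify_tag(original, key, tag))                        # unchanged document
    print(verify_tag(b"Invoice 0001: total 9127.74", key, tag))  # altered document
```

The point of the sketch is that verification depends on the issuer's key rather than on how the document looks, which is why this approach is unaffected by how realistic a generated image is.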

AI-generated document examples

The episode includes a walkthrough of three fake documents generated by Inscribe’s internal team:

  • A synthetic Apple Store receipt that visually mimics a real purchase but contains subtle red flags in math, location, and formatting

  • A highly polished invoice that appears credible but lacks environmental realism and carries suspiciously uniform formatting

  • A bank statement with altered names, mismatched dates, and fabricated locations, showing how LLMs can be used to synthesize realistic transaction lists

Each document was successfully flagged by Inscribe’s internal AI-generated content detectors.
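The "red flags in math" on the receipt example can be illustrated with a simple arithmetic consistency check. The Python sketch below recomputes the subtotal, tax, and total from the line items and reports any mismatch with the printed figures. It is not the detector used in the episode. The function name, the input shape, the sample numbers, and the rounding rule (tax computed on the subtotal and rounded half up to the cent) are assumptions, and real receipts may round tax per line item instead.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def receipt_math_flags(line_items, tax_rate, stated_tax, stated_total):
    """Return a list of warnings where printed figures disagree with the line items.

    line_items: list of (quantity, unit_price) with prices given as strings.
    tax_rate, stated_tax, stated_total: strings, to avoid float rounding issues.
    """
    subtotal = sum(
        (Decimal(qty) * Decimal(price) for qty, price in line_items),
        Decimal("0"),
    )
    expected_tax = (subtotal * Decimal(tax_rate)).quantize(CENT, rounding=ROUND_HALF_UP)
    expected_total = subtotal + expected_tax

    flags = []
    if Decimal(stated_tax) != expected_tax:
        flags.append(f"tax is {stated_tax}, expected {expected_tax}")
    if Decimal(stated_total) != expected_total:
        flags.append(f"total is {stated_total}, expected {expected_total}")
    return flags


if __name__ == "__main__":
    items = [(1, "999.00"), (2, "19.00")]  # made-up sample line items
    print(receipt_math_flags(items, "0.0875", "90.74", "1127.74"))
    print(receipt_math_flags(items, "0.0875", "90.74", "1127.47"))
```

An empty list means the printed figures agree with the line items under the assumed rounding rule. It does not mean the document is genuine.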

About the guests

Ronan Burke is the co-founder and CEO of Inscribe. He founded Inscribe with his twin after they experienced the challenges of manual review operations and over-burdened risk teams at national banks and fast-growing fintechs. They set out to alleviate those challenges by deploying safe, scalable, and reliable AI.

Martina Pugliese, PhD, is a Senior Research Scientist at Inscribe AI, where she applies her expertise in computer vision, artificial intelligence, and software engineering to solve complex problems. Martina is an experienced data scientist with a strong foundation in physics research, specializing in data analysis, statistical modeling, and machine learning. She is also deeply involved in the tech community, mentoring data professionals, creating data visualizations, and organizing events. Driven by curiosity and a passion for learning, she is committed to knowledge sharing and continuous growth.

Brianna Valleskey is the head of marketing at Inscribe AI. While her career started in journalism, she has spent more than a decade working on SaaS revenue teams. She is passionate about enabling fraud fighters and risk leaders to unlock the enormous potential of AI.
