SafetyKit builds multimodal AI agents to help marketplaces, payment platforms, and fintechs detect and act on fraud and prohibited activity across text, images, financial transactions, product listings, and more. Recent breakthroughs in model reasoning and multimodal understanding now make this work more effective, setting a new bar for risk, compliance, and safety operations.
SafetyKit’s agents leverage GPT‑5, GPT‑4.1, deep research, and Computer Using Agent (CUA) to review 100% of customer content with over 95% accuracy, based on SafetyKit’s evals. They help platforms protect users, prevent fraud, avoid regulatory fines, and enforce complex policies that legacy systems may miss, such as region-specific rules, embedded phone numbers in scam images, or explicit content. Automation also protects human moderators from exposure to offensive material and frees them to handle nuanced policy decisions.
David Graunke, Founder and CEO of SafetyKit
Design agents for what the task demands, then choose the right model
SafetyKit’s agents are each built to handle a specific risk category, from scams to illegal products. Every piece of content is routed to the agent best suited for that violation, using the optimal OpenAI model:
- GPT‑5 applies multimodal reasoning across text, images, and UI to surface hidden risks and support layered, precise decision-making
- GPT‑4.1 reliably follows detailed content-policy instructions and efficiently manages high-volume moderation workflows
- Reinforcement fine-tuning (RFT) boosts recall and precision beyond default models, achieving frontier performance with complex safety policies
- Deep research integrates real-time online investigation into merchant reviews and verifications
- Computer Using Agent (CUA) automates complex policy tasks, reducing reliance on costly manual reviews
This model-matching approach lets SafetyKit scale content review across modalities with more nuance and accuracy than legacy solutions can.
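As a rough illustration of what model-matched routing can look like, here is a minimal sketch. The agent names, risk categories, and model assignments are hypothetical, chosen to mirror the examples in this article, not SafetyKit’s actual code:

```python
# Hypothetical sketch of model-matched routing: each risk category maps to
# an agent configured with the OpenAI model best suited to its workload.
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    model: str  # OpenAI model the agent calls
    description: str

# Illustrative registry; real systems would cover many more categories.
AGENTS = {
    "scam": Agent("ScamDetection", "gpt-4.1", "Scans text and images for scam signals"),
    "disclosure": Agent("PolicyDisclosure", "gpt-5", "Checks listings for required disclaimers"),
    "merchant_review": Agent("MerchantReview", "o3-deep-research", "Investigates merchants online"),
}

def route(content: dict) -> Agent:
    """Send each piece of content to the agent best suited for its risk category."""
    return AGENTS.get(content["risk_category"], AGENTS["scam"])

listing = {"risk_category": "disclosure", "text": "Herbal supplement, cures colds fast!"}
agent = route(listing)
print(f"Routing to {agent.name} using {agent.model}")
```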
The Scam Detection agent, for example, goes beyond scanning text. It also analyzes visuals, like QR codes or phone numbers embedded in product images. GPT‑4.1 helps it parse the image, understand the layout, and decide whether the content violates policy.
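A minimal sketch of this kind of image check, using the OpenAI Python SDK’s multimodal chat input; the prompt and JSON shape are assumptions for illustration, not SafetyKit’s implementation:

```python
# Illustrative sketch: ask a multimodal model whether a product image embeds
# a QR code or phone number that routes buyers off-platform (a common scam pattern).
import json
from openai import OpenAI

client = OpenAI()

def scan_product_image(image_url: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4.1",
        response_format={"type": "json_object"},
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": (
                    "Inspect this product image. Report as JSON with keys "
                    "'embedded_contact' (bool) and 'details' (str): does it contain "
                    "a QR code, phone number, or other off-platform contact info?"
                )},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    return json.loads(response.choices[0].message.content)

print(scan_product_image("https://example.com/listing-photo.jpg"))
```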
The Policy Disclosure agent checks listings or landing pages for required language, such as legal disclaimers or region-specific compliance warnings. GPT‑4.1 extracts relevant sections, GPT‑5 evaluates compliance, and the agent flags violations.
“We think of our agents as purpose-built workflows,” says Graunke. “Some tasks require deep reasoning, others need multimodal context. OpenAI is the only stack that delivers reliable performance across both.”
Policy decisions often hinge on subtle distinctions. Take a marketplace requiring sellers to include a disclaimer for wellness products, with requirements varying based on product claims and regional rules. Legacy providers rely on keyword triggers or rigid rulesets, which miss the deeper judgment calls these decisions require, leading to missed or incorrect enforcement.
SafetyKit’s Policy Disclosure agent first references policies from SafetyKit’s internal library; then GPT‑5 evaluates the content: does it mention treatment or prevention? Is it being sold in a region where disclosure is mandatory? And if so, is the required language actually included in the listing? If anything falls short, GPT‑5 returns a structured output the agent uses to flag the issue.
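One way to implement that structured verdict is with the SDK’s structured-output parsing helper. This is a sketch under assumptions: the policy text, field names, and `DisclosureVerdict` schema are hypothetical stand-ins for whatever SafetyKit actually uses:

```python
# Illustrative sketch: GPT-5 answers the three compliance questions above and
# returns a typed verdict the agent can act on. Schema and policy are hypothetical.
from openai import OpenAI
from pydantic import BaseModel

class DisclosureVerdict(BaseModel):
    makes_health_claim: bool   # does the listing mention treatment or prevention?
    disclosure_required: bool  # is disclosure mandatory in the buyer's region?
    disclaimer_present: bool   # is the required language actually in the listing?
    violation: bool
    rationale: str

client = OpenAI()

def check_disclosure(listing_text: str, region: str, policy: str) -> DisclosureVerdict:
    completion = client.beta.chat.completions.parse(
        model="gpt-5",
        messages=[
            {"role": "system", "content": f"You enforce this disclosure policy:\n{policy}"},
            {"role": "user", "content": f"Region: {region}\nListing:\n{listing_text}"},
        ],
        response_format=DisclosureVerdict,
    )
    return completion.choices[0].message.parsed

verdict = check_disclosure(
    "Herbal tea that prevents flu.", "EU",
    "Wellness products making treatment or prevention claims must include a medical disclaimer.",
)
if verdict.violation:
    print("Flag listing:", verdict.rationale)
```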
“The power of GPT‑5 is in how precisely it can reason when grounded in real policy,” notes Graunke. “It lets us make accurate, defensible decisions even in the edge cases where other systems fail.”
Turn every model release into a product win
SafetyKit benchmarks each new OpenAI model against its hardest cases, often deploying top performers the same day. Rigorous internal evaluations allow the team to quickly identify how new models can improve performance and seamlessly integrate into their core infrastructure.
When OpenAI o3 launched, SafetyKit used it to boost edge case performance across key policy areas. GPT‑5 followed, and within days, it was deployed across their most demanding agents, improving benchmark scores by more than 10 points on their toughest vision tasks.
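Same-day model benchmarking can be surprisingly lightweight. Below is a hypothetical sketch of the pattern: replay a golden set of hard cases against a candidate model and compare accuracy with the incumbent. The dataset format, file name, and promotion rule are assumptions, not SafetyKit’s eval harness:

```python
# Hypothetical sketch: score a candidate model against an incumbent on a
# golden set of hard moderation cases, one JSON case per line.
import json
from openai import OpenAI

client = OpenAI()

def run_eval(model: str, golden_path: str) -> float:
    # Each line: {"prompt": "...", "label": "violation" or "ok"}
    cases = [json.loads(line) for line in open(golden_path)]
    correct = 0
    for case in cases:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": case["prompt"] +
                       "\nAnswer with exactly one word: violation or ok."}],
        )
        correct += resp.choices[0].message.content.strip().lower() == case["label"]
    return correct / len(cases)

baseline = run_eval("gpt-4.1", "hard_cases.jsonl")
candidate = run_eval("gpt-5", "hard_cases.jsonl")
print(f"baseline={baseline:.2%} candidate={candidate:.2%}")
if candidate > baseline:
    print("Promote candidate to production agents")
```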
“OpenAI moves fast, and we’ve designed our system to keep up. Every new release gives us an operational edge, unlocking new capabilities and domains we couldn’t support before, and increasing the coverage and accuracy we deliver to customers,” says Graunke.
SafetyKit also feeds improvements back into the ecosystem, sharing eval results, edge case failures, and policy-specific insights directly with OpenAI to help shape future model performance for safety-critical workloads.
Scale customer and volume growth with the best OpenAI stack
SafetyKit’s architecture enforces policy at scale, delivering speed, precision, and comprehensive risk coverage. Behind the scenes, it now handles over 16 billion tokens daily, up from 200 million six months ago, an eightyfold increase, analyzing more content without sacrificing accuracy.
In that same period, SafetyKit has expanded into payments risk, fraud, anti-child-exploitation, and anti-money-laundering work, and signed new customers that bring hundreds of millions of end users under SafetyKit’s protection. This foundation empowers customers to respond swiftly and confidently to emerging risks.
“We’ve created a loop where every OpenAI release directly strengthens our capabilities,” says Graunke. “That’s why the system continually improves, always staying ahead of evolving risks.”

