Google’s threat hunters say they’ve intercepted the first zero-day exploit they can confidently tie to AI assistance, blocking what was shaping up to be a mass exploitation campaign against a widely used admin tool. The Verge AI reports that Google Threat Intelligence Group (GTIG) flagged the operation after spotting telltale signs of large language model involvement inside the attackers’ Python payload. This is the moment researchers have been warning about for two years, and it just landed.
What Google Found
According to The Verge AI, GTIG attributed the operation to “prominent cyber crime threat actors” who were preparing to bypass two-factor authentication on an unnamed open-source, web-based system administration tool. The exploit targeted a high-level semantic logic flaw, specifically a hardcoded trust assumption inside the platform’s 2FA flow. That’s not a buffer overflow or a missing input check. That’s a design-level mistake that requires real reasoning to spot and weaponize.
The AI fingerprints inside the script were the giveaway:
- A “hallucinated” CVSS score baked into the code
- “Structured, textbook” formatting consistent with LLM training data
- Comments and structure that read like model output rather than handwritten attacker notes
Google’s researchers were careful to note they “do not believe Gemini was used” to build the exploit. They didn’t name which model the attackers leaned on.
Why This Matters
For most of the past year, the conversation about offensive AI has been theoretical. Could a model find a real bug? Could it write a working exploit? Could it chain steps together without a human hand-holding it? GTIG’s report moves that conversation out of the theoretical. The answer, at least for one operation that got caught, is yes.
This lands on the heels of weeks of public debate over cybersecurity-focused models like Anthropic’s Mythos, plus a recently disclosed Linux vulnerability that AI helped uncover. Defenders have been using AI to find bugs faster. Now attackers are doing the same thing, and they’re doing it at scale.
How the Attackers Are Using AI
GTIG’s report, as detailed in The Verge AI, lays out a few specific tradecraft patterns worth watching:
- Persona-driven jailbreaking. Attackers prompt models to roleplay as security experts, which loosens safety guardrails and gets the model to produce vulnerability research it would otherwise refuse.
- Whole-repo feeding. Hackers are dumping entire vulnerability databases and codebases into model context, then asking for exploitation paths.
- OpenClaw experimentation. Google says it’s seeing signs of attackers refining AI-generated payloads in controlled environments before deploying them, which means they’re treating exploit development like a tuning problem.
There’s also a second front. GTIG observed adversaries increasingly targeting the integrated components that give AI systems their utility, including autonomous skills and third-party data connectors. The AI stack itself is now the target, not just the tool.
What Practitioners Should Take Away
This is the first publicly confirmed AI-assisted zero-day operation Google has caught. It won’t be the last. A few things to expect:
- Patching windows will shrink. If attackers can ask a model to scan for hardcoded trust assumptions across an open-source project, the time between disclosure and weaponization collapses.
- Logic flaws become the new low-hanging fruit. Memory bugs are mostly handled by modern toolchains. Semantic flaws like this 2FA bypass are exactly what LLMs are good at surfacing.
- AI infrastructure needs a threat model. If your product ships agent skills or data connectors, treat them like attack surface, because someone already is.
Google says it disrupted this specific exploit chain before it went live. The bigger story is the playbook those attackers were running, and the fact that it worked well enough to nearly ship. Defenders who haven’t started building AI into their detection pipelines are now playing from behind.
Full details in the original report from The Verge AI.