Critical Copilot Flaw Let Attackers Steal 2FA Codes via SearchLeak

A newly disclosed vulnerability in Microsoft Copilot, dubbed SearchLeak, allowed attackers to exfiltrate two-factor authentication codes from users — the latest example of LLM-integrated products failing to contain adversarial inputs at the security boundary.

Original source

Security researchers have disclosed a critical vulnerability in Microsoft Copilot that enabled attackers to steal one-time 2FA codes directly from users' inboxes and sessions. The exploit, called SearchLeak, leveraged prompt injection techniques to manipulate Copilot into surfacing sensitive authentication data and transmitting it to an attacker-controlled endpoint — without the user taking any action beyond interacting with a malicious document or email within the Copilot-connected ecosystem.

The attack chain is technically straightforward in hindsight: Copilot's deep integration with Microsoft 365 means it has read access to emails, calendars, and documents. A carefully crafted prompt embedded in a malicious email could instruct Copilot to search for 2FA codes in the inbox and silently exfiltrate them. Microsoft has since patched the vulnerability, but the disclosure timeline and the specifics of the fix have not been made fully public as of this writing.

This is not an isolated incident. Prompt injection vulnerabilities have been demonstrated in ChatGPT plugins, Google Gemini integrations, and a parade of enterprise copilots since 2023. The pattern is consistent: as LLMs gain access to more sensitive data sources and more actuating power over user environments, the attack surface expands faster than the security mitigations do. SearchLeak is notable primarily because 2FA codes represent a direct bypass of a core authentication security layer — not a data leak, but an account takeover vector.

The broader structural problem is that the industry has largely treated LLM security as a prompt-filtering problem, when it is fundamentally an access-control and trust-boundary problem. Filtering adversarial inputs at the prompt layer is a losing game when the underlying system has been granted ambient access to credential-adjacent data. Until LLM integrations are designed around least-privilege data access and explicit user-consent gates for sensitive retrieval, vulnerabilities like SearchLeak will keep appearing under different names.

Panel Takes

The Builder

Developer Perspective

“The core failure here isn't a prompt injection edge case — it's that Copilot's data access model was designed for capability, not least privilege. When your LLM integration has ambient read access to the entire inbox and no scoped permission model for sensitive retrieval, you've wired a SQL injection surface directly to your authentication layer. The fix isn't better input sanitization; it's a proper capability-scoped access API that requires explicit grants before the model can touch anything credential-adjacent.”

The Skeptic

Reality Check

“Every enterprise copilot launch in the last two years has been accompanied by assurances about security guardrails, and every six months a SearchLeak-style disclosure proves those guardrails are decorative. The uncomfortable prediction: this pattern continues until a major breach causes regulatory consequences, because right now the incentive is to ship capability and patch vulnerabilities reactively — there's no market penalty for being the fifth company to have the same prompt injection problem. Microsoft will patch this, update a compliance doc, and ship the next risky integration on the same architecture.”

The Futurist

Big Picture

“The thesis that LLMs could be safely integrated as ambient agents across enterprise data — with access to email, calendar, documents, and auth flows — required that the security model would evolve as fast as the capability model. That dependency is visibly failing. The second-order effect of SearchLeak isn't just regulatory heat on Microsoft; it's that every enterprise security team now has a concrete case study to slow-roll copilot deployments, which compresses the adoption curve that the entire ambient-agent market is betting on.”

The Founder

Business & Market

“The business risk here is asymmetric in a way Microsoft can absorb but startups cannot: enterprise buyers now have a named CVE to cite when their security team blocks a copilot integration, and that objection will outlive the patch by 18 months in procurement cycles. Any startup building an LLM product with ambient access to user data needs to treat least-privilege architecture as a sales asset, not an engineering backlog item — because SearchLeak just handed your enterprise champion's IT department a veto they'll use.”

Panel Takes

Bookmarks