Anthropic says DeepSeek, Moonshot, and MiniMax used 24,000 fake accounts to rip off Claude
Anthropic dropped a bombshell on the artificial intelligence industry Monday, publicly accusing three prominent Chinese AI laboratories — DeepSeek, Moonshot AI, and MiniMax — of running coordinated, industrial-scale campaigns to siphon capabilities from its Claude models using tens of thousands of fraudulent accounts.
The San Francisco-based company said the three labs collectively generated more than 16 million exchanges with Claude through approximately 24,000 fake accounts, all in violation of Anthropic's terms of service and regional access restrictions. The campaigns, Anthropic said, are the most concrete and detailed public evidence to date of a practice that has haunted Silicon Valley for months: foreign competitors systematically using a technique called distillation to leapfrog years of research and billions of dollars in investment.
"These campaigns are growing in intensity and sophistication," Anthropic wrote in a technical blog post published Monday. "The window to act is narrow, and the threat extends beyond any single company or region. Addressing it will require rapid, coordinated action among industry players, policymakers, and the global AI community."
The disclosure marks a dramatic escalation in the simmering tensions between American and Chinese AI developers — and it arrives at a moment when Washington is actively debating whether to tighten or loosen export controls on the advanced chips that power AI training. Anthropic, led by CEO Dario Amodei, has been among the most vocal advocates for restricting chip sales to China, and the company explicitly connected Monday's revelations to that policy fight.
How AI distillation went from obscure research technique to geopolitical flashpoint
To understand what Anthropic alleges, it helps to understand what distillation actually is — and how it evolved from an academic curiosity into the most contentious issue in the global AI race.
At its core, distillation is a process of extracting knowledge from a larger, more powerful AI model — the "teacher" — to create a smaller, more efficient one — the "student." The student model learns not from raw data, but from the teacher's outputs: its answers, reasoning patterns, and behaviors. Done correctly, the student can achieve performance remarkably close to the teacher's while requiring a fraction of the compute to train.
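The teacher-student mechanic can be sketched in miniature. In this toy illustration (no real model or API is involved; a plain linear function stands in for the "teacher"), the student is fit purely to the teacher's input/output behavior, never to the teacher's original training data:

```python
# Toy sketch of distillation: the "student" learns only from the
# teacher's outputs, not from the data the teacher was trained on.

def teacher(x: float) -> float:
    # Stand-in for a frontier model's learned behavior.
    return 3.0 * x + 1.0

# Step 1: query the teacher at many inputs (analogous to API prompts).
prompts = [i / 10 for i in range(-50, 51)]
pairs = [(x, teacher(x)) for x in prompts]

# Step 2: fit a small "student" to those outputs (ordinary least squares).
n = len(pairs)
mean_x = sum(x for x, _ in pairs) / n
mean_y = sum(y for _, y in pairs) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / sum(
    (x - mean_x) ** 2 for x, _ in pairs
)
intercept = mean_y - slope * mean_x

def student(x: float) -> float:
    return slope * x + intercept

# The student now mimics the teacher on inputs it never queried.
print(round(student(10.0), 6))  # matches teacher(10.0) == 31.0
```

Real distillation fits a neural network to millions of text outputs rather than a line to numbers, but the economics the example hints at are the same: the expensive part was learning the teacher's function in the first place, and copying its behavior is comparatively cheap.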
As Anthropic itself acknowledged, distillation is "a widely used and legitimate training method." Frontier AI labs, including Anthropic, routinely distill their own models to create smaller, cheaper versions for customers. But the same technique can be weaponized. A competitor can pose as a legitimate customer, bombard a frontier model with carefully crafted prompts, collect the outputs, and use those outputs to train a rival system — capturing capabilities that took years and hundreds of millions of dollars to develop.
The technique burst into public consciousness in January 2025 when DeepSeek released its R1 reasoning model, which appeared to match or approach the performance of leading American models at dramatically lower cost. Databricks CEO Ali Ghodsi captured the industry's anxiety at the time, telling CNBC: "This distillation technique is just so extremely powerful and so extremely cheap, and it's just available to anyone." He predicted the technique would usher in an era of intense competition for large language models.
That prediction proved prescient. In the weeks following DeepSeek's release, researchers at UC Berkeley said they recreated OpenAI's reasoning model for just $450 in 19 hours. Researchers at Stanford and the University of Washington followed with their own version built in 26 minutes for under $50 in compute credits. The startup Hugging Face replicated OpenAI's Deep Research feature as a 24-hour coding challenge. DeepSeek itself openly released a family of distilled models on Hugging Face — including versions built on top of Qwen and Llama architectures — under the permissive MIT license, with the model card explicitly stating that the DeepSeek-R1 series supports commercial use and allows for any modifications and derivative works, "including, but not limited to, distillation for training other LLMs."
But what Anthropic described Monday goes far beyond academic replication or open-source experimentation. The company detailed what it characterized as deliberate, covert, and large-scale intellectual property extraction by well-resourced commercial laboratories operating under the jurisdiction of the Chinese government.
Anthropic traces 16 million fraudulent exchanges to researchers at DeepSeek, Moonshot, and MiniMax
Anthropic attributed each campaign "with high confidence" through IP address correlation, request metadata, infrastructure indicators, and corroboration from unnamed industry partners who observed the same actors on their own platforms. Each campaign specifically targeted what Anthropic described as Claude's most differentiated capabilities: agentic reasoning, tool use, and coding.
DeepSeek, the company that ignited the distillation debate, conducted what Anthropic described as the most technically sophisticated of the three operations, generating over 150,000 exchanges with Claude. Anthropic said DeepSeek's prompts targeted reasoning capabilities, rubric-based grading tasks designed to make Claude function as a reward model for reinforcement learning, and — in a detail likely to draw particular political attention — the creation of "censorship-safe alternatives to policy sensitive queries."
Anthropic alleged that DeepSeek "generated synchronized traffic across accounts" with "identical patterns, shared payment methods, and coordinated timing" that suggested load balancing to maximize throughput while evading detection. In one particularly notable technique, Anthropic said DeepSeek's prompts "asked Claude to imagine and articulate the internal reasoning behind a completed response and write it out step by step — effectively generating chain-of-thought training data at scale." The company also alleged it observed tasks in which Claude was used to generate alternatives to politically sensitive queries about "dissidents, party leaders, or authoritarianism," likely to train DeepSeek's own models to steer conversations away from censored topics. Anthropic said it was able to trace these accounts to specific researchers at the lab.
Moonshot AI, the Beijing-based creator of the Kimi models, ran the second-largest operation by volume at over 3.4 million exchanges. Anthropic said Moonshot targeted agentic reasoning and tool use, coding and data analysis, computer-use agent development, and computer vision. Moonshot employed "hundreds of fraudulent accounts spanning multiple access pathways," making the campaign harder to detect as a coordinated operation. Anthropic attributed the campaign through request metadata that "matched the public profiles of senior Moonshot staff." In a later phase, Anthropic said, Moonshot adopted a more targeted approach, "attempting to extract and reconstruct Claude's reasoning traces."
MiniMax, the least publicly known of the three but the most prolific by volume, generated over 13 million exchanges — more than three-quarters of the total. Anthropic said MiniMax's campaign focused on agentic coding, tool use, and orchestration. Anthropic said it detected MiniMax's campaign while it was still active, "before MiniMax released the model it was training," giving Anthropic "unprecedented visibility into the life cycle of distillation attacks, from data generation through to model launch." In a detail that underscores the urgency and opportunism Anthropic alleges, the company said that when it released a new model during MiniMax's active campaign, MiniMax "pivoted within 24 hours, redirecting nearly half their traffic to capture capabilities from our latest system."
How proxy networks and 'hydra cluster' architectures helped Chinese labs bypass Anthropic's China ban
Anthropic does not currently offer commercial access to Claude in China, a policy it maintains for national security reasons. So how did these labs access the models at all?
The answer, Anthropic said, lies in commercial proxy services that resell access to Claude and other frontier AI models at scale. Anthropic described these services as running what it calls "hydra cluster" architectures — sprawling networks of fraudulent accounts that distribute traffic across Anthropic's API and third-party cloud platforms. "The breadth of these networks means that there are no single points of failure," Anthropic wrote. "When one account is banned, a new one takes its place." In one case, Anthropic said, a single proxy network managed more than 20,000 fraudulent accounts simultaneously, mixing distillation traffic with unrelated customer requests to make detection harder.
The description suggests a mature and well-resourced infrastructure ecosystem dedicated to circumventing access controls — one that may serve many more clients than just the three labs Anthropic named.
Why Anthropic framed distillation as a national security crisis, not just an IP dispute
Anthropic did not treat this as a mere terms-of-service violation. The company embedded its technical disclosure within an explicit national security argument, warning that "illicitly distilled models lack necessary safeguards, creating significant national security risks."
The company argued that models built through illicit distillation are "unlikely to retain" the safety guardrails that American companies build into their systems — protections designed to prevent AI from being used to develop bioweapons, carry out cyberattacks, or enable mass surveillance. "Foreign labs that distill American models can then feed these unprotected capabilities into military, intelligence, and surveillance systems," Anthropic wrote, "enabling authoritarian governments to deploy frontier AI for offensive cyber operations, disinformation campaigns, and mass surveillance."
This framing directly connects to the chip export control debate that Amodei has made a centerpiece of his public advocacy. In a detailed essay published in January 2025, Amodei argued that export controls are "the most important determinant of whether we end up in a unipolar or bipolar world" — a world where either only the U.S. and its allies possess the most powerful AI, or one where China achieves parity. He specifically noted at the time that he was "not taking any position on reports of distillation from Western models" and would "just take DeepSeek at their word that they trained it the way they said in the paper."
Monday's disclosure is a sharp departure from that earlier restraint. Anthropic now argues that distillation attacks "undermine" export controls "by allowing foreign labs, including those subject to the control of the Chinese Communist Party, to close the competitive advantage that export controls are designed to preserve through other means." The company went further, asserting that "without visibility into these attacks, the apparently rapid advancements made by these labs are incorrectly taken as evidence that export controls are ineffective." In other words, Anthropic is arguing that what some observers interpreted as proof that Chinese labs can innovate around chip restrictions was actually, in significant part, the result of stealing American capabilities.
The murky legal landscape around AI distillation may explain Anthropic's political strategy
Anthropic's decision to frame this as a national security issue rather than a legal dispute may reflect the difficult reality that intellectual property law offers limited recourse against distillation.
As a March 2025 analysis by the law firm Winston & Strawn noted, "the legal landscape surrounding AI distillation is unclear and evolving." The firm's attorneys observed that proving a copyright claim in this context would be challenging, since it remains unclear whether the outputs of AI models qualify as copyrightable creative expression. The U.S. Copyright Office affirmed in January 2025 that copyright protection requires human authorship, and that "mere provision of prompts does not render the outputs copyrightable."
The legal picture is further complicated by the way frontier labs structure output ownership. OpenAI's terms of use, for instance, assign ownership of model outputs to the user — meaning that even if a company can prove extraction occurred, it may not hold copyrights over the extracted data. Winston & Strawn noted that this dynamic means "even if OpenAI can present enough evidence to show that DeepSeek extracted data from its models, OpenAI likely does not have copyrights over the data." The same logic would almost certainly apply to Anthropic's outputs.
Contract law may offer a more promising avenue. Anthropic's terms of service prohibit the kind of systematic extraction the company describes, and violation of those terms is a more straightforward legal claim than copyright infringement. But enforcing contractual terms against entities operating through proxy services and fraudulent accounts in a foreign jurisdiction presents its own formidable challenges.
This may explain why Anthropic chose the national security frame over a purely legal one. By positioning distillation attacks as threats to export control regimes and democratic security rather than as intellectual property disputes, Anthropic appeals to policymakers and regulators who have tools — sanctions, entity list designations, enhanced export restrictions — that go far beyond what civil litigation could achieve.
What Anthropic's distillation crackdown means for every company running a frontier AI model
Anthropic outlined a multipronged defensive response. The company said it has built classifiers and behavioral fingerprinting systems designed to identify distillation attack patterns in API traffic, including detection of chain-of-thought elicitation used to construct reasoning training data. It is sharing technical indicators with other AI labs, cloud providers, and relevant authorities to build what it described as a more holistic picture of the distillation landscape. The company has also strengthened verification for educational accounts, security research programs, and startup organizations — the pathways most commonly exploited for setting up fraudulent accounts — and is developing model-level safeguards designed to reduce the usefulness of outputs for illicit distillation without degrading the experience for legitimate customers.
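Anthropic has not published how its classifiers work, but one signal its own disclosure points to is the coordination pattern it described in DeepSeek's traffic: many distinct accounts sharing payment methods and firing in near-lockstep. A hypothetical sketch of that single signal (the function name, thresholds, and data shapes here are all invented for illustration) might look like:

```python
from collections import defaultdict

# Hypothetical detector for one behavioral signal: multiple accounts
# tied to the same payment fingerprint sending requests within a short
# time window, suggesting coordinated "load-balanced" traffic.

def flag_synchronized_accounts(requests, window_seconds=2.0, min_accounts=3):
    """requests: iterable of (account_id, payment_fingerprint, timestamp)."""
    by_payment = defaultdict(list)
    for account, payment, ts in requests:
        by_payment[payment].append((ts, account))

    flagged = set()
    for events in by_payment.values():
        events.sort()  # time order
        for i, (start_ts, _) in enumerate(events):
            # Count distinct accounts active within the window.
            accounts_in_window = {
                acct for ts, acct in events[i:] if ts - start_ts <= window_seconds
            }
            if len(accounts_in_window) >= min_accounts:
                flagged |= accounts_in_window
    return flagged

# Example: three accounts on one card firing within a second of each other.
reqs = [
    ("acct-1", "card-A", 100.0),
    ("acct-2", "card-A", 100.4),
    ("acct-3", "card-A", 100.9),
    ("acct-4", "card-B", 250.0),  # lone account, not flagged
]
print(sorted(flag_synchronized_accounts(reqs)))  # ['acct-1', 'acct-2', 'acct-3']
```

A production system would combine many such signals (IP correlation, prompt fingerprints, chain-of-thought elicitation patterns) rather than rely on any one, precisely because "hydra cluster" operators mix distillation traffic with ordinary requests to defeat single-signal detection.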
But the company acknowledged that "no company can solve this alone," calling for coordinated action across the industry, cloud providers, and policymakers.
The disclosure is likely to reverberate through multiple ongoing policy debates. In Congress, the bipartisan No DeepSeek on Government Devices Act has already been introduced. Federal agencies including NASA have banned DeepSeek from employee devices. And the broader question of chip export controls — which the Trump administration has been weighing amid competing pressures from Nvidia and national security hawks — now has a new and vivid data point.
For the AI industry's technical decision-makers, the implications are immediate and practical. If Anthropic's account is accurate, the proxy infrastructure enabling these attacks is vast, sophisticated, and adaptable — and it is not limited to targeting a single company. Every frontier AI lab with an API is a potential target. The era of treating model access as a simple commercial transaction may be coming to an end, replaced by one in which API security is as strategically important as the model weights themselves.
Anthropic has now put names, numbers, and forensic detail behind accusations that the industry had only whispered about for months. Whether that evidence galvanizes the coordinated response the company is calling for — or simply accelerates an arms race between distillers and defenders — may depend on a question no classifier can answer: whether Washington sees this as an act of espionage or just the cost of doing business in an era when intelligence itself has become a commodity.
