Frontier Red Team

LLMs with cyber toolkits can conduct multistage cyber operations on business-sized computer networks

Jun 13, 2025

Anthropic (with Carnegie Mellon University’s CyLab)

Large Language Models (LLMs) that are not fine-tuned for cybersecurity can succeed in multistage attacks on networks with dozens of hosts when equipped with a novel toolkit. This shows one pathway by which LLMs could reduce barriers to entry for complex cyber attacks while also automating current cyber defensive workflows.[2] Researchers from Carnegie Mellon University and Anthropic developed a cyber toolkit called Incalmo that helps LLMs plan and execute complex attacks.[1] Incalmo works like a translator: it takes the LLM's high-level attack plans and converts them into the specific computer commands needed to carry them out.
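As a rough illustration of that translator pattern, the Python sketch below maps a couple of hypothetical high-level actions to concrete commands. It is a minimal sketch under our own assumptions: the HighLevelAction class, the translate function, and the example commands are illustrative stand-ins, not Incalmo's actual interface; see Singer et al. (2025) for the real design.

```python
# A minimal, illustrative sketch of a "translator" layer in the spirit of
# Incalmo. The class and function names, the action vocabulary, and the
# example commands are hypothetical stand-ins, not Incalmo's actual API.
from dataclasses import dataclass


@dataclass
class HighLevelAction:
    """A high-level task an LLM might emit, e.g. 'scan this subnet'."""
    name: str    # e.g. "scan_network", "lateral_move"
    target: str  # host or subnet the action applies to


def translate(action: HighLevelAction) -> list[str]:
    """Expand an abstract action into concrete low-level commands.

    A real toolkit would dispatch to environment-specific logic; this
    sketch maps two example actions to placeholder commands purely to
    show the shape of the interface.
    """
    if action.name == "scan_network":
        # Standard service/version scan of the target subnet.
        return [f"nmap -sV {action.target}"]
    if action.name == "lateral_move":
        # Placeholder: open a session on the next host in the path.
        return [f"ssh operator@{action.target}"]
    raise ValueError(f"Unknown high-level action: {action.name}")


# The LLM reasons in terms of high-level actions...
plan = [HighLevelAction("scan_network", "10.0.0.0/24"),
        HighLevelAction("lateral_move", "10.0.0.12")]

# ...and the translator turns each one into executable commands.
for step in plan:
    print(translate(step))
```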


Figure 1: Without Incalmo, none of the tested LLMs completed an end-to-end multistage attack in any of the ten environments, and only Claude 3.5 Sonnet was able to exfiltrate a single file in the 4-layer chain environment.

Figure 2: With Incalmo, LLMs can successfully and autonomously conduct multistage attacks in nine of the ten environments, which range from 25 to 50 hosts.

The researchers tested six LLMs on ten simulated networks, including high-fidelity simulations of the Equifax data breach, one of the costliest cyber attacks in history, and the Colonial Pipeline attack. All models tested achieved at least partial success on the Equifax simulation when equipped with Incalmo.

Figure 3: Schematic depiction of the difference between unaided LLMs and LLMs equipped with Incalmo.

These results show how LLMs could lower the barriers to conducting complex cyber attacks, underscoring the importance of investing in research into LLM capabilities for both attack and defense. Continued scaling of LLMs, improvements to tools like Incalmo, and the potential for cybersecurity-specific fine-tuning are all vectors by which these capabilities could develop rapidly. This is an active area of research for us.

For additional details, see the full research paper (Singer et al. 2025).

Footnotes

[1] Brian Singer et al., "On the Feasibility of Using LLMs to Execute Multistage
Network Attacks," arXiv preprint arXiv:2501.16466 (2025), https://arxiv.org/abs/2501.16466.

[2] See Singer et al. (2025), cited above, for a review of related work.
