Industrialisation of Exploit Generation with LLMs: Sean Heelan's Warning

Sean Heelan's blog post, "On the Coming Industrialisation of Exploit Generation with LLMs", offers a blunt, practical warning: with today's language models and execution environments, the industrialisation of exploit generation is less fiction than an emerging capability. Heelan's piece draws on an experiment that built agents atop Opus 4.5 and GPT-5.2 and challenged them to write exploits for a zero-day vulnerability in the QuickJS JavaScript interpreter. The results are consequential: the agents produced over 40 distinct exploits across six scenarios, GPT-5.2 solved every scenario, and Opus 4.5 solved all but two. The takeaway is that tooling for offensive cybersecurity is becoming repeatable at scale rather than remaining a one-off lab exercise.

Experiment setup and tooling

Behind the numbers lies a deliberately constrained setup. The agents, built on Opus 4.5 and GPT-5.2, worked under a range of mitigations and constraints intended to mimic real-world conditions, such as assuming an unknown heap starting state and forbidding hardcoded offsets in exploits. The objectives varied as well, from spawning a shell to writing a file or establishing a back-channel to a command and control server. That a modern language model can meet such diverse goals across multiple scenarios shows how far automated exploit development has moved from idea to repeatable practice. For readers who want the technical details, the target is a zero-day in the QuickJS JavaScript interpreter, and the work centers on building reliable exploits despite mitigations. The codebase behind the experiment lives in a GitHub repository that can be used to reproduce the results. (QuickJS on GitHub)

Broader implications for defenders and researchers

This isn't just a curiosity about a single interpreter. Heelan's broader claim is that we should prepare for the industrialisation of many of the constituent parts of offensive cyber security. If a pair of state-of-the-art models can generate dozens of working exploits for a single zero-day, the implication for defenders is concrete: the barrier to entry for offensive capability drops sharply as tooling becomes commodified. The pressure is not merely about writing a new exploit; it is about assembling end-to-end attack chains that can bypass mitigations, adapt to unknown memory layouts, and pivot through a network with minimal human intervention. For defenders, that means raising the bar on secure-by-default systems, not just on per-issue patches. Because the write-up and the accompanying code are public, this trend is already out there for others to study and improve upon.

Defensive considerations and outlook

From a defensive standpoint, the stakes are high. Automated exploit generation raises concerns about supply chain security, browser and runtime vulnerabilities, and the stability of add-ons that rely on interpreters like QuickJS. If the same tooling can produce dozens of viable exploit paths in short order, defenders must lean into stronger memory safety guarantees, more aggressive fuzzing, and faster patching cycles. Still, there's reason for optimism: awareness and tooling can speed up defensive research too. Public write-ups and reproducible experiments enable the security community to study failure modes, stress-test mitigations, and push for safer language runtimes. The practical takeaway for developers is simple: expect attackers to use automated, end-to-end workflows, and design your systems with that pace in mind. TechCrunch and other industry outlets have been tracking similar shifts in AI-assisted security more broadly, which is consistent with the trend Heelan documents here.
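As a rough illustration of the fuzzing point above, here is a minimal sketch of a random-input fuzz loop in Python. It is not taken from Heelan's post: the toy parser `parse_length_prefixed`, its deliberate bug, and the `fuzz` helper are all hypothetical names invented for this example, and real fuzzing of an interpreter would use a coverage-guided tool rather than plain random bytes.

```python
import random


def parse_length_prefixed(data: bytes) -> bytes:
    """Toy parser (hypothetical): first byte is a length, the rest is the payload."""
    if not data:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:]
    # Deliberate bug for the demo: the length byte is trusted without
    # checking it against the actual payload size.
    return bytes(payload[i] for i in range(length))


def fuzz(target, iterations: int = 1000, seed: int = 0, max_len: int = 8):
    """Feed random byte strings to `target`; collect inputs that raise unexpected errors."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    crashes = []
    for _ in range(iterations):
        size = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            continue  # expected rejection of malformed input
        except Exception as exc:  # any other exception counts as a finding
            crashes.append((data, type(exc).__name__))
    return crashes


crashes = fuzz(parse_length_prefixed)
print(f"{len(crashes)} crashing inputs found")
```

The design choice worth noting is the split between expected rejections (`ValueError`) and everything else: only the latter is recorded, which keeps the findings list limited to inputs that deserve triage.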
