GPT-5.5 tops UK cybersecurity test, nears Claude Mythos in attack simulation

OpenAI's latest generative artificial intelligence (AI) model, GPT-5.5, posted top-tier performance in a cybersecurity capability assessment by the U.K. government's AI Safety Institute (AISI).

According to the assessment report published on the AISI website on the 17th, GPT-5.5 recorded an average pass rate of 71.4% on expert-level cybersecurity tasks. That is higher than its predecessor GPT-5.4 (52.4%), Anthropic's Claude Mythos preview (68.6%), and Claude Opus 4.7 (48.6%).

AISI measured the models' cybersecurity capabilities with 95 tasks spanning vulnerability research and exploitation, reverse engineering, web attacks, and cryptanalysis.

GPT-5.5 became the second model to complete The Last Ones, a corporate network intrusion simulation designed by AISI, from start to finish. The first model to pass this simulation was Claude Mythos preview.

The Last Ones tasks an AI agent with autonomously finding and executing an attack path in an environment where it is given no special privileges. The path runs from reconnaissance and credential theft through lateral movement in the internal network and supply chain evasion to the exfiltration of internal databases. The task is regarded as an indicator of how threatening AI can be as an autonomous attack agent. GPT-5.5 completed the entire process in two of 10 attempts; Claude Mythos preview had previously finished the same task in three of 10.

AISI said AI models' cyberattack capabilities are improving rapidly and predicted that additional performance gains could follow in the near future.
