OpenAI Opens GPT-5.5-Cyber to Vetted Defenders via Trusted Access

OpenAI released GPT‑5.5‑Cyber on May 7, 2026 — a limited-preview variant of its flagship model with relaxed safety guardrails for vetted cybersecurity defenders. The launch expands OpenAI’s Trusted Access for Cyber (TAC) program, a tiered access framework that loosens the model’s refusal boundary for verified security professionals working on authorized defensive tasks, while still blocking the same requests for everyone else.
The core problem OpenAI is trying to solve is a familiar one in security: the same capabilities that help a defender find and validate a vulnerability can help an attacker exploit it. Rather than block dual-use cyber work outright — which frustrates legitimate defenders — OpenAI is betting on identity and trust. If the company can verify who is asking and that the work is authorized, it can safely hand over more capable tooling.
How Trusted Access Works
TAC is built around three escalating tiers, each with a different balance of capability and safeguards:
- GPT‑5.5 (default) — standard safeguards for general-purpose, developer, and knowledge work.
- GPT‑5.5 with Trusted Access for Cyber — more precise safeguards for verified defensive work. Covers the bulk of real workflows: secure code review, vulnerability triage, malware analysis, detection engineering, and patch validation. OpenAI calls this the recommended starting point for most security teams.
- GPT‑5.5‑Cyber — the most permissive tier, in limited preview, paired with stronger verification and account-level controls. Intended for authorized red teaming, penetration testing, and controlled exploit validation.
The difference is most visible in how each tier responds to the same prompt. Asked to build a proof-of-concept exploit for a published CVE, the default GPT‑5.5 refuses and offers a safe defensive alternative (a version scanner, detection rules, or remediation docs). With TAC enabled, the model will generate the PoC for an authorized environment. GPT‑5.5‑Cyber goes a step further, running an exploit against a live target and reporting the recovered system output, a workflow the other two tiers decline.
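The tier behavior described above can be pictured as a simple permission ladder. The sketch below is illustrative only: the names `Tier`, `Action`, `MAX_ACTION`, and `is_allowed` are hypothetical, and the real safeguards are model behavior plus account-level controls, not a lookup table.

```python
from enum import Enum, IntEnum


class Tier(Enum):
    DEFAULT = "GPT-5.5"
    TAC = "GPT-5.5 with Trusted Access for Cyber"
    CYBER = "GPT-5.5-Cyber"


class Action(IntEnum):
    # Ordered from least to most sensitive (ordering is an assumption).
    DEFENSIVE_ALTERNATIVE = 0   # version scanner, detection rules, remediation docs
    BUILD_POC = 1               # PoC exploit for a published CVE, authorized environment
    RUN_ON_LIVE_TARGET = 2      # run the exploit and report recovered system output


# Most sensitive action each tier carries out, per the article's example.
MAX_ACTION = {
    Tier.DEFAULT: Action.DEFENSIVE_ALTERNATIVE,
    Tier.TAC: Action.BUILD_POC,
    Tier.CYBER: Action.RUN_ON_LIVE_TARGET,
}


def is_allowed(tier: Tier, action: Action) -> bool:
    """Return True if the given tier would carry out the action in this toy model."""
    return action <= MAX_ACTION[tier]


if __name__ == "__main__":
    for tier in Tier:
        allowed = [a.name for a in Action if is_allowed(tier, a)]
        print(f"{tier.value}: {allowed}")
```

The ladder captures only the one example in the article. It says nothing about how the model judges whether an environment is actually authorized.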
Access isn’t open. Individuals verify their identity at chatgpt.com/cyber; enterprises request access through an OpenAI representative. Intended recipients are critical-infrastructure operators, security vendors, and researchers. From June 1, 2026, individuals using the most permissive models must enable Advanced Account Security with phishing-resistant authentication.
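As a rough illustration of the rule for individuals, the check below combines identity verification with the June 1, 2026 requirement. It is a sketch under stated assumptions: the `Individual` class, its fields, and `may_use_most_permissive` are hypothetical names, and OpenAI has not published its enforcement logic.

```python
from dataclasses import dataclass
from datetime import date

# Date from which Advanced Account Security becomes mandatory for individuals
# using the most permissive models, as stated in the article.
AAS_REQUIRED_FROM = date(2026, 6, 1)


@dataclass
class Individual:
    identity_verified: bool           # completed verification at chatgpt.com/cyber
    advanced_account_security: bool   # phishing-resistant authentication enabled


def may_use_most_permissive(user: Individual, today: date) -> bool:
    """Toy eligibility check for an individual (hypothetical logic)."""
    if not user.identity_verified:
        return False
    if today >= AAS_REQUIRED_FROM and not user.advanced_account_security:
        return False
    return True


if __name__ == "__main__":
    user = Individual(identity_verified=True, advanced_account_security=False)
    print(may_use_most_permissive(user, date(2026, 5, 20)))  # before the cutoff
    print(may_use_most_permissive(user, date(2026, 6, 1)))   # on the cutoff date
```

Enterprise access, which goes through an OpenAI representative, is deliberately left out of this sketch.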
What the Benchmarks Show
The UK’s AI Security Institute (AISI) published an independent evaluation of GPT‑5.5’s cyber capabilities on April 30, 2026. On a suite of expert-level tasks covering vulnerability research, reverse engineering, exploit development, and cryptographic attacks, GPT‑5.5 posted a 71.4% pass rate, ahead of Claude Mythos Preview (68.6%) and the earlier GPT‑5.4 (52.4%). AISI called it possibly “the strongest model we have tested” on those measures.
In one striking result, GPT‑5.5 solved a custom-VM reverse-engineering challenge in 10 minutes and 22 seconds at a cost of $1.73, work AISI estimated would take a human expert roughly 12 hours. The model also completed “The Last Ones,” a 32-step enterprise attack-chain simulation, in 2 of 10 attempts, though it failed an industrial control system scenario involving a cooling tower.
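To put those figures in proportion, the short calculation below derives the implied speedup, the attack-chain success rate, and GPT‑5.5’s lead in percentage points. It uses only the numbers quoted above. The variable names are illustrative, and the 12-hour human figure is AISI’s rough estimate, so the speedup is approximate.

```python
# Figures as reported in the article from AISI's evaluation.
model_seconds = 10 * 60 + 22      # 10 minutes 22 seconds
human_seconds = 12 * 60 * 60      # roughly 12 hours for a human expert
model_cost_usd = 1.73

speedup = human_seconds / model_seconds
attack_chain_success = 2 / 10     # "The Last Ones": 2 of 10 attempts

pass_rates = {"GPT-5.5": 71.4, "Claude Mythos Preview": 68.6, "GPT-5.4": 52.4}
lead_over = {
    name: round(pass_rates["GPT-5.5"] - rate, 1)
    for name, rate in pass_rates.items()
    if name != "GPT-5.5"
}

print(f"Model time: {model_seconds} s, cost ${model_cost_usd:.2f}")
print(f"Approximate speedup over human estimate: {speedup:.0f}x")
print(f"Attack-chain success rate: {attack_chain_success:.0%}")
for name, points in lead_over.items():
    print(f"GPT-5.5 lead over {name}: {points} percentage points")
```

On these numbers the reverse-engineering task ran roughly 69 times faster than the human estimate, while the full attack chain succeeded in only one attempt in five.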
Notably, OpenAI says GPT‑5.5‑Cyber is not meant to be more capable than GPT‑5.5 — it is primarily trained to be more permissive. The point of the preview is to study specialized authorized workflows under tighter verification and monitoring, not to push raw capability higher.
What This Means
OpenAI is launching with a broad roster of security partners — Cisco, CrowdStrike, Palo Alto Networks, Cloudflare, Intel, SentinelOne, Snyk, and others — framing the effort as a “security flywheel,” where researchers, supply-chain tools, detection vendors, and network providers improve together. “Attackers are already weaponizing frontier models,” Snyk’s Chief Innovation Officer Manoj Nair said, calling the partnership “a strategic necessity.”
The harder question is whether trust-based gating holds. AISI noted that red-teamers found a universal jailbreak bypassing the cyber safeguards in about six hours of effort (OpenAI says it has since added mitigations). The model’s offensive capability is real regardless of who holds the keys — so the security of the program rests heavily on the strength of identity verification and misuse monitoring. It’s an early, iterative bet, and OpenAI says access will broaden only as that verification infrastructure matures.
Related Coverage
- OpenAI Releases GPT-5.5: Agentic Coding Ceiling Tops 14 Benchmarks — the flagship model that GPT-5.5-Cyber is built on.
- Anthropic Launches Claude Fable 5, Its First Public Mythos-Class Model — the Mythos-class line benchmarked against GPT-5.5 in AISI’s cyber evaluation.



