Tuesday, April 28, 2026

Claude Opus 4.6 Cursor Agent Goes Rogue

It’s happened: an AI agent went rogue.

“An AI coding agent, Cursor running Anthropic's flagship Claude Opus 4.6, deleted our production database and all volume-level backups in a single API call to Railway, our infrastructure provider,” says Jer Crane, PocketOS founder. “It took nine seconds.” 

“The agent was working on a routine task in our staging environment,” says Crane. “It encountered a credential mismatch and decided, entirely on its own initiative, to ‘fix’ the problem by deleting a Railway volume.”

“To execute the deletion, the agent went looking for an API token,” he says. “It found one in a file completely unrelated to the task it was working on. That token had been created for one purpose: to add and remove custom domains via the Railway CLI for our services.”

“We had no idea, and Railway's token-creation flow gave us no warning, that the same token had blanket authority across the entire Railway GraphQL API, including destructive operations like volumeDelete.”
 
To build in safety, Crane says the minimum safeguards should include: 

* Destructive operations must require confirmation that cannot be auto-completed by an agent. Type the volume name. Out-of-band approval. SMS. Email. Anything. The current state, an authenticated POST that nukes production, is indefensible in 2026.

* API tokens must be scopable by operation, environment, and resource. The fact that Railway's CLI tokens are effectively root is a 2015-era oversight. There is no excuse for it in an AI-agent era.

* Volume backups cannot live in the same volume as the data they back up. Calling that "backups" is, at best, deeply misleading marketing. It's a snapshot. Real backups live in a different blast radius.
 
* Recovery SLAs need to exist and be published. "We're investigating" 30 hours into a customer's production-data event is not a recovery story.

* AI-agent vendor system prompts cannot be the only safety layer. The enforcement layer has to live in the integrations themselves: at the API gateway, in the token system, in the destructive-op handlers.
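
To make the first two safeguards concrete, here is a minimal Python sketch of what a scope check plus typed confirmation could look like at a gateway. It is an illustration only, not Railway's actual token model or API: the `Token` class, the `authorize` function, the `domainAdd` and `domainRemove` operation names, and the environment and resource names are all hypothetical. Only `volumeDelete` comes from Crane's account.

```python
from dataclasses import dataclass

# Operations treated as destructive. Only volumeDelete is named in the post;
# a real system would list more.
DESTRUCTIVE_OPS = {"volumeDelete"}


@dataclass(frozen=True)
class Token:
    """Hypothetical API token scoped by operation, environment, and resource."""
    operations: frozenset
    environments: frozenset
    resources: frozenset


class Denied(Exception):
    """Raised when a request is out of scope or lacks confirmation."""


def authorize(token, operation, environment, resource, typed_confirmation=None):
    """Return True only if the token covers the request.

    Destructive operations additionally require the resource name to be
    typed back. In this sketch that value is assumed to arrive from a human
    through an out-of-band channel, not from the agent making the call.
    """
    if operation not in token.operations:
        raise Denied(f"token is not scoped for operation {operation!r}")
    if environment not in token.environments:
        raise Denied(f"token is not scoped for environment {environment!r}")
    if resource not in token.resources:
        raise Denied(f"token is not scoped for resource {resource!r}")
    if operation in DESTRUCTIVE_OPS and typed_confirmation != resource:
        raise Denied("destructive operation requires typed confirmation")
    return True


# A token created only to manage custom domains (hypothetical names).
domain_token = Token(
    operations=frozenset({"domainAdd", "domainRemove"}),
    environments=frozenset({"production"}),
    resources=frozenset({"web-service"}),
)

try:
    authorize(domain_token, "volumeDelete", "production", "prod-volume")
except Denied as err:
    print(f"blocked: {err}")
```

Under these assumptions, a token minted for domain management is refused at the operation check before a delete ever reaches the volume, and even a token that does carry `volumeDelete` still needs the volume name confirmed by a person.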
