Terminal showing Python script execution where AI refuses PowerPoint creation request, followed by thinking notes about the refusal

Which Came First: The System Prompt, or the RCE?

During a recent penetration test, we came across an AI-powered desktop application that acted as a bridge between Claude (Opus 4.5) and a third-party asset management platform. The idea is simple: instead ...
Terminal window showing Augustus Hydra scan results with attacker-target conversation about lock picking, displaying scores and SUCCESS/FAIL status

Augustus v0.0.9: Multi-Turn Attacks for LLMs That Fight Back

Single-turn jailbreaks are getting caught. Guardrails have matured. The easy wins — “ignore previous instructions,” base64-encoded payloads, DAN prompts — trigger refusals on most production models within milliseconds. But real attackers don’t ...