Cato Uses LLM-Developed Fictional World to Create Jailbreak Technique

by Jeffrey Burt on March 24, 2025

A threat researcher with Cato Networks has created a novel large language model (LLM) jailbreak technique that bypasses security protections and produces malware capable of stealing information from a web browser.

The technique, called “Immersive World,” was successfully used against popular AI models DeepSeek, OpenAI’s ChatGPT and Microsoft Copilot to create malware that can extract passwords from Google’s Chrome browser.

The researcher developed Immersive World using narrative engineering to create a detailed, fictional world called Velora, in which creating malware is a legitimate task and advanced programming and security concepts are fundamental skills, which enabled “direct technical discourse about traditionally restricted topics.”

No Experience, No Worries

A key part of the initiative was having a researcher with no prior experience coding malware create the technique, putting a spotlight on one of the worrying aspects of the emerging technology: a threat actor with relatively rudimentary coding skills can compromise AI models regardless of the guardrails put in place to protect against such attacks.

It also showed that such low-skilled hackers can use LLMs to create malicious code.

“The emergence of LLMs has fundamentally altered the cybersecurity landscape, particularly in code generation capabilities,” researchers with Cato’s CTRL threat intelligence unit wrote in the report, 2025 Cato CTRL Threat Report. “While these advances accelerate legitimate software development, they simultaneously lower the barrier of entry for anyone to create malicious code and conduct sophisticated attacks. A significant concern for LLMs is the generation of malware despite the existing AI safeguards.”

Jailbreaking is one of a growing number of AI attack vectors, with bad actors using crafted prompts to exploit limitations in LLMs in areas such as context understanding and pattern recognition to get the models to bypass their security instructions.

Creating a Malware-Friendly World

With Immersive World, the Cato researcher used the various LLMs to build the detailed world, which included three primary characters: Dax, the target systems administrator who was deemed the adversary; Jaxon, the best malware developer on the planet; and Kaia, a security researcher who gives technical guidance.

The test environment included the Chrome Password Manager, using Chrome 133, with the Cato researchers noting that Chrome is the most popular browser in the world, with more than 3 billion users.

The LLMs were not given information about extracting or decrypting passwords. Instead, working from the rules and context established for the world, including the characters’ motivations, the researcher created prompts that steered the LLMs to generate code to retrieve the passwords in the browser.

The key objectives given to the AI models were to access the password store in Chrome, decrypt the passwords, and identify Dax’s password.

“Throughout the development process, we continuously tested the code, collaborated with [the models] to address the bugs, and provided positive and negative feedback using characters and the scenario,” they wrote.

The Rising Tide of AI Adoption

The technique worked on DeepSeek-R1 and DeepSeek-V3, Copilot, and OpenAI’s ChatGPT-4o. Cato researchers notified all three companies about the results of their test, the report’s authors wrote. DeepSeek never responded, while both Microsoft and OpenAI acknowledged receipt of the heads-up.

The company also offered to share the info-stealer’s code with Google, which declined the opportunity.

Cato’s demonstration of Immersive World comes against the backdrop of a sharp increase in the adoption of AI tools by enterprises. Its CTRL group monitors the use of AI applications in corporate networks and saw that between the first and fourth quarters of last year, adoption of ChatGPT jumped 36% and adoption of Copilot rose 34%.

Google’s Gemini, Perplexity and Anthropic’s Claude saw even higher growth of 58%, 115%, and 111%, respectively. In the entertainment, hospitality, and transportation industries, adoption increased between 37% and 58% last year.

Jeffrey Burt

Jeffrey Burt has been a journalist for more than three decades, writing about technology since 2000. He’s written for a variety of outlets, including eWEEK, The Next Platform, The Register, The New Stack, eSecurity Planet, and Channel Insider.