Understanding the OWASP Top 10 for LLMs

by Bill Doerrfeld on July 28, 2023

OWASP recently debuted a new project, the OWASP Top 10 for LLM Applications. The project, led by Steve Wilson, aims to address the key cybersecurity and safety challenges associated with developing and using large language model (LLM) applications.

The new OWASP list arises at a time when generative AI and LLMs are increasingly becoming embedded throughout many areas of the software development life cycle. However, this innovative technology also brings new security nuances to consider.

Having just reached the full 1.0 release, the OWASP Top 10 for LLM Applications draft provides a comprehensive review of the challenges within LLM development. Below, I’ll take an early look at the list to consider the top common vulnerabilities and briefly examine some mitigations to help plug these gaps.

Overview of The OWASP Top 10 LLM

LLM01: Prompt Injections

The first risk involves hackers manipulating the prompts given to trusted LLMs. These maliciously crafted inputs could be introduced directly, known as “jailbreaking,” or indirectly, such as embedding a prompt within an external site that the LLM reads.

LLM02: Insecure Output Handling

Blindly accepting the outputs of LLMs can also cause hazardous conditions since these could be used to execute additional functions. For example, if a plugin blindly accepts unvalidated outputs, this could result in remote code execution.

LLM03: Training Data Poisoning

Another risk is the data on which the LLMs are trained. For example, a malicious party could knowingly publish inaccurate documents within the training data set. If used by the LLM, this could produce outputs with falsified or unverified opinions, which could manipulate users or spread hate speech.

LLM04: Model Denial-of-Service (DoS)

This is a well-known attack type common in many other areas. Denial-of-service (DoS) attacks might arise if the LLM receives a high volume of resource-intensive requests, which may trigger additional tasks or automation. Coordinated DoS attacks could slow down an LLM’s server or bring it to a grinding halt.

LLM05: Supply Chain Vulnerabilities

Supply chain vulnerabilities in LLM applications might include vulnerabilities in third-party datasets and pre-trained models. The risk could also lie in plugins or other tampered source code. Avoiding supply chain vulnerabilities can be accomplished, in part, by carefully vetting the introduction of each component.

LLM06: Sensitive Information Disclosure

This risk involves the LLM exposing sensitive information in its outputs, such as trade information, intellectual property, proprietary algorithms or other confidential details. Therefore, it’s a good practice to employ sanitization up front to avoid sensitive information and user data from entering the training data.

LLM07: Insecure Plugin Design

Often, LLMs will automatically call plugins through REST APIs to gather more context for user requests. Therefore, these requests might suffer from similar access control issues in web APIs, and engineers should review the OWASP Top 10 API Security Risks.

LLM08: Excessive Agency

This vulnerability has to do with giving an LLM excessive autonomy. This could include excessive permissions to a plugin, unnecessary functions or the ability to initiate unintended commands. To reduce agency, it’s good to limit the scope with a least privilege model to ensure LLMs only access the plugins and functions they require for the job at hand.

LLM09: Overreliance

LLMs are powerful, but human operators shouldn’t rely on them entirely for decision-making. Overreliance without oversight could result in hallucinations, the production of factually incorrect content or insecure code generation. Therefore, it’s good to continuously monitor their outputs and loop in fact-checking.

LLM10: Model Theft

Finally, this risk involves bad actors stealing the actual foundational models. When proprietary models are stolen or copied, their functional equivalents could detrimentally affect business operations or be used to help inform adversarial attacks. Therefore, model owners should use firm access control when securing LLM development environments.

Ways to Plug the Gaps

Prompt injections are an especially tricky risk to prevent. Since both benign and malicious prompts exist in natural language, there is no technical means to avoid prompt injection. Thus, OWASP recommends using external trust controls to reduce the impact, like increased privilege control and implementing human-in-the-loop practices. Validation should also be used to avoid insecure output handling.

Other high-level tips to avoid LLM application risks are as follows:

Limit the read/write privileges of the LLM, such as preventing it from altering the user state.
Validate the outputs of models before forwarding them to trigger automated functions.
Ensure the legitimacy and accuracy of the sources LLMs are trained upon.
Don’t expect models to self-police — limit their scope and capabilities to prevent unintended consequences.
To avoid misinformation, don’t fully rely on the outputs of LLMs and adopt human oversight.
Protect your LLM repositories to avoid model theft.

Not all of the above mitigations can be initiated without sacrificing some of the AI’s functionality. Thus, it’s best to weigh the pros and cons according to your application-specific requirements to determine what degree of autonomy and automation can be given to a large language model.

“Think about each trust boundary in your application and what kind of data is moving between them,” said project lead Steve Wilson. “Obviously a ‘prompt’ coming into a chat bot from a user is untrusted data. But you then need to consider everything coming out of the chatbot as untrusted too. You need to think about defenses at multiple levels and then still have a healthy skepticism about the results you get.”

The New Frontier of LLM Security

LLMs are skyrocketing in popularity but are bringing with them new security concerns. Thankfully, A community has sprung up almost overnight to address the potential threats caused by LLM-based applications. Due to the interest in the subject, the roadmap was aggressive and the support has been impressive, described Wilson.

“When I proposed this project I thought I’d find maybe 10 other people in the world interested in this topic. Today we have just shy of 500 people on our our main Slack channel and had over 130 people. I find that amazing. It shows just how much interest and excitement there is about LLMs and how much uncertainty there is on their security.”

There are many mitigations are represented in the OWASP Top Ten for LLMs, which should be reviewed more thoroughly for specific implementation advice. The list presents a comprehensive summary of the current security risks associated with using these models. By understanding these threats and following some of the mitigations above, organizations should be better positioned to safely utilize powerful LLMs within their development environments.

“The benefits of this new technology are so powerful that we know people are going to dive in and start using it. We just want them to do it with their eyes open to the challenges.”

But, another takeaway is that we are in unforeseen territory, and developers using LLMs need to think about security in a new way, said Wilson. “LLMs add a level of uncertainty we’re not used to as software developers. This means that a healthy level of paranoia should be encouraged.”