Reviewing the OWASP Machine Learning Top 10 Risks

by Bill Doerrfeld on August 4, 2023

Day by day, more and more machine learning (ML) models are being developed. Machine learning models, used to find patterns in training data, can generate impressive detection and classification capabilities. ML is already powering many areas of artificial intelligence, including sentiment analysis, image classification, facial detection, threat intelligence and more.

Billions of dollars are being funneled into ML research and production. There is clearly a strong appetite in the market for machine learning projects. What security risks do you need to keep in mind when training these models?

The OWASP Machine Learning Security Top Ten analyzes the most common vulnerabilities associated with machine learning. Below, I’ll summarize each risk in the top ten list and consider how you can protect the integrity and security of your models, from creation to deployment.

The OWASP Machine Learning Security Top Ten

ML01:2023 Adversarial Attack

This attack type involves a malicious actor intentionally altering the model’s input data. For example, consider an image classification model. An attacker could create an adversarial image with slight variations that cause a misclassification. In a cybersecurity context, adversarial variations could help an attacker avoid detection by an ML-powered intrusion detection system.
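To make the idea concrete, here is a minimal sketch of an adversarial perturbation against a toy linear detector. The weights, bias, sample and epsilon are all hypothetical values chosen for illustration, and the sign-step used here is one well-known technique (the fast gradient sign idea), not something the OWASP list prescribes.

```python
def score(weights, bias, x):
    """Linear decision score: positive means 'malicious', otherwise 'benign'."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def sign(v):
    """Return 1, 0 or -1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def adversarial_example(weights, x, epsilon):
    """Shift every feature by at most epsilon in the direction that lowers the score.

    For a linear model the gradient of the score with respect to the input
    is the weight vector, so stepping against sign(weights) applies the
    fast-gradient-sign idea to this toy detector.
    """
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical detector parameters and a sample the detector flags.
weights = [2.0, -1.0, 0.5]
bias = -1.0
x = [1.0, 0.5, 1.0]

x_adv = adversarial_example(weights, x, epsilon=0.4)

original_score = score(weights, bias, x)         # positive: flagged as malicious
adversarial_score = score(weights, bias, x_adv)  # negative: slips past the detector
```

No feature moves by more than 0.4, yet the decision flips. Real models are not linear, but the same principle applies: small, targeted changes to the input can cross a decision boundary.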

ML02:2023 Data Poisoning Attack

Another risk to consider is an attacker manipulating the data the model is trained on. If a data storage system is compromised, an attacker could insert incorrectly labeled data. This could cause a spam detection model to misidentify spam as legitimate communication, for instance. Incorrect classifications and false decisions could lead to potentially insecure outcomes.
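The spam example can be sketched with a deliberately simple detector. Everything below is an assumption made for illustration: the single "suspicious keyword count" feature, the midpoint-of-means training rule and the sample data are not taken from OWASP.

```python
from statistics import mean


def train_threshold(samples):
    """Fit a one-feature spam detector.

    `samples` is a list of (score, label) pairs, where score is a count of
    suspicious keywords and label is "spam" or "ham". The decision
    threshold is the midpoint between the two class means.
    """
    ham = [s for s, label in samples if label == "ham"]
    spam = [s for s, label in samples if label == "spam"]
    return (mean(ham) + mean(spam)) / 2


def is_spam(score, threshold):
    """Classify a message as spam when its score exceeds the threshold."""
    return score > threshold


clean = [(0, "ham"), (1, "ham"), (1, "ham"), (2, "ham"),
         (6, "spam"), (7, "spam"), (8, "spam"), (7, "spam")]

# The attacker injects spam-like messages that are deliberately labeled "ham".
poisoned = clean + [(9, "ham")] * 4

clean_threshold = train_threshold(clean)        # 4.0
poisoned_threshold = train_threshold(poisoned)  # 6.0

# A message with a score of 5 is caught by the clean model only.
caught_before = is_spam(5, clean_threshold)     # True
caught_after = is_spam(5, poisoned_threshold)   # False
```

Four mislabeled records are enough to push the threshold from 4.0 to 6.0 in this toy setup, which is why validating and securing training data matters.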

ML03:2023 Model Inversion Attack

A model inversion attack happens when an actor reverse-engineers the model to gain hidden information. Inverting the model could be accomplished by training one model and using it to reverse the predictions of another model. This vulnerability could result in attacks going under the radar or hackers gaining sensitive or personal information based on the model’s predictions.

ML04:2023 Membership Inference Attack

Membership inference is another attack type in which the attacker is able to infer sensitive data from a model. A hacker could do so by obtaining training data and then using the model to query whether a particular individual’s record was included in the data set. Membership inference risks are moderately challenging to both exploit and detect.
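One common way to frame this attack is a confidence threshold: an overfit model tends to be unusually confident on records it was trained on. The sketch below uses a hypothetical stand-in model (`OverfitModel`) whose confidence decays with distance to the nearest memorized record; the records and the 0.9 threshold are invented for illustration.

```python
import math


class OverfitModel:
    """Toy stand-in for an overfit model: most confident on memorized records."""

    def __init__(self, training_records):
        self._records = list(training_records)

    def confidence(self, record):
        # Confidence decays with distance to the nearest training record.
        nearest = min(math.dist(record, r) for r in self._records)
        return 1.0 / (1.0 + nearest)


def infer_membership(model, record, threshold=0.9):
    """Guess that `record` was in the training set if confidence is unusually high."""
    return model.confidence(record) >= threshold


# Hypothetical (age, weight) records used to train the model.
model = OverfitModel([(34, 72.5), (51, 80.1), (29, 65.0)])

in_training = infer_membership(model, (51, 80.1))      # True: exact match, confidence 1.0
not_in_training = infer_membership(model, (40, 70.0))  # False: nearest record is 6.5 away
```

The attacker never sees the training set directly; the model's own confidence leaks whether a given person's record was part of it.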

ML05:2023 Model Stealing

This attack involves a bad actor or competitor stealing or copying the model itself. If the deployed model is left unsecured, it is vulnerable to theft. Alternatively, the model could be reverse-engineered. Once stolen, the model could be used for competing commercial purposes, causing financial losses for the original model owner.
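Reverse-engineering a model often relies on sending it many queries, so one possible defense is to cap how many predictions each client can request. The following is a minimal sketch of that idea; `QueryBudget`, `guarded_predict` and `toy_model` are hypothetical names, and a production system would also need authentication, time windows and persistent storage.

```python
from collections import defaultdict


class QueryBudget:
    """Caps how many predictions each client may request (hypothetical policy)."""

    def __init__(self, max_queries):
        self.max_queries = max_queries
        self._used = defaultdict(int)

    def allow(self, client_id):
        """Consume one query for the client, or refuse if the budget is spent."""
        if self._used[client_id] >= self.max_queries:
            return False
        self._used[client_id] += 1
        return True


def guarded_predict(model_fn, budget, client_id, features):
    """Run the model only while the client is within its budget."""
    if not budget.allow(client_id):
        raise PermissionError(f"query budget exhausted for {client_id}")
    return model_fn(features)


def toy_model(features):
    """Hypothetical model: flags inputs whose features sum above 1.0."""
    return sum(features) > 1.0


budget = QueryBudget(max_queries=2)
first = guarded_predict(toy_model, budget, "client-a", [0.9, 0.4])   # True
second = guarded_predict(toy_model, budget, "client-a", [0.1, 0.2])  # False
# A third call for "client-a" would raise PermissionError.
```

Budgets are tracked per client, so one client exhausting its allowance does not affect others.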

ML06:2023 Corrupted Packages

Most modern software relies upon a wealth of open source or third-party dependencies, and the same is true for machine learning. One risk is that a hacker could insert malicious code to corrupt a public library the model relies upon. Once the ML project downloads the updated version, the project is compromised.
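A common safeguard is to pin a cryptographic digest of each dependency at review time and refuse anything that does not match. Here is a minimal sketch using Python's standard `hashlib` and `hmac` modules; the artifact contents are placeholder bytes, and `verify_artifact` is a hypothetical helper name.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, pinned_digest: str) -> bool:
    """Return True only if the artifact matches the digest pinned at review time."""
    return hmac.compare_digest(sha256_hex(data), pinned_digest)


# Placeholder bytes standing in for a reviewed package release.
reviewed_release = b"def predict(x): return x * 2\n"
pinned = sha256_hex(reviewed_release)

# The same release after an attacker appends code upstream.
tampered_release = reviewed_release + b"# payload appended by attacker\n"

ok = verify_artifact(reviewed_release, pinned)       # True
rejected = verify_artifact(tampered_release, pinned)  # False
```

Package managers can enforce the same check for you. For example, pip offers a hash-checking mode via `--require-hashes`.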

ML07:2023 Transfer Learning Attack

Transfer learning is when an engineer takes a pre-trained model and fine-tunes it with additional data. An attacker could use this tactic to retrain an existing model on a malicious dataset. If they successfully alter the model the end application uses, they could bypass safeguards such as intrusion detection systems.

ML08:2023 Model Skewing

Another risk involves attackers skewing training data by taking advantage of the MLOps feedback process. Hackers could input feedback data that retrains the overall model to privilege a particular outcome, for instance. Model skewing attacks could introduce bias and compromise the accuracy and fairness of a system.

ML09:2023 Output Integrity Attack

An output integrity attack is when an attacker gains access to the output of a machine learning model and manipulates this output to provide falsified information. For example, if the interface that displays ML outputs is compromised, hackers could change its apparent behavior or edit the results through a man-in-the-middle (MitM) attack.
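One way to detect tampering between the model and the interface that displays its results is to attach a message authentication code to each output. The sketch below uses HMAC-SHA256 from Python's standard library; the key, the prediction fields and the function names are hypothetical, and real deployments would also need key management and transport encryption.

```python
import hashlib
import hmac
import json


def sign_prediction(prediction: dict, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over a canonical JSON encoding of the output."""
    payload = json.dumps(prediction, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_prediction(prediction: dict, tag: str, key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign_prediction(prediction, key), tag)


# Demo key only; a real key would come from a secrets manager.
key = b"demo-key-do-not-use-in-production"

prediction = {"label": "benign", "confidence": 0.97}
tag = sign_prediction(prediction, key)

# An attacker in the middle rewrites the label but cannot forge the tag.
tampered = dict(prediction, label="malignant")

genuine_ok = verify_prediction(prediction, tag, key)  # True
tampered_ok = verify_prediction(tampered, tag, key)   # False
```

The consumer of the output rejects any result whose tag does not verify, so edits made in transit become detectable.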

ML10:2023 Neural Net Reprogramming

Finally, in a neural net reprogramming attack, an attacker manipulates the model’s parameters to change its intended behavior. This could be accomplished by changing images in a training set, for example, or by modifying parameters directly. Neural net reprogramming attacks can cause a model to make incorrect judgments, which could be exploited for economic gain by bad actors.

Mitigating the Top 10 ML Risks

All of the above tactics are similar in that they might cause a model to make incorrect determinations or act insecurely. So, how can you reduce risk when developing and deploying machine learning models? Here are some high-level prevention tips as recommended by OWASP:

Train the model on adversarial variations and include defensive mechanisms.
Ensure training data is carefully validated and secured.
Limit access to model predictions and encrypt the model’s code.
Implement strict access control to protect the machine learning model.
Verify package signature and use trusted repositories for third-party software.
Regularly monitor and update your datasets.
Verify the authenticity of feedback data.
Secure model interfaces with secure communication and encryption.

It’s also worth noting that, as with other areas of cybersecurity, many risks aren’t technical in nature but arise out of social engineering tactics, like bribery or threats. As such, insider threats should be seriously considered and a zero-trust model should be adopted for all development environments, including machine learning projects.

Safeguarding Machine Learning Projects

As we continue to invest more in AI, safeguarding machine learning projects will become an increasingly high priority. Above, we’ve summarized the key concerns to remember when developing ML models. Cybersecurity professionals should consider reviewing each OWASP risk, along with its detailed mitigation advice, before proceeding with development and running ML in production.