3 Tips for Training Machine Learning for Security Work

If you’ve been evaluating new security tools, you’ve undoubtedly heard machine learning (ML) touted many times. It is fast becoming the backbone of all modern software, security systems included. Thus, it appears that resistance is futile, as some version of Skynet is likely inevitable—although which version ultimately manifests depends entirely on how it was trained. But that’s the future and we are in the Land of the Here and Now.

In the interest of protecting our charges, be they companies or individuals or mankind at large, let's focus on dutifully and correctly training the machine. On that note, here are three tips for training ML responsibly now.

Garbage in, monster out. The old adage “garbage in, garbage out” in computer programming still applies, but it’s magnified in machine learning. The quality of the data used in training is so vital that “garbage in, monster out” is a likely outcome. Consider, for example, that exposure to Twitter taught Microsoft’s AI to be racist. It could have been worse; Twitter could have taught it to be a terrorist. In either case, it’s hardly the makings of a great security AI system.

Pay a lot of attention to the quality of data used in ML training. Whether your company or a third party does the actual training, check and recheck data quality. If you accidentally teach it the wrong thing, you'll end up with a monster you may or may not be able to control.
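To make "check and recheck" concrete, here is a minimal sketch of the kind of automated sanity checks a team might run before any training job. It assumes the data loads into a pandas DataFrame, and the column name "label" is a hypothetical stand-in for whatever your own schema uses.

```python
# A minimal sketch of pre-training data-quality checks; the "label" column
# name is hypothetical and stands in for your own schema.
import pandas as pd

def audit_training_data(path: str) -> pd.DataFrame:
    df = pd.read_csv(path)

    # Reject rows with missing labels -- unlabeled "garbage" is what
    # quietly teaches the model the wrong thing.
    missing = df["label"].isna().sum()
    if missing:
        print(f"Dropping {missing} rows with missing labels")
        df = df.dropna(subset=["label"])

    # Exact duplicates inflate the apparent weight of whatever they contain.
    dupes = df.duplicated().sum()
    if dupes:
        print(f"Dropping {dupes} duplicate rows")
        df = df.drop_duplicates()

    # A badly skewed class balance is an early warning sign worth a human look.
    print("Class balance:\n", df["label"].value_counts(normalize=True))
    return df
```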

Teach only what is known, but teach it continuously. Make sure the training data contains lots of examples of known attacks. Why? Because the machine will only learn what it is taught, and it will also only work within the parameters of what it learned. Make sure everything it needs to learn is in that initial training data set.

“It’s important to understand that machine learning is not a solution by itself. Machine learning is only as good as the data you put into it. If your current security solutions don’t capture bad activity, regardless of whether they can detect bad behavior or not, machine learning also won’t detect it,” said Terry Ray, CTO at Imperva.

That also means that machine learning is terrible at predicting new attacks. Predictive analytics can predict known types of attacks based on early activity that humans may miss, but it cannot predict the rise of unknown attacks. This means that training the machine will always remain an ongoing exercise to keep it up to date with known attacks.
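As a rough sketch of what "teach only what is known, continuously" can look like in practice, the example below trains a scikit-learn classifier on labeled examples of known attacks and refits it whenever analysts label a new batch. The feature and label arrays are hypothetical placeholders for whatever your telemetry actually produces; this is an illustration of the workflow, not a production detector.

```python
# A minimal sketch: a supervised model only flags patterns resembling the
# labeled attacks it was trained on, so it must be refit as new attacks are
# labeled. Inputs here are hypothetical placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_detector(features: np.ndarray, labels: np.ndarray) -> RandomForestClassifier:
    """Fit a classifier on examples of known attacks vs. benign activity."""
    model = RandomForestClassifier(n_estimators=200, random_state=42)
    model.fit(features, labels)
    return model

def retrain(old_X, old_y, new_X, new_y) -> RandomForestClassifier:
    """Fold newly labeled attacks into the data set and refit from scratch,
    because the model cannot infer attack types it never saw."""
    X = np.vstack([old_X, new_X])
    y = np.concatenate([old_y, new_y])
    return train_detector(X, y)
```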

Testing ML drives the teacher mad. Understand that no matter how much you or your team knows about app development and testing, precious little of that applies to machine learning. Testing machine learning code is maddening because you can’t assume two runs will produce the same output, ever.

You read that right. It’s damn near impossible to get the same output from two identical runs. That makes testing maddening. It’s called the machine learning reproducibility crisis.
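One practical, if only partial, mitigation is to pin every source of randomness before training. The sketch below is a minimal illustration under that assumption: it reduces run-to-run variation but does not eliminate it, since GPU kernels, parallel data loading and framework internals can still introduce nondeterminism.

```python
# A minimal sketch of reducing (not eliminating) run-to-run variation by
# pinning random seeds before training.
import os
import random
import numpy as np

def set_seeds(seed: int = 1234) -> None:
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    # If TensorFlow or PyTorch is in use, their seeds need pinning too, e.g.
    #   tf.random.set_seed(seed)  or  torch.manual_seed(seed)
```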

“It’s hard to explain to people who haven’t worked with machine learning, but we’re still back in the dark ages when it comes to tracking changes and rebuilding models from scratch. It’s so bad it sometimes feels like stepping back in time to when we coded without source control,” Pete Warden wrote on his blog. Warden, an Apple alumnus, was CTO of Jetpac, which Google bought in 2014; he now works on deep learning as part of the TensorFlow team at Google.

Things get dicier when anyone makes changes to the machine learning code or the training data.

“If they can’t get the same accuracy that the original authors did, how can they tell if their new approach is an improvement? It’s also clearly concerning to rely on models in production systems if you don’t have a way of rebuilding them to cope with changed requirements or platforms. At that point your model moves from being a high-interest credit card of technical debt to something more like what a loan-shark offers,” Warden wrote.

“It’s also stifling for research experimentation; since making changes to code or training data can be hard to roll back it’s a lot more risky to try different variations, just like coding without source control raises the cost of experimenting with changes,” he noted.

Be hyperaware of this problem going in so you can plan testing and future changes accordingly.

“I know of other teams who are serious about using models in production who put similar amounts of time and effort into ensuring their training can be reproduced, but the problem is that it’s still a very manual process. There’s no equivalent to source control or even agreed best-practices about how to archive a training process so that it can be successfully re-run in the future,” Warden warned.
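There is no agreed best practice yet, but one small step a team can take is to archive the inputs to each training run. The sketch below (which assumes the training code lives in a git repository) writes a manifest recording the dataset hash, code commit, hyperparameters and seed. It does not solve the problem Warden describes, but it captures what you would need in order to attempt a rebuild later.

```python
# A minimal sketch of archiving a training run: record the dataset hash,
# code version, hyperparameters and seed so a rebuild can be attempted.
# Assumes the code is in a git repository.
import hashlib
import json
import subprocess
from datetime import datetime, timezone

def write_manifest(data_path: str, hyperparams: dict, seed: int,
                   out_path: str = "training_manifest.json") -> dict:
    with open(data_path, "rb") as f:
        data_hash = hashlib.sha256(f.read()).hexdigest()

    manifest = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "data_sha256": data_hash,
        "git_commit": subprocess.check_output(
            ["git", "rev-parse", "HEAD"]).decode().strip(),
        "hyperparameters": hyperparams,
        "seed": seed,
    }
    with open(out_path, "w") as f:
        json.dump(manifest, f, indent=2)
    return manifest
```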

Some vendors are beginning to tackle this prickly problem. Accenture, for example, recently debuted its “Teach and Test” methodology, in which “AI system outputs are compared to key performance indicators, and assessed for whether the system can explain how a decision or outcome was determined.” Only time and testing will tell how well this methodology works.
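The details of Accenture's methodology aren't spelled out here, so the sketch below is only a generic illustration of the underlying idea: evaluate the model on held-out data and refuse to promote it unless agreed KPI thresholds are met. The threshold values and function names are hypothetical.

```python
# Not Accenture's methodology -- a generic illustration of gating a model
# against agreed KPIs before it goes into production.
from sklearn.metrics import precision_score, recall_score

def passes_kpis(model, X_test, y_test,
                min_precision: float = 0.95, min_recall: float = 0.80) -> bool:
    preds = model.predict(X_test)
    precision = precision_score(y_test, preds)
    recall = recall_score(y_test, preds)
    print(f"precision={precision:.3f} recall={recall:.3f}")
    return precision >= min_precision and recall >= min_recall
```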

The point is that testing is a difficult problem and you need to plan ahead for how you’re going to handle it.

Pam Baker
