What Security Pros Need to Know About Machine Learning in the Year Ahead

If you are to believe the marketing hype, artificial intelligence (AI) is a superhuman security entity that far surpasses your puny, mortal skills. But that’s just swill for the bean counters who likely wish they could dump payroll as a line item from their budget spreadsheet. It’s certainly not the stuff from which security legends are built.


Besides, AI isn’t really a thing yet. But that’s not to say that its sidekick, machine learning, won’t help you kick some serious black-hat butt next year.


Winning or losing with machine learning comes down to one thing—the same thing that has always won or lost in security: the effectiveness of your defensive strategy. Machine learning is just another tool. How it’s used is entirely up to you.


Every defensive strategy relies on a series of tactics aimed at defending against the expected assaults and designed to leverage your weapons quickly in the face of the unexpected. To that end, here are machine learning tactics you should consider for next year.


Make machine learning over in your own image. The algorithm has to be trained. It learns whatever you expose it to. To make machine learning the best Robin to your Batman, teach it what you know. Gather the data that will expose the machine learning software to both common threats and known vulnerabilities. For it to have enough to learn from, you’ll need to feed it a lot of data—more data than you might have on hand, actually. At least as much as you have in your head from years of experience. But since we’ve yet to come up with a way for you to plug your wetware directly into the software for an information/experience transfer, you might want to take a good hard look at the next two tactics on this list.


Skip building from scratch; borrow from others. Here’s the secret: You can train the software to do as you do without having to do all the training yourself. Just as we send children to school to learn all sorts of general stuff and teach them specific skills and behaviors at home, so, too, can software be trained by multiple sources.


If you have sufficient amounts of data, high-performance computing (HPC) access and the algorithm/modeling skills to go it alone, then by all means, charge forth. Suggested tools for you to consider include Microsoft’s Batch AI and Cray’s supercomputing-as-a-service. There are more from-scratch tools on the market and more still that are coming soon. These particular suggestions are only to point you in the right direction.


But don’t forget to tap other resources for help with training so you don’t have to start from nothing and at maximum expense. Security product vendors, for example, are likely to have data you can use (usually for a price, of course) to train ML on common and known threat vectors. Or they may already offer, or will soon offer, ML products that are trained to some degree. Ask them about it.


Create an industrywide ML team. Training is a serious undertaking. One must be incredibly careful with the lessons taught and the data the ML is exposed to, so that it doesn’t learn bad habits or get manipulated into developing a vulnerability for hackers to exploit. It makes sense, then, for the entire security community to collaborate on ML basic training, i.e., providing data and developing models together. Once your ML passes basic training, your team can train it on proprietary information and techniques. That can be done by exposing it to additional data and models you’ve prepared for that purpose, or it can learn on the job as it performs its daily tasks.
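
For readers who want to see the "basic training first, house rules second" idea in miniature, here is a rough Python sketch. It is a toy, not a product: the class name, the log lines and the labels are all made up for illustration, and the tiny naive Bayes classifier simply stands in for whatever model your team or vendor actually uses. The point is only that the same model can learn from a shared data set and then keep learning from your own.

```python
import math
from collections import Counter, defaultdict


class IncrementalNaiveBayes:
    """Toy multinomial naive Bayes that can keep learning from new batches."""

    def __init__(self):
        self.label_counts = Counter()             # events seen per label
        self.token_counts = defaultdict(Counter)  # token counts per label
        self.vocabulary = set()                   # every token seen so far

    def learn(self, events):
        """Update the counts from an iterable of (text, label) pairs."""
        for text, label in events:
            tokens = text.lower().split()
            self.label_counts[label] += 1
            self.token_counts[label].update(tokens)
            self.vocabulary.update(tokens)

    def predict(self, text):
        """Return the most likely label, using add-one smoothing."""
        tokens = text.lower().split()
        total_events = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label, count in self.label_counts.items():
            label_total = sum(self.token_counts[label].values())
            score = math.log(count / total_events)
            for token in tokens:
                seen = self.token_counts[label][token]
                score += math.log((seen + 1) / (label_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical "basic training" data, the kind a community might share.
shared_events = [
    ("failed login from unknown ip", "threat"),
    ("multiple failed login attempts", "threat"),
    ("port scan detected from external host", "threat"),
    ("user login successful", "benign"),
    ("scheduled backup completed", "benign"),
    ("software update installed successfully", "benign"),
]

# Hypothetical in-house data that only your team would know to label.
proprietary_events = [
    ("payroll export to personal email", "threat"),
    ("payroll export to finance share", "benign"),
    ("badge reader firmware pushed by vendor tool", "benign"),
]

sample = "payroll export to personal email"

model = IncrementalNaiveBayes()
model.learn(shared_events)        # stage 1: shared basic training
before = model.predict(sample)

model.learn(proprietary_events)   # stage 2: your proprietary lessons
after = model.predict(sample)

print("after basic training:      ", before)
print("after proprietary training:", after)
```

The model has no opinion worth trusting about payroll exports until someone who knows your environment teaches it one. Calling learn() again as analysts label new events is a crude version of learning on the job, and it is also exactly where poisoned or careless labels would sneak in, which is why the data going in deserves scrutiny.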