The global surge in ransomware and zero-day malware has ushered in an explosion of innovation in the area of machine learning as a cybersecurity defense strategy. Often used interchangeably with “analytics” and “artificial intelligence (AI),” the term “machine learning” has also engendered a great deal of confusion.
In essence, machine learning and analytics transform data into insights that enable better decision-making. In the cybersecurity realm, these math-based processes collect and interpret security event data in various formats from multiple sources with the aim of identifying specific threat characteristics. Ultimately, machine learning can be summed up in two words: statistical comparison. Computers churn through vast amounts of data using algorithms that compare features. The results of this statistical comparison help security analysts decide whether something is malicious or benign, determine whether it has the potential to cause harm, and, if so, take appropriate action.
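To make "statistical comparison" a little more concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular product works: the feature names, the sample values and the nearest-centroid rule are all assumptions made up for this example. Each file is reduced to a few numeric features, and a new sample is labeled according to whichever group of known samples it sits closer to on average.

```python
import math

# Hypothetical feature vectors, each scaled to the range 0..1:
# (byte_entropy, is_packed, suspicious_api_ratio). All values are invented.
KNOWN_BENIGN = [(0.50, 0.0, 0.10), (0.55, 0.0, 0.05), (0.45, 0.0, 0.15)]
KNOWN_MALICIOUS = [(0.95, 1.0, 0.70), (0.90, 1.0, 0.80), (0.85, 1.0, 0.90)]


def centroid(samples):
    """Average each feature across a group of samples."""
    count = len(samples)
    return tuple(sum(column) / count for column in zip(*samples))


def classify(sample):
    """Label a sample by the closer group average (Euclidean distance)."""
    distance_to_benign = math.dist(sample, centroid(KNOWN_BENIGN))
    distance_to_malicious = math.dist(sample, centroid(KNOWN_MALICIOUS))
    if distance_to_malicious < distance_to_benign:
        return "malicious"
    return "benign"


if __name__ == "__main__":
    print(classify((0.90, 1.0, 0.80)))
    print(classify((0.40, 0.0, 0.10)))
```

Real systems compare far more features and use far more sophisticated models, but the underlying idea is the same: measure how closely something new resembles what has been seen before.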
To clarify exactly how machine learning can facilitate cybersecurity decision-making, let’s compare autonomous, or self-driving, cars to malware. Imagine a world where these vehicles are commonplace and could be misappropriated by criminals for nefarious deeds, such as kidnapping people, dropping off bombs or carrying out thefts. For example, what if an Uber request were intercepted by a bad actor who sent out a programmable, driverless car to pick up a passenger and kept the passenger locked in the vehicle? Let’s add that the passenger’s mobile phone couldn’t communicate with the outside world, so they were really trapped. How would you track these cars down and keep them from doing harm?
In our fictitious sphere, we’ll say that all robot cars are manufactured at the same plant, just as most malware uses the same or similar toolkits. One of the ways to identify and catch these cars would be to look at static characteristics—things that criminals are not likely to alter every time they initiate an attack. These attributes can be determined when the car is parked—things like the year, make, model, configuration, where and when it was built, and GPS settings.
Looking at static characteristics won’t allow us to catch all malicious cars, however. We can gain additional insights from how the programmed car behaves when it’s in operation. These behaviors may include the destination, the route it took, whether it picked up or dropped off passengers or packages, whether it received or sent radio messages, whether it used radar or a police scanner to evade detection and more.
Getting back to how malware is much like our autonomous car, let’s look at behaviors that are at the root of today’s stealthiest attacks and pervade the threat landscape:
- Zero-day, file-based malware can hide using obfuscation, much like criminals can alter license plates or VINs on robot cars.
- Malware can be sandbox-aware, behaving like cars that use radar or police scanners.
- Malware can evade detection by piggybacking on a known clean application, similar to an autonomous car that hides under a harmless, unsuspecting semi truck.
- Malware can bypass traditional security detection tools by misusing a legitimate application, which is like reprogramming a benign car to do bad things.
Machine learning can be used effectively to unmask these malware attacks, and even limit their impact. It can go a long way toward helping to separate the good from the bad. A swift-acting, signature-less machine-learning solution can consistently accomplish this if it has the ability to:
- Look at things that do not change.
- Analyze both static and dynamic behavioral features.
- Reveal evasion techniques before or while the malware is on the targeted device.
- Curb or eliminate damage to endpoints, saving “patient zero,” or the first machine under attack.
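As a rough illustration of why the list above asks for both static and dynamic analysis, here is a small Python sketch. Everything in it is hypothetical: the feature names, the weights and the threshold are invented for this example and are not taken from any real detection engine. It simply adds up evidence from features visible while a file is "parked" and features visible only while it is "driving."

```python
from dataclasses import dataclass


@dataclass
class Sample:
    # Static features: observable without running the file.
    packed: bool
    unsigned: bool
    # Dynamic features: observable only while the file runs.
    checks_for_sandbox: bool
    injects_into_other_process: bool
    encrypts_many_files: bool


# Hypothetical weights and threshold, chosen only for illustration.
STATIC_WEIGHTS = {"packed": 0.3, "unsigned": 0.2}
DYNAMIC_WEIGHTS = {
    "checks_for_sandbox": 0.3,
    "injects_into_other_process": 0.4,
    "encrypts_many_files": 0.6,
}
THRESHOLD = 0.5


def score(sample, weights):
    """Sum the weights of every feature that is present on the sample."""
    return sum(weight for name, weight in weights.items() if getattr(sample, name))


def verdict(sample):
    """Combine static and dynamic evidence into a single decision."""
    total = score(sample, STATIC_WEIGHTS) + score(sample, DYNAMIC_WEIGHTS)
    return "malicious" if total >= THRESHOLD else "benign"


if __name__ == "__main__":
    # Looks clean when parked, but misbehaves once it is running.
    stealthy = Sample(False, False, True, True, True)
    print(score(stealthy, STATIC_WEIGHTS), verdict(stealthy))
```

In this toy setup, the stealthy sample scores zero on static features alone and is flagged only once its runtime behavior is taken into account.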
Machine learning can be of great value in our constant struggle to keep pace with rapidly evolving threats. Our autonomous car analogy clearly illustrates that there’s a difference between “things you can figure out when the car isn’t running” and “things you can figure out when the car is running.” When it comes to catching malware—or our hypothetical, malicious robot cars—you really need both static detection and dynamic detection. One without the other can only steer you down the wrong path, and, in today’s accelerated threat landscape, no one can afford to take unnecessary detours or end up in a blind alley.