Is NTA Just Another Kind of IDS?

Early last year, Anton Chuvakin of Gartner posted a question I’ve spent the past few years focused on. Actually, I’ve focused on it since working in the Network Security Wizards office on the Dragon IDS back in 2000, back when we still called it Y2K.

In the post, Anton posits the question, “But can somebody please explain to me why NBA then and NTA now is not just another kind of network intrusion detection system?”

Earlier in the post, NTA is defined as (my emphasis added): “This Gartner term (NTA for ‘Network Traffic Analysis’) is essentially our view of the evolution of NBA (network behavior analysis) or NBAD of the olde times. IMHO, NTA was born to separate the old, mostly flow-based (layer-3) technology from the modern layer-7 based tech …”

The answer is simple, but to understand the difference between 1990s intrusion detection systems (IDS) and current NTA solutions, it’s important to understand how the perimeter of the enterprise has changed, how network traffic has changed, how endpoints have changed and how attacker MOs have changed.

A Historical Perspective

Back in the ’90s, the perimeter of the enterprise was a clear demarcation between “inside/accountable” resources and everything else—or the “outside.” Now, everything-as-a-service has scattered the traditional guts of enterprises across the internet, while embedded systems and bring-your-own devices (debris) have flooded the enterprise with a deluge of platforms that most security analysts know almost nothing about protecting (yet they’re still charged with detecting abuses of those alien systems and responding accordingly).

Because of these changes, many people in the industry erroneously refer to the perimeter as having disappeared. In fact, the opposite is true. It has not disappeared, but rather has become more abstract. If you look at how we (as an industry) have managed the “perimeter” over the past 25 years or so, focus has shifted from the edge of the network, to server farms, to endpoints and, most recently, to identity. The perimeter hasn’t disappeared—we just have more of them now.

So, what does this mean for the traditional physical observation points (where enterprises usually have taps or spans) that have existed at the edge of geographical and physical network boundaries for the past 25 years?

The New Network Edge

In fact, those tapped/spanned observation points today are more valuable and needed than ever before. As offices fill with new types of unmanaged devices and more enterprise data flows to the internet, the edge of the network is filling a role that it never has before. The edge is becoming, in some cases, the only point available to analyze the intersection between enterprise business processes and the internet. Machine learning (ML) allows us to identify devices that engage in similar behaviors (as workgroups or departments of people do), identify the behaviors that are normal for them but abnormal for the rest of the enterprise, then watch for notable deviations in those behaviors. (I’ll give a real-world example of this below.)
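To make that concrete, here is a minimal sketch of similarity-based peer grouping, assuming each device has already been summarized into a vector of behavioral features extracted from network metadata (bytes per protocol, distinct destinations, connection cadence and so on). The function name, feature choices and thresholds are illustrative assumptions, not a description of any particular product’s analytics.

```python
# A minimal sketch of similarity-based peer-group analysis.
# Assumption: each row of `features` summarizes one device's observed behavior.
import numpy as np

def peer_group_outliers(features: np.ndarray, k: int = 10, z_cutoff: float = 3.0):
    """Flag devices that deviate even from their k most similar peers.

    features: (n_devices, n_features) matrix of behavioral summaries.
    Returns indices of devices whose average distance to their nearest
    peers is an outlier relative to the rest of the population.
    """
    # Pairwise distances between device behavior vectors.
    diffs = features[:, None, :] - features[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)              # ignore self-distance

    # Average distance to the k most similar devices (the "peer group").
    peer_dist = np.sort(dists, axis=1)[:, :k].mean(axis=1)

    # A device is notable when it sits far from even its closest peers.
    z = (peer_dist - peer_dist.mean()) / (peer_dist.std() + 1e-9)
    return np.where(z > z_cutoff)[0]
```

The useful property is the inverse of baselining: a device whose nearest peers are still far away behaves like nothing else on the network, regardless of whether its behavior ever “changed” from an earlier baseline.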

These new technical feats are born of necessity. In the ’90s, there was an enormous variety of protocols at the network’s edge. Protocols that are now extinct were commonplace there just 10 to 15 years ago. However, it’s important to note: Only the protocols have disappeared, not the activity they encapsulated. Nowadays, most of that activity is shuttled across the physical network’s edge by one of a small number of protocols.

Additionally, as everything-as-a-service has turned the enterprise inside out, we now commonly see information at the edge of the network that we didn’t see in the past, such as identity in the form of LDAP and Kerberos or embedded in cookies, protocol handshakes and other unexpected locations.

The changes in protocol makeup and the changes in “edge topology” have dramatically changed what is/isn’t possible for attackers and have forced analysis methodologies to change as well. Signatures used to be my favorite detection methodology since they’re rapid to create, easy to understand and very fast to improve. Unfortunately for all of us, they’re not well-aligned to the type of network behaviors enterprises need to discover and respond to now.

It used to be true that you could identify the highest-risk activities by looking for sequences of bytes in packets, but the shift to tunneled services has changed that. Now, rather than identifying interesting sequences of bytes (signatures), edge-based detection methodologies must rely on sequences of activities over time (time-based analysis) or sequences of activities relative to other devices within the network (similarity-based analysis).
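Here is a toy contrast of the two approaches, with made-up payloads and thresholds: the first check is the old byte-sequence style, while the second only fires after observing a sequence of connection events over time (beacon-like regularity), which is the kind of signal that stays visible even when the payload itself is encrypted.

```python
# Hypothetical contrast: byte-sequence signature vs. time-based analysis.
from statistics import pstdev

SIGNATURE = b"\x90\x90\x90\x90/bin/sh"   # classic byte-pattern style rule

def matches_signature(payload: bytes) -> bool:
    # 1990s-style detection: look for a known byte sequence in a single packet.
    return SIGNATURE in payload

def looks_like_beaconing(timestamps: list[float],
                         max_jitter: float = 2.0,
                         min_events: int = 10) -> bool:
    # Time-based detection: flag near-constant intervals between connections
    # from one device to one destination, observed across many events.
    if len(timestamps) < min_events:
        return False
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return pstdev(intervals) < max_jitter
```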

In the Wild

Consider the following example of similarity-based analysis: We recently deployed the Awake Security NTA platform in a network and quickly surfaced a handful of executives’ VoIP phones with rogue call-recording software installed. This detection could not have been achieved with signatures (the deviant traffic was encrypted), endpoint agents (because there are none for the thousands of VoIP phones in most enterprises) or even baselining ML (as, unfortunately, most other “NTA” solutions are implemented today) because the rogue software was installed on the phones before the network analysis solution was deployed. Awake’s analytics were able to automatically identify the problem in a population of thousands of nearly identical devices because a small number of them persistently deviated from their most similar peers.
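For illustration only, here is a sketch of the “persistent deviation” part of that logic, assuming a similarity analysis (like the earlier sketch) is run once per time window and emits the set of devices that stood apart from their peers; only devices flagged across most windows are surfaced. The device names and the threshold are hypothetical.

```python
# Sketch: surface only devices that deviate from their peer group persistently,
# not just in a single observation window. Inputs/thresholds are assumptions.
from collections import Counter

def persistent_deviants(windows: list[set[str]], min_fraction: float = 0.8):
    """windows: per-time-window sets of device IDs flagged as peer-group
    outliers. Returns devices flagged in at least min_fraction of windows."""
    counts = Counter(dev for flagged in windows for dev in flagged)
    needed = min_fraction * len(windows)
    return sorted(dev for dev, n in counts.items() if n >= needed)

# Example: four hourly windows; only "voip-phone-0231" deviates persistently.
hourly = [{"voip-phone-0231", "printer-07"},
          {"voip-phone-0231"},
          {"voip-phone-0231"},
          {"voip-phone-0231", "camera-12"}]
print(persistent_deviants(hourly))          # ['voip-phone-0231']
```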

This example illustrates many of the new realities I described above, but it also answers Anton’s question, “… why is NTA now not just another kind of network intrusion detection system?”

The answer is that NTA is actually “network intrusion detection.” In fact (and I’ve said this elsewhere), signatures, heuristics, anomaly detection, behavioral analysis, ML/AI and “forensic detection” are just different techniques that must be applied concurrently and in unison to separate modern threat artifacts from the deluge of non-threat artifacts tunneled in the now-smaller number of homogenized protocols. These methodologies are not different markets. Each compensates for the weaknesses of another, while carrying its own weaknesses that must be compensated for in turn.

The Categorization Fallacy

“But wait!” you might cry, “Different categories are required because some network solutions do wildly different functions, from detection to forensics!”

In fact, Anton has a follow-up post asking a seemingly separate question, “But just like EDR is a detection and response technology for the endpoint, why can’t we have the equivalent for the network? Why can’t we have NDR? Specifically, why can’t we have one tool that does signature-based NIDS, machine learning – based traffic analytics together with capture and retention of layer 7 metadata (and files and occasionally full pcap) for incident response support?”

In my opinion, this second question is very much the same question as the first. It used to be the case that one product would do detection while another product was required for investigation. This was a complete failure on the part of the industry to properly support customers. If a product informs the customer of something they need to look into but doesn’t provide the information needed to actually investigate it, that product has failed to support the most basic and obvious workflow its user requires.

This has changed, and nowhere more profoundly than in the EDR space, where single products do both detection and investigation. And there is a very important distinction about technologies with such capabilities. Most products that only do detection only keep the data points the detection engine analyzed. Worse, they’ll often only keep the results of the analysis (the alerts) and nothing else. However, investigation workflows require contextual data, even if the detection engine doesn’t analyze it. This is required because there is always a human somewhere in the workflow who needs to answer the question, “What do we need to know that the system is not telling us?” That means the definition of [E/N]DR should not be based on methodologies—as the question and my diatribe above do—but rather on the data stored and the workflows supported.
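A minimal sketch of what “defined by the data stored and the workflows supported” implies in practice: retain parsed layer-7 metadata for all observed activity, keyed so an analyst can pivot from an alert to everything the same device did around that time, whether or not the detection engine ever scored it. The record fields and class names below are assumptions, not any vendor’s schema.

```python
# Sketch of an investigation-oriented data model: keep context, not just alerts.
from dataclasses import dataclass

@dataclass
class L7Record:
    ts: float            # epoch seconds
    device: str          # device/identity the activity is attributed to
    protocol: str        # e.g., "TLS", "HTTP", "Kerberos"
    metadata: dict       # parsed layer-7 fields (SNI, user agent, principal, ...)

@dataclass
class Alert:
    ts: float
    device: str
    summary: str

class InvestigationStore:
    """Retains all metadata so an alert can be expanded into its context."""
    def __init__(self):
        self.records: list[L7Record] = []

    def add(self, rec: L7Record):
        self.records.append(rec)

    def context_for(self, alert: Alert, window: float = 3600.0):
        # Everything the device did within +/- window seconds of the alert,
        # whether or not the detection engine ever analyzed it.
        return [r for r in self.records
                if r.device == alert.device and abs(r.ts - alert.ts) <= window]
```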

Furthermore, the solutions vendors create must allow people to both investigate and automate around the discoveries made by their tools. Not providing solutions focused on end-to-end workflows (from discovery to decision management) creates workflow hell for enterprises.

Just as we wouldn’t accept a simple thumbs up or down from a doctor after a checkup, we shouldn’t make security analysts rely on a diagnosis made in a black box. And, sticking with that metaphor, diagnostic tools that perform multiple functions may fall into more than one category by definition—but that matters much less than the end patient outcomes. As tools evolve and improve, we need to have more flexibility in how we’re grouping and using them.

Gary Golomb

Gary Golomb is co-founder and Chief Research Officer at Awake Security. Gary has nearly two decades of experience in threat analysis and has led investigations and containment efforts in a number of notable cases. With this experience – and a track record of researching and teaching state-of-the-art detection and response methodologies – Gary is focused on helping Awake improve security craft as the company’s Chief Research Officer. Prior to Awake, Gary was one of the first employees at Cylance. He was also a co-founder of Proventsure, which was acquired by NetWitness and ultimately by RSA. He served in the United States Marines 2nd Force Reconnaissance Company.
