
What a mesh — GNNs in cybersecurity
Why graph-based Deep Learning has not lived up to its promise
In this blog, I dive into a technology that has recently been among the most hyped, and most disappointing, approaches to anomaly detection: Graph Neural Networks, or GNNs. Their role in helping Lacework raise more than $1bn at a multi-billion valuation, only for the company to sell for pennies on the dollar, speaks to the allure of GNNs.
They certainly attracted me. I came to cybersecurity through GNNs. They just make so much sense: after all, networks and systems are easy to understand as a set of nodes (devices) and edges (connections between those devices). Graphs are everywhere, and analyzing networks is one of their most natural use cases. So graph neural networks are where I started, diving in and taking Stanford's CS224W, for example.
It was only after a lot of investigation that I decided to go in a different direction for cybersecurity.
My decision started with a lot of conversations. When I chatted with security operators and CISOs, like our advisor Chris Bates, who helped build SentinelOne, I heard that improved accuracy in detecting issues is only part of what security operators care about. Security operations teams weigh a number of other criteria, including explainability, a model's ability to show value quickly, and, of course, efficiency of operations at scale. We also see an emerging demand for modular security, meaning the model needs to be able to run independently of specific agents or data feeds, on any security data lake.
The model and related software should do a single job very well: spot attacks early in their life cycle and give the SOC and detection engineering teams as much context as possible, so they can determine criticality by examining blast radius and related factors.
GNNs excel at modeling relationships; however, they are less proven at learning the importance of sequence and time. At some level, my decision to guide us away from GNNs was based on metadata, such as papers published on arXiv and citations of those papers. I got hands-on with attempts to adapt GNNs to temporal data and found, in their papers and implementations, that it remains a relatively immature domain, especially compared to approaches that incorporate sequence information directly into their models. Anyone who has lived in security knows that sequence information and time, both absolute and relative (such as seconds between events), are crucially important. GNNs are weak here, which led me to think they would be relatively poor at dealing with the noise of production environments.
Explainability is tricky. Neural networks are not inherently very explainable; they work in part because they learn relationships that humans did not anticipate and that may therefore not be intuitive. These relationships can be difficult to disentangle from the model. They do not follow any simple rule, formula, or function.
GNNs would seem to have a leg up in explainability. After all, they have within them information about edges and nodes. Couldn’t we send this information along to the security operations teams to help them make determinations? Graph data structures, after all, are fundamental to the amazing success of Wiz.
There are at least a few reasons why, in practice, GNNs have a hard time providing useful context to the analyst. The most important for me is that, again, GNNs do not learn sequences well. The representations they build transform the original node and edge features so heavily that the initial relationships between nodes and edges become difficult to recover.
GNNs aggregate information from neighboring nodes to update the representation of each node. This aggregation step, which typically involves summing or averaging features, leads to a loss of granularity, making it difficult to trace back specific patterns or relationships that contributed to a particular decision. We hear again and again that sequences matter in cyber security — and because of the way GNNs aggregate, I couldn’t see a straightforward way to preserve sequence information in GNNs — i.e. to point back to a set of actions that led the GNN to flag something as suspicious.
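The order-erasing effect of aggregation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not any particular GNN library; the event features and their labels are invented for the example:

```python
import numpy as np

# Hypothetical feature vectors for three events involving neighbors
# of a node, listed in the order the events occurred.
events_in_order = np.array([
    [1.0, 0.0],  # e.g. "login"
    [0.0, 1.0],  # e.g. "privilege change"
    [1.0, 1.0],  # e.g. "outbound transfer"
])

# The same events with the order scrambled.
events_shuffled = events_in_order[[2, 0, 1]]

# A typical GNN aggregation step: mean over neighbor features.
agg_ordered = events_in_order.mean(axis=0)
agg_shuffled = events_shuffled.mean(axis=0)

# The aggregated representations are identical: after aggregation,
# "login -> privilege change -> transfer" cannot be distinguished
# from any reordering of those same events.
print(np.allclose(agg_ordered, agg_shuffled))  # True
```

Sum and max aggregation are permutation-invariant in the same way, which is exactly why pointing back to an ordered chain of actions is so hard.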
Another important challenge with GNNs is that they have a hard time demonstrating value quickly, because they do not generalize well across environments. Much like the incorporation of sequence and time information, there are efforts to build GNNs as foundation models, but these efforts are relatively recent.
Generalization refers to a model's ability to apply what it has learned from one dataset or environment to new, unseen scenarios. Foundation models are pre-trained on vast amounts of diverse data and excel at generalization by leveraging transfer learning, the ability to adapt their knowledge across different tasks with minimal fine-tuning. GNNs' weak generalization hampers their effectiveness in cybersecurity, where dynamic and novel threats require models capable of generalizing beyond rigid training data, and where each environment is different.
This lack of generalization shows up as weeks or months spent tuning a model for accuracy in a given environment, which slows time to value and makes the model less likely to keep up with the inevitable changes in an enterprise.
Moreover, GNNs are not inherently good at handling long-range dependencies across time or across other dimensions of the dataset; they would be poor at seeing that strange behavior by a device in Oregon could be related to a network flow in Europe, for example. Similarly, they tend to forget, or fail to learn, relationships that are distant from each other in time.
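One way to see this limited reach is the receptive field: k rounds of message passing let a node see at most k hops away. Here is a minimal sketch using a toy six-node chain and simple neighbor averaging; the topology and the averaging rule are illustrative assumptions, not any specific GNN implementation:

```python
import numpy as np

# A chain of six hosts: 0 - 1 - 2 - 3 - 4 - 5 (illustrative topology).
adj = np.zeros((6, 6))
for i in range(5):
    adj[i, i + 1] = adj[i + 1, i] = 1

def propagate(x, rounds):
    # One GNN-style round: each node averages itself with its neighbors.
    for _ in range(rounds):
        deg = adj.sum(axis=1, keepdims=True) + 1
        x = (x + adj @ x) / deg
    return x

base = np.zeros((6, 1))
perturbed = base.copy()
perturbed[5] = 100.0  # anomalous signal at the far end of the chain

# With 2 rounds, node 0 sees at most 2 hops; node 5's anomaly,
# five hops away, never reaches it.
out_base = propagate(base, 2)
out_pert = propagate(perturbed, 2)
print(np.allclose(out_base[0], out_pert[0]))  # True: node 0 is unchanged
```

Stacking more rounds widens the receptive field, but deep GNNs run into their own problems (over-smoothing), which is part of why distant correlations stay hard for them.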
I certainly don’t regret learning about GNNs. They are among the most promising deep learning technologies of our time. However, cybersecurity is different. Humans are in the loop and must stay in the loop, and the way we configure and deploy our systems varies enormously. So explainability and the ability to adapt to new environments are both essential.
In upcoming posts, I’ll discuss different approaches, and I would welcome your feedback. We can all agree that the status quo in security is untenable; the industry is doing such a poor job that our way of life is under threat, despite ever-growing spending on cybersecurity. Approaches like GNNs, which showed enough promise to help Lacework raise billions, remain unproven for cybersecurity. So what could work? Or are we doomed to stay a step or two behind the attackers, laboring to update our rules and proprietary security stacks while ever more attacks slip past our reactive approaches?
What a mesh — GNNs in cybersecurity was originally published in DeepTempo on Medium, where people are continuing the conversation by highlighting and responding to this story.
*** This is a Security Bloggers Network syndicated blog from Stories by Evan Powell on Medium authored by Evan Powell. Read the original post at: https://medium.com/deeptempo/what-a-mesh-gnns-in-cybersecurity-40b08b279871?source=rss-36584a5b84a------2