How to Break Broken SOC Cycles
We’ve all heard the same buzzwords used to describe the current situation in security operations centers (SOCs). Among them are “alert fatigue,” the “labor crunch,” the “skills gap,” “high turnover” and “missing advanced threats.” Based on my experience working in and with SOCs, I agree with these assessments. Here’s a brief overview of why:
I’ve worked for an MSSP that at any given moment had 30,000 tickets in the queue, with the oldest one being weeks old. My colleagues and I have worked overtime “temporarily” to do more with less while we waited months for new hires to trickle in. I’ve interviewed and worked with analysts who did not have the experience and skills needed to do the job effectively, but were brought on due to desperation. I’ve been on awkward calls with upper management explaining why our toolset didn’t catch the latest threat until after it had already done some damage. I’ve experienced these problems firsthand and know all too well what the consequences are.
I’m about to reach the two-year mark at my current position, which will be the longest I’ve ever remained in a cybersecurity job. This role has provided me with context when looking back on the issues I’ve previously faced working in SOCs: All of these problems, while separate, are actually all part of the same self-perpetuating cycle.
If we keep this cycle up as an industry, we’ll only make it harder on ourselves in the future. The people who create the detection methods used by SOCs typically come from a background where they themselves used to process alerts. Alert processors, in turn, typically become too overwhelmed and frustrated to really learn anything, so when they move on to become the next generation of threat hunters and detection creators, the cycle is likely to repeat itself ad infinitum.
This situation creates another, tangential cycle which, again, just makes everything worse for everyone.
It’s Not About Quantity
Some of my followers (hi, Mom!) may remember comments I’ve made in the past about how analysts are typically measured by the number of alerts they process rather than the number and quality of threats they uncover. One consequence of that yardstick is that security professionals often miss, or fail to dig deep into, seemingly mundane or unwanted behaviors that sit lower on the maliciousness scale. That is precisely what many modern attackers rely on to fly under the radar.
There has been some work toward alleviating the steps in the above cycles; some by vendors, and some by SOC analysts out of necessity. These improvements usually come in the form of automation or event correlation, either taking the steps that an analyst would have previously done manually or tying together information so the analyst doesn’t have to.
In my experience, the problem with most of these solutions is that the automation and correlation have to account for many pieces of information from many different appliances, all of which have different formats and convey different meanings. To make matters worse, there’s the question of who creates these automation and correlation features. Are they actual security professionals, or are they people who interview us a handful of times, create what they think we want and then disappear behind a paywall, never to be budgeted again? Are we, the security professionals, able to create our own features using the methods allowed by the tools we use? If so, does that mean we’ve gone from threat hunter to full-stack developer in order to tie everything together? Or perhaps we just have people who are SMEs on how all the tools work and what information they convey, so we sit in long, unproductive meetings postulating how the tool I understand is going to talk to the tool you understand.
Ultimately, an analyst who’s fed up waiting will roll their own solution using a scripting language such as Python. While that’s certainly effective in many scenarios, how long do they have to spend debugging their Frankenstein’s monster, and *gasp* what happens when they inevitably leave?
You’re probably thinking that I’m just another cynical security professional who likes to point out everything that’s wrong with the industry. But there is something that can help break the cycle of issues present in so many SOCs: functional programming (I bet that’s not what you expected to hear). Functional programming forces you to break your problem apart into smaller, reusable pieces that are easier to read, debug and maintain. This is extremely important in a field where we simply do not have enough people to keep up with the demand.
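To make that concrete, here is a minimal sketch in Python (the language analysts already reach for) of what that decomposition looks like. The log format, field names and functions are all hypothetical, chosen only to show how a triage task splits into small, pure, reusable pieces:

```python
from collections import Counter

def parse_line(line: str) -> dict:
    """Split a hypothetical 'user,action,source_ip' log line into fields."""
    user, action, source_ip = line.strip().split(",")
    return {"user": user, "action": action, "source_ip": source_ip}

def failed_logins(events: list[dict]) -> list[dict]:
    """Keep only failed-login events."""
    return [e for e in events if e["action"] == "login_failed"]

def count_by_user(events: list[dict]) -> Counter:
    """Count events per user -- useful for spotting brute-force attempts."""
    return Counter(e["user"] for e in events)

raw = [
    "alice,login_failed,10.0.0.5",
    "bob,login_ok,10.0.0.6",
    "alice,login_failed,10.0.0.5",
]
counts = count_by_user(failed_logins([parse_line(l) for l in raw]))
print(counts["alice"])  # 2
```

Each piece can be tested, debugged and reused on its own, which is exactly what a monolithic script (or a departed analyst’s Frankenstein’s monster) doesn’t give you.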
When used for detecting cyberattacks and threats, for example, a functional language dictates that a detection will not even be accepted as valid unless the data you’re passing from function to function is of the correct type. The result? Better, higher-fidelity and more easily understood detections, which let analysts focus on the general ideas of threats without having to know the ins and outs of so many of them (something that can be very daunting for a junior analyst).
Security platforms must let their users, the analysts, build their own detection logic from functional building blocks rather than complex rules. By doing so, junior analysts will have fewer false positives mucking up their ticket queue, which gives them more time to actually learn on the job instead of running on the same treadmill with no change in scenery. They will become more skilled cybersecurity professionals and go on to become knowledgeable managers, threat hunters and detection creators who will continue to improve our current methods well beyond what they are now… and the cycle will continue, but in a good way.
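What might those “functional building blocks” look like in practice? One hedged sketch, again in Python: tiny rule combinators a junior analyst can snap together into a readable detection, without learning a complex rule language. The combinator names, event fields and the encoded-PowerShell example are all hypothetical:

```python
from typing import Callable

Event = dict
Rule = Callable[[Event], bool]

def field_equals(field: str, value: str) -> Rule:
    """Building block: does the event's field equal a value?"""
    return lambda e: e.get(field) == value

def both(a: Rule, b: Rule) -> Rule:
    """Combinator: both rules must match."""
    return lambda e: a(e) and b(e)

def either(a: Rule, b: Rule) -> Rule:
    """Combinator: at least one rule must match."""
    return lambda e: a(e) or b(e)

# A detection assembled from reusable pieces, readable at a glance.
encoded_powershell = both(
    field_equals("process", "powershell.exe"),
    either(field_equals("flag", "-enc"),
           field_equals("flag", "-EncodedCommand")),
)

event = {"process": "powershell.exe", "flag": "-enc"}
print(encoded_powershell(event))  # True
```

Because each block is a plain function, the same pieces compose into the next detection too; that reuse is the “good” cycle the paragraph above describes.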