
Apply Integrated Reasoning to Incident Management

by Tim Wenzlau on August 4, 2020

In the final part of this two-part series, we introduce the concept of reasoning and the role of influences, or data sources, in how integrated reasoning is applied to cybersecurity. Missed part one? Check out An Introduction to Integrated Reasoning, available here [link].

What is happening?

Characterizing what is happening in an attack requires strong foundational cybersecurity domain expertise as well as insight into the internal assets being targeted. Frameworks like MITRE ATT&CK are helpful, but not all security products have aligned their reporting to them, so analysts still have to interpret what each product reports. The observed tactics, techniques, and procedures (TTPs) can provide insight into the attacker’s sophistication: does this appear to be commodity malware or an adversary with hands on the keyboard? Is the attack relevant? A false positive can be identified and discarded if an alert is not relevant to the target, e.g. a Linux-based exploit against a Windows machine. Was the attack successful? In many situations, another security control might have prevented access: a port was closed, or a file or a connection was blocked. Answering each of these questions requires interpreting other ‘integrated’ data sources.
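The relevance question can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product does it: the `Alert` class, the `ASSET_INVENTORY` mapping, and the platform names are all hypothetical, and a real inventory would come from a CMDB or vulnerability scanner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    signature: str
    target_ip: str
    affected_platforms: frozenset  # platforms the exploit can actually work on


# Hypothetical asset inventory: IP address -> operating system family.
ASSET_INVENTORY = {
    "10.0.0.5": "windows",
    "10.0.0.9": "linux",
}


def is_relevant(alert: Alert, inventory: dict) -> bool:
    """Return False only when the target's platform is known and unaffected."""
    platform = inventory.get(alert.target_ip)
    if platform is None:
        # Unknown asset: keep the alert, since we cannot rule it out.
        return True
    return platform in alert.affected_platforms


# A Linux-based exploit aimed at a Windows machine is not relevant.
linux_exploit = Alert("linux-kernel-privesc", "10.0.0.5", frozenset({"linux"}))
print(is_relevant(linux_exploit, ASSET_INVENTORY))
```

Note the choice to treat unknown assets as relevant: discarding an alert is only safe when the inventory positively says the target cannot be affected.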

Pattern analysis is an invaluable tool to understand if an observed activity is normal and authorized versus anomalous and potentially malicious. How widespread or common is the observed activity? Does it appear to be administrative work? Does the user or system have a history of performing the observed actions or a history of repeated infections? Have these systems communicated in the past?
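Two of these questions, how common an activity is and whether a given system has done it before, reduce to simple lookups over historical records. The sketch below assumes a hypothetical `history` list of `(host, activity)` pairs; the host names and activity labels are made up for illustration.

```python
def prevalence(activity, history):
    """Fraction of known hosts that have performed the activity at least once."""
    hosts = {host for host, _ in history}
    hosts_with_activity = {host for host, act in history if act == activity}
    return len(hosts_with_activity) / len(hosts) if hosts else 0.0


def has_history(host, activity, history):
    """True if this host has been seen performing this activity before."""
    return (host, activity) in set(history)


# Hypothetical historical observations: (host, activity).
history = [
    ("ws-01", "psexec"),
    ("ws-02", "rdp"),
    ("adm-01", "psexec"),
    ("adm-01", "rdp"),
    ("ws-03", "rdp"),
]

print(prevalence("psexec", history))
print(has_history("adm-01", "psexec", history))
print(has_history("ws-02", "psexec", history))
```

An activity that is rare across the environment and new for the host in question deserves more suspicion than one an administrator's workstation performs every day.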

Evaluating time series data is equally important, as attacks are rarely a single event. To triage and gain full situational awareness, analysts must identify what happened before and after. As an example of why thoroughness matters, an endpoint protection agent will generate an alert when malware is detected and removed from a system. But how did the malware get there? Did the endpoint protection agent clean everything? Likely not, given that endpoint protection platforms only identify 57% of potential malware infections, a trend partially influenced by the increase in fileless malware.

An analyst must look for additional signs of compromise to confirm that the infection was remediated. For example, signs of an ongoing infection include suspicious web beaconing behavior in web filtering logs, suspicious process activity in endpoint detection and response solutions, or lateral exploits against other internal systems seen by east-west facing network IDS/IPS devices, all of which the Respond Analyst evaluates through integrated reasoning.
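Looking before and after an alert can be sketched as a time-window query over events from several sources. This is a simplified illustration under stated assumptions: the event dictionaries, the source names, and the 24-hour window are hypothetical, and real telemetry would first need to be normalized to a common host identity and clock.

```python
from datetime import datetime, timedelta


def events_around(alert_time, events, window=timedelta(hours=24)):
    """Split one host's events into those before and after the alert, within the window."""
    before = [e for e in events if alert_time - window <= e["time"] < alert_time]
    after = [e for e in events if alert_time < e["time"] <= alert_time + window]
    return before, after


# Hypothetical: endpoint protection reports malware removed at noon.
alert_time = datetime(2020, 8, 4, 12, 0)
events = [
    {"time": datetime(2020, 8, 4, 11, 30), "source": "web_filter", "detail": "download from uncategorized site"},
    {"time": datetime(2020, 8, 4, 12, 45), "source": "web_filter", "detail": "periodic beaconing"},
    {"time": datetime(2020, 8, 4, 14, 10), "source": "network_ids", "detail": "lateral exploit attempt"},
    {"time": datetime(2020, 8, 1, 9, 0), "source": "edr", "detail": "unrelated process alert"},
]

before, after = events_around(alert_time, events)
# Events before the alert hint at how the malware arrived;
# events after it suggest the cleanup was incomplete.
possibly_not_remediated = len(after) > 0
print(len(before), len(after), possibly_not_remediated)
```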

Finally, what is the severity of the incident?

If, at the end of their analysis and with uncertainty cast aside, an analyst decides that the observed activity is in fact malicious and actionable, they must assign a severity to initiate incident response procedures. Severities generally align with response-time SLAs, which could range from minutes to days depending upon the organization.

One important factor in assigning an incident severity is scope. How many systems are involved in the attack? Is this a single malware infection or a larger outbreak? How many systems have been exploited? To understand the scope, analysts must constantly be aware of the relationships between threats observed within their environment. Unfortunately, tracking relationships between thousands of entities over time is not a task that humans perform consistently well. Also, analysts work in shifts and are therefore blind to what happened when they were off. For example, Analyst 1 may be overwhelmed and ignore early-stage initial access activity on a system in favor of investigating more severe alerts. Analyst 2 comes on shift, observes lateral movement from that same system, but writes off the alert as a false positive or administrative activity because Analyst 2 was unaware of the earlier signs of exploitation that Analyst 1 ignored.
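One way to keep those relationships regardless of who is on shift is to store observed host-to-host activity as a graph and walk it. The sketch below is an illustration only: the `observations` pairs and host names are hypothetical, and treating every observed connection as an undirected link is a deliberate simplification.

```python
from collections import defaultdict, deque


def incident_scope(start_host, observations):
    """Return every host reachable from start_host through observed host-to-host activity."""
    graph = defaultdict(set)
    for src, dst in observations:
        graph[src].add(dst)
        graph[dst].add(src)

    # Breadth-first search over the linked hosts.
    seen = {start_host}
    queue = deque([start_host])
    while queue:
        host = queue.popleft()
        for neighbor in graph[host]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical observations collected across two shifts: (source host, destination host).
observations = [
    ("ws-17", "srv-db"),      # seen on shift 1
    ("srv-db", "srv-files"),  # seen on shift 2
    ("ws-40", "ws-41"),       # unrelated activity
]

print(sorted(incident_scope("ws-17", observations)))
```

Because the graph persists between shifts, the lateral movement Analyst 2 sees is automatically tied back to the initial access activity Analyst 1 saw.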

Scope is not the only factor in assigning a severity. The Respond Analyst dynamically assigns a severity of one through four to each incident based on the following four factors derived from the events and systems currently scoped into the incident:

Most progressed attack stage
Number of internal systems involved
Highest asset criticality of the involved systems
Likelihood of the activity being malicious and actionable
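To make the idea of combining these four factors concrete, here is a toy scoring function. It is not the Respond Analyst's actual method, which this article does not describe. The stage names, the cap of ten systems, the equal weighting, the thresholds, and the convention that 1 is the most urgent severity are all assumptions made for illustration.

```python
# Hypothetical ordering of attack stages, from least to most progressed.
ATTACK_STAGES = ["reconnaissance", "initial_access", "lateral_movement", "exfiltration"]


def severity(stage, system_count, max_criticality, likelihood):
    """Map four factors to a severity from 1 (most urgent) to 4 (least urgent).

    max_criticality: 1 (low) to 4 (high), the highest among involved systems.
    likelihood: 0.0 to 1.0, chance the activity is malicious and actionable.
    """
    stage_score = (ATTACK_STAGES.index(stage) + 1) / len(ATTACK_STAGES)
    scope_score = min(system_count, 10) / 10  # assumed cap: 10+ systems is maximal scope
    criticality_score = max_criticality / 4
    impact = (stage_score + scope_score + criticality_score) / 3
    risk = impact * likelihood

    # Assumed thresholds; a real deployment would tune these.
    if risk >= 0.6:
        return 1
    if risk >= 0.4:
        return 2
    if risk >= 0.2:
        return 3
    return 4


print(severity("exfiltration", 10, 4, 0.9))
print(severity("lateral_movement", 3, 3, 0.8))
print(severity("reconnaissance", 1, 1, 0.3))
```

Because the inputs come from whatever is currently scoped into the incident, recomputing the function as new systems or later attack stages are added is what makes the severity dynamic.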

Although I outlined the questions a security analyst asks to make a decision in series, both human analysts and the Respond Analyst ask these questions in parallel. Decision making is not a deterministic or static process: our uncertainty changes as more influences on the decision are revealed over time.
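One standard way to model uncertainty that shifts as evidence arrives is Bayes' rule in odds form, sketched below. This is a textbook technique chosen for illustration, not a description of the Respond Analyst's internals, and the prior and the likelihood ratios attached to each influence are invented numbers.

```python
def update(prior, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Hypothetical starting belief that the system is compromised.
belief = 0.05

# Hypothetical influences: a ratio above 1 raises the belief, below 1 lowers it.
influences = [
    ("endpoint malware alert", 8.0),
    ("web beaconing in proxy logs", 5.0),
    ("host is patched against the exploit", 0.2),
]

for name, likelihood_ratio in influences:
    belief = update(belief, likelihood_ratio)
    print(f"after '{name}': {belief:.2f}")
```

The same evidence in a different order ends at the same belief, but the belief at any moment depends on what has been revealed so far, which is why a decision made early in an investigation can reasonably change later.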

I talked above about how a security analyst generally makes decisions, but I did not talk much about the different types of analysis and decision making analysts employ. The triage and decision-making procedures vary based on use cases, attack tactics, and telemetries involved.

Therefore, because of the complexity of the decision and the specific effects that certain influences can have on the uncertainty that a system or account is compromised, we built the Respond Analyst as an expert system. It comes out of the box seeded with the questions an analyst needs to ask to make a specific decision, as well as the integrations and processing capability to answer those questions and reduce uncertainty.

Summary

Human beings and machines alike can use reasoning to make decisions. However, when comparing the capabilities of humans who can only look at a handful of factors with machines that process thousands of them, it’s easy to see which option will produce a better decision-making outcome in most situations. Additionally, humans rely on experience and personal bias that will impact their decisions, while machines do not have these limitations and can use advanced probability instead.

Today, many organizations recognize this disparity and are trying to solve the problem with traditional solutions like SIEM and SOAR systems that rely on playbooks that make binary, true or false decisions. This can be problematic when considering the complexity and tremendous volume of data that environments produce. SIEM and SOAR technologies are simply not designed to evaluate this volume, and they can take months or years to implement. And as companies expand, the complexities of keeping networks and endpoints secure grow as well. The only way to keep up with this growth is to incorporate software and machines that can scale appropriately.

To address this, Respond Software offers the Respond Analyst, a software-based solution that reasons through complex decisions to infer which systems are affected in an attack, determine the criticality of those systems, and prioritize incidents for remediation. Get a peek at how it works. Join our live demo on August 20 and talk to our security operations experts. Register today! In the meantime, we put together a variety of resources that showcase how we can apply decision-making to your cybersecurity program.