Let’s resume our discussion of endpoint attack prevention approaches with the options available once an attack actually begins to execute, or once it has already executed on a device.
During Execution (Runtime)
Once malicious code begins to execute, prevention of compromise requires recognizing bad behavior and blocking it before the attack can take control of the device. The first decision point is whether you want the protection to run in user mode (within the operating system and leveraging operating system protections) or kernel mode (at a lower level on the device, with access to everything, including interactions between the kernel and CPU). Many attacks exploit the operating system and applications which run within the OS, so it’s reasonable to protect in user mode.
But you cannot preclude adversaries from attacking the kernel directly, so, as is so often the case, the best answer is both. You need OS- and application-specific protections, but to comprehensively protect devices you need to monitor and protect the kernel as well. Otherwise you cannot defend against privileged processes and kernel-level rootkits.
Exploit prevention: This is a large bucket of many techniques, designed to prevent exploits from compromising devices. Many advanced endpoint products use most (or even all) of these techniques, and due to constant innovation by attackers they add new preventions on an ongoing basis. So understand this is a dynamic list.
Exploit pathway blocking: This approach is driven by threat research, profiling behaviors observed when malware compromises devices and watching for those patterns in real time. It turns out there are a couple dozen ways to gain control of a machine (of course the actual number is up for debate), and if you make sure none of those scenarios can be completed on a device you have a high level of protection. But be careful to monitor both false positives and resource consumption, because evaluating every function at the kernel level can have unintended consequences, starting with the predictable performance drain. This is a similar approach to HIPS (Host Intrusion Prevention), but detection is focused on device compromise at a much deeper device level.
Memory protection: To detect the memory attacks described in our previous post (file-less malware), the memory usage of the operating system and applications needs to be profiled, and memory must be monitored for abnormal activity which could indicate memory injection, encrypted memory, or hidden modules. Once again, this has driven an emphasis on endpoint threat research because profiling memory usage requires deep understanding of endpoint operating systems and how attackers manipulate devices.
Macro protection: To protect against rogue macros, advanced endpoint prevention requires the ability to block unauthorized and potentially malicious macros. Similar to exploit pathway blocking and memory protection, threat research profiles legitimate macro behavior and malicious macros to develop a model for what macros can and should do. Anything that doesn’t fit into this model is blocked. Once again, this technique highlights the importance of threat research to ensure profiles are accurate and current.
Script protection: The key to protecting against rogue scripts is to ensure that the logical chain of events makes sense. For instance a browser probably shouldn’t be launching a PowerShell script to execute command-line actions. If a device sees that behavior, block it. Likewise, a profile of legitimate scripting activity can be developed to detect and protect against malicious scripts.
Registry protection: To maintain persistence adversaries increasingly store malware within the device registry. To prevent these attacks the registry needs to be profiled and monitored to prevent unauthorized changes, and if necessary to roll back undesired changes.
Privilege escalation: At some point during an endpoint attack, the adversary will need to elevate privileges on the device to run the malware. The advanced endpoint agent can look for privilege escalation and new account creation as strong indicators of device compromise.
Pros: You cannot really stop advanced exploits without protecting devices against these techniques, so it’s not really a question of whether to include these features or not. It’s about understanding how a vendor develops the models they use to distinguish legitimate behavior from illegitimate.
Cons: These preventions require models of appropriate behavior, so false positives are always a concern, which comes down to opportunity cost. Whenever you need to spend time chasing down things that aren’t real issues, you aren’t doing something more useful. Ensuring that any agent provides granularity in terms of what gets blocked versus generating an alert is absolutely critical. Be aware of application impersonation, where a malicious application spoofs a legitimate one to access its privileges. Also consider differences between operating systems, in terms of ability to detect kernel activity or privilege escalation.
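To make the script protection idea and the block-versus-alert granularity a little more concrete, here is a minimal Python sketch. It assumes the agent can see which process launched which, and it uses a hypothetical policy table (the process names and the actions assigned to them are illustrative, not taken from any product). A real agent would hook process creation in the operating system and build its model from threat research rather than a hand-written table.

```python
from dataclasses import dataclass

# Hypothetical policy table: (parent process, child process) -> action.
# Names and actions are illustrative assumptions, not a vendor's rule set.
SUSPICIOUS_CHAINS = {
    ("chrome.exe", "powershell.exe"): "block",
    ("winword.exe", "powershell.exe"): "block",
    ("outlook.exe", "cmd.exe"): "alert",
}


@dataclass(frozen=True)
class ProcessEvent:
    parent: str  # image name of the launching process
    child: str   # image name of the process being launched


def evaluate(event: ProcessEvent, chains=SUSPICIOUS_CHAINS) -> str:
    """Return 'block', 'alert', or 'allow' for a process-launch event."""
    key = (event.parent.lower(), event.child.lower())
    # Chains not present in the table are allowed by default in this sketch.
    return chains.get(key, "allow")


if __name__ == "__main__":
    print(evaluate(ProcessEvent("Chrome.exe", "PowerShell.exe")))
    print(evaluate(ProcessEvent("explorer.exe", "notepad.exe")))
```

Keeping the action separate from the match is what gives you the granularity mentioned above: a noisy rule can be downgraded from block to alert without removing it.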
Isolation: Another common technique is isolation within the operating system to shield critical system resources (such as memory, storage, and networking) from direct access by executables running on the system. This abstraction layer between applications and system services enables monitoring of system calls and blocking of abnormal behavior.
Pros: Isolation is a time-honored approach to making sure a problem in one area of the environment doesn’t spread anywhere else. Abstracting operating system services and blocking malicious behavior before it can spread provides resilience to the device and prevents full compromise.
Cons: Isolation of operating system functions is very complicated and resource-intensive on the device. This approach requires high-powered devices and considerable testing before rollout, to ensure it doesn’t break applications and impair employee productivity.
Endpoint sandbox/emulation: A few years ago network-based malware sandboxes were all the rage. Files coming across ingress networks could be analyzed, and unrecognized files would be executed inside the sandbox to see what they did. If a file showed malware characteristics it would be blocked at the perimeter. These devices worked great… until malware writers figured out how to evade them, at which point effectiveness took a hit. There is still value in this approach, though, and some prevention products detonate any unknown files in a sandbox on the endpoint to look for malicious characteristics. We’ll discuss this in more detail below, including integration with network and cloud based sandboxes.
Performance: Network sandboxes were plagued by latency and delay waiting for a verdict, so organizations were forced to let files pass through during analysis to avoid unacceptable performance; they later had to chase down files which turned out to be malicious. There is nowhere else to go on an endpoint, so performance of the emulation environment is critical to avoid impairing the end-user experience. The analysis should be imperceptible to users.
Evasion: Similar to network-based sandboxes, endpoint sandboxes are part of a cat-and-mouse game, as malware writers strive to evade emulation and other prevention techniques. Look for both bare metal and virtual emulation capabilities, intended to fool malware writers looking for indicators of execution in a sandbox. Ensuring that the emulator cannot be detected by malware is critical.
Integration with network/cloud sandboxes: For performance purposes, it may not make sense to detonate a file locally, so the ability to send it to a network or cloud sandbox provides additional flexibility in dynamic analysis of potential threats.
Pros: If malicious code can be detected by execution in a walled garden, then the device can be protected without recourse to the other exploit protection techniques above.
Cons: Performance, evasion, and resource consumption are all concerns for emulating on endpoint devices.
Network traffic monitoring: During an attack malicious code needs to connect to a command and control network to receive instructions and download additional malware. Monitoring network traffic to and from endpoints can identify attacks in progress, whether from C&C traffic, lateral movement, or known indications of internal reconnaissance for other vulnerable devices on the network. Recognizably bad traffic can be detected and blocked before additional damage is done. And a list of known malicious networks enables endpoint protection to block traffic to those networks; there isn’t much legitimate call for devices to communicate with botnets.
Pros: If all else fails, malware still needs to communicate with a network at some point, and you can detect and block that traffic. In this case the device is already compromised, but if it can’t connect to its botnet it can’t download additional malware and become operational.
Cons: Blocking legitimate network traffic is highly problematic. So the accuracy of the malicious network list is critical.
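The malicious network list check can be sketched in a few lines of Python using the standard `ipaddress` module. The list below is a hypothetical stand-in for a real threat intelligence feed: the two ranges are reserved documentation networks, chosen so the example cannot match real traffic. The function name `is_blocked` is also just an assumption for illustration.

```python
import ipaddress

# Hypothetical blocklist. These are reserved documentation ranges,
# standing in for networks a threat intelligence feed would supply.
MALICIOUS_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(destination: str, networks=MALICIOUS_NETWORKS) -> bool:
    """Return True if the destination address falls inside any listed network."""
    addr = ipaddress.ip_address(destination)
    return any(addr in net for net in networks)


if __name__ == "__main__":
    for dest in ("203.0.113.7", "192.0.2.10"):
        verdict = "block" if is_blocked(dest) else "allow"
        print(dest, verdict)
```

Everything interesting here lives in the list, not the lookup, which is why the accuracy of that list matters so much: a stale or overly broad entry blocks legitimate traffic just as efficiently as it blocks a botnet.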
Once the malware has executed and the device is compromised, it’s a matter of containing damage and cleaning up the mess. This isn’t prevention, but post-execution activities are an important step between the prevention and response processes, so expect some of these capabilities in any advanced endpoint protection offering.
Given the sophistication of advanced malware, the most reliable means of cleaning a device was to reimage it and start over again. This was expensive and made many employees grumpy because they lost work and productivity, but necessary because it was difficult (or outright impossible) to be confident you had really cleaned everything off the machine. Now, given the visibility and granularity provided by advanced endpoint security agents, it has once more become realistic to actually clean devices.
Containment: Once a device has been compromised, the first step is containment. Make sure the device can’t reach anything valuable, and that you trigger your response process. There are many situations where a device is compromised without the pre-execution and during-execution controls working. In such a case you get an alert, and you suspect a device has already been compromised.
Quarantine: You’ll need to move the device off the corporate network to a place where it can be monitored to track subsequent activity and/or isolated so it can’t hurt anything else. This is about containing the attack and the specific device, and typically requires some type of integration with network infrastructure to move the device to a protected and monitored network.
Validate attack: You’ll want to figure out what happened on the device. This typically involves some kind of visualization of activity on the endpoint. Some endpoint protection agents include a ‘light’ version of a full Endpoint Detection/Response (EDR) capability to enable at least a cursory device investigation.
Retrospective search: Once it’s clear what the attack was, and what malicious code it used to compromise the device (from the validation step), you can search the other endpoints in your environment for the same activity. Identifying additional compromised devices in your environment enables you to contain them on the network and protect the environment from a broader attack (such as SQL*Slammer, for you security historians). This tends to be a feature of a broader EDR offering (which we will discuss later in this Buyer’s Guide), but again a ‘light’ version of this search capability may be bundled with a prevention offering.
Pros: The deeper into containment an advanced endpoint prevention offering goes, the more quickly you can contain damage and restore your environment to a clean and healthy state. This is where integration between prevention and detection/response starts to come in handy, providing all the telemetry and analysis responders want.
Cons: There is a measure of necessary complexity to EDR – even relatively simple EDR. Tools which integrate this capability run the risk of overcomplicating usage for less sophisticated practitioners. And in larger organizations a different group tends to handle incident response, so make sure any information uncovered during validation can be sent to and leveraged by whatever EDR tool incident responders use.
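As a rough illustration of retrospective search, the Python sketch below looks for one file hash across telemetry collected from several endpoints. It assumes telemetry is already available as a mapping of hostname to the set of file hashes seen executing there; the hostnames, the sample data, and the function names are all hypothetical. Real products search much richer data (process trees, network connections, registry changes), so treat this as the simplest possible case.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest, used here as the file indicator."""
    return hashlib.sha256(data).hexdigest()


def find_affected_hosts(telemetry: dict, indicator: str) -> list:
    """Return sorted hostnames whose recorded file hashes include the indicator."""
    return sorted(host for host, hashes in telemetry.items() if indicator in hashes)


# Hypothetical telemetry: hostname -> set of SHA-256 hashes seen executing.
# The byte strings are placeholders, not real malware or real binaries.
BAD = sha256_hex(b"stand-in for the malicious payload")
GOOD = sha256_hex(b"stand-in for a benign binary")
TELEMETRY = {
    "laptop-01": {GOOD},
    "laptop-02": {GOOD, BAD},
    "server-07": {BAD},
}

if __name__ == "__main__":
    # Hosts returned here are the candidates for quarantine.
    print(find_affected_hosts(TELEMETRY, BAD))
```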
Remediation: If a device is compromised, it needs to be cleaned completely to ensure reinfection doesn’t occur. There isn’t much more to say about that.
Automation: The question here is whether to have the endpoint prevention system clean the device automatically, or to alert the administrator and trigger an incident response process. This depends on the nature of the attack and the sensitivity of data on the device. That said, given how quickly an attack can spread within an organization, it is better to move quickly into remediation than to wait for a perfect solution, if you can make that work. In any event it is important that you can configure scenarios for automated remediation in case you want to go down that path, now or later.
Granularity: Some malware burrows into a device very deeply (think rootkit), so make sure the remediation function can clean up whatever changes the malware made to the device. This includes removing any injected code and registry settings which could provide persistence for an attacker. Whether cleaning or reimaging is automatic or manual, cleaning must be complete.
Pros: The ability for an endpoint prevention product to clean up a compromised device can substantially improve response time and return the device to normal operation sooner, which is critical given the business impact of compromised endpoints.
Cons: Remediating prior to full investigation can wipe out key evidence and telemetry which could provide clues as to the adversary, mission, and tactics. More mature security teams tend to use different tools for response, forensics, and remediation, to maintain chain of custody and ensure full eradication of the adversary. Any tool you use for remediation should work with your response and forensics tools and processes.
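The registry piece of remediation (profile, detect unauthorized changes, roll back) can be sketched as a snapshot comparison. The Python below works on plain dictionaries rather than the real Windows registry, so it runs anywhere; the keys are (path, value name) pairs, and the sample entries, including the `updater` value, are hypothetical. The Run key is used because it is a well-known persistence location.

```python
def diff_registry(baseline: dict, current: dict) -> dict:
    """Compare two snapshots and report added, removed, and changed values."""
    added = {k: current[k] for k in current.keys() - baseline.keys()}
    removed = {k: baseline[k] for k in baseline.keys() - current.keys()}
    changed = {
        k: (baseline[k], current[k])
        for k in baseline.keys() & current.keys()
        if baseline[k] != current[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


def roll_back(current: dict, delta: dict) -> dict:
    """Return a copy of the current snapshot with every difference undone."""
    restored = dict(current)
    for key in delta["added"]:
        del restored[key]
    for key, value in delta["removed"].items():
        restored[key] = value
    for key, (old_value, _new_value) in delta["changed"].items():
        restored[key] = old_value
    return restored


# Hypothetical snapshots keyed by (registry path, value name).
RUN_KEY = r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run"
BASELINE = {(RUN_KEY, "backup-agent"): r"C:\Program Files\Backup\agent.exe"}
CURRENT = {
    (RUN_KEY, "backup-agent"): r"C:\Program Files\Backup\agent.exe",
    (RUN_KEY, "updater"): r"C:\Users\Public\updater.exe",  # illustrative rogue entry
}

if __name__ == "__main__":
    delta = diff_registry(BASELINE, CURRENT)
    print(delta["added"])
    print(roll_back(CURRENT, delta) == BASELINE)
```

Note that this sketch undoes every difference, legitimate or not. Consistent with the caution above about evidence, keeping the computed delta before rolling back preserves a record of what the attacker changed.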
Now that we are through the prevention approaches, we are ready to examine the foundational technologies which enable all this shiny new prevention stuff.
This is a Security Bloggers Network syndicated blog post authored by firstname.lastname@example.org (Securosis). Read the original post at: Securosis Blog