6 Best Practices to Make the Most of Your Sandbox Proof of Concept

Any time you incorporate a major new component—such as a sandbox platform—into your security ecosystem, it’s important to do a rigorous, side-by-side evaluation of competing products to determine the best choice for your situation. But a proof of concept is about more than detection rates and vendor scores. It’s also a chance to get a head start on the successful deployment and use of whichever sandbox you eventually choose. Your team can:

  • Leverage expert guidance from the participating vendors to maximize the strengths and work around the shortcomings of their products.
  • Utilize best practices for malware detection and analysis that your staff can carry forward into the production environment.
  • Learn to correctly interpret detection results—and the detailed information underlying those results—to enhance the speed, efficiency and effectiveness of malware detection efforts.

Below, we offer six best practices to help ensure the success of your sandbox proof of concept.

Prioritize Staff Productivity and Operational Efficiency

When comparing product features and performance, keep in mind the needs of core users: your junior SOC analysts and expert escalation team. For SOC teams, which are often overwhelmed with incoming alerts, you want a tool that will minimize alert fatigue. Look for a sandbox that combines a high detection rate with minimal or no false positives (FPs). Both are important, but the FP side of the equation is often downplayed. (We will cover FPs in more detail later in this post.)

In addition, your skilled forensics investigators, who handle incident response, need access to the underlying details of a suspected malware incident, but only the details that are truly relevant to solving the issue at hand. Too much or extraneous information increases alert fatigue and wastes costly analyst time.

Last, but not least, be sure your sandbox platform integrates easily and seamlessly with the existing tools and workflow in your security ecosystem.

In Testing, Use Malware Samples That Are New and Unknown

The essence of all six best practices is that proof of concept testing should resemble production use as closely as possible. Otherwise, you will get misleading results. That means using malware samples that are unknown and therefore able to evade your organization’s security protections, as they would in a real-world intrusion.

It’s important to note here that almost all sandboxes have built-in AV engine detection. When you submit old samples to multiple sandboxes, it’s very likely all of them will flag the malware. However, if the AV engine is the only sandbox component that detects the sample during your proof of concept, this tells you the sandbox would NOT have caught the malware when it was new and previously unknown.
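The AV-only check described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's report format: the `Verdict` record and its `av_engine_hit` and `behavior_hit` fields are hypothetical names for whatever your sandbox reports expose about which component flagged a sample.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    """Hypothetical per-sample summary extracted from a sandbox report."""
    sample_id: str
    av_engine_hit: bool   # the built-in AV engine flagged the sample
    behavior_hit: bool    # behavioral (dynamic) analysis flagged the sample


def av_only_detections(verdicts):
    """Return the IDs of samples that only the AV engine detected."""
    return [
        v.sample_id
        for v in verdicts
        if v.av_engine_hit and not v.behavior_hit
    ]
```

Samples returned by `av_only_detections` count as detections on paper, but they are the ones the sandbox would likely have missed when the malware was still unknown.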

Obtaining unknown malware can be a challenge. Some SOC teams develop their own, but more convenient sources exist. Searching “virustotal.com” on Twitter quickly yields many current samples with low AV detection rates. (You can check if a sample is new by investigating the “First Submission” date under “Details.”)

Another option: If you have access to the VirusTotal Private API or VirusTotal Intelligence, you can query for new samples that have low AV detection rates. Search for samples that are less than a day old and have only 1-5 AV matches.
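Once you have exported sample metadata from whichever source you use, the selection rule above (less than a day old, 1-5 AV matches) is easy to apply locally. The sketch below assumes each sample is a dictionary with hypothetical `sha256`, `first_seen` and `av_matches` keys; it does not call any VirusTotal API, and the field names are illustrative only.

```python
from datetime import datetime, timedelta, timezone


def is_fresh_low_detection(first_seen, av_matches, now,
                           max_age=timedelta(days=1),
                           min_matches=1, max_matches=5):
    """True if the sample is younger than max_age and has few AV matches."""
    age = now - first_seen
    return (timedelta(0) <= age < max_age
            and min_matches <= av_matches <= max_matches)


def select_candidates(samples, now=None):
    """Return the hashes of samples that fit the proof-of-concept criteria.

    `samples` is a list of dicts with hypothetical keys:
    'sha256' (str), 'first_seen' (timezone-aware datetime), 'av_matches' (int).
    """
    now = now or datetime.now(timezone.utc)
    return [
        s["sha256"]
        for s in samples
        if is_fresh_low_detection(s["first_seen"], s["av_matches"], now)
    ]
```

The thresholds are parameters so that you can loosen them if too few samples qualify on a given day.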

Verify Whether the Malware Sample’s Remote Infrastructure Is Still Running

Most malware campaigns use a multistage infection process. In a common scenario, an initial loader file penetrates the network and connects to the command-and-control server, downloading and executing the next malware stage. However, because the server may be shut down within a few hours or a couple of days, using older malware samples for your proof of concept isn’t viable. You won’t be able to detonate and analyze the next stage of the malware’s behavior, as it would occur on the end hosts you want to protect. Instead, the only behavior you’ll see during runtime is a failed HTTP download, followed by process termination.
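Before submitting a sample, a quick reachability check can tell you whether its next-stage server still answers at all. The following is a minimal sketch using only Python's standard `socket` module; the function name and the injectable `connect` parameter are our own choices, and the host and port would come from your own analysis of the sample. Run checks like this only from an isolated analysis network, never from a production host.

```python
import socket


def c2_is_reachable(host, port, timeout=5.0,
                    connect=socket.create_connection):
    """Return True if a TCP connection to (host, port) succeeds.

    `connect` defaults to socket.create_connection and can be swapped
    out for testing. DNS failures, refusals and timeouts all count as
    "not reachable" because they are subclasses of OSError.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        return False
    conn.close()
    return True
```

A successful connection only shows that something is listening on that port. It does not prove that the server still delivers the next malware stage, so treat a positive result as a necessary condition rather than a guarantee.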

Check for False Positives by Submitting Benign Applications for Analysis

Ideally, you want a sandbox to correctly identify every malicious file. You also want to be sure the sandbox does not incorrectly classify benign files as being malicious (FPs). The ability to do both things determines the product’s accuracy in distinguishing between malicious, benign and ambiguous activity. That’s why an effective proof of concept will look at the detection rate and FPs, as well as the relationship between them.

FPs contribute to service delays and backlogs. Yet sandbox proofs of concept tend to focus so heavily on detection that they pay little attention to the FP rate. Some vendors may tune their sandbox to maximize detection with no consideration for FPs. The total proof of concept score for such sandboxes can be favorably skewed if evaluators submit only known malicious files, which such a sandbox will always report as malicious. If a sandbox scores well on detection but has never been tested against benign files, investigate further before trusting that vendor’s result.
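To make the relationship between detection rate and FP rate concrete, here is a small scoring sketch. It assumes you record each test file as a pair of booleans: whether the file is actually malicious, and whether the sandbox flagged it. The function name and input shape are illustrative assumptions, not part of any sandbox product.

```python
def score_sandbox(results):
    """Compute (detection_rate, false_positive_rate) for one sandbox.

    `results` is an iterable of (is_malicious, flagged) boolean pairs.
    A rate is None when the test set has no files of the relevant kind.
    """
    tp = fn = fp = tn = 0
    for is_malicious, flagged in results:
        if is_malicious and flagged:
            tp += 1          # malware correctly flagged
        elif is_malicious:
            fn += 1          # malware missed
        elif flagged:
            fp += 1          # benign file wrongly flagged
        else:
            tn += 1          # benign file correctly passed
    detection_rate = tp / (tp + fn) if (tp + fn) else None
    fp_rate = fp / (fp + tn) if (fp + tn) else None
    return detection_rate, fp_rate
```

If the test set contains only malicious files, `fp_rate` comes back as `None`: the evaluation simply cannot say anything about false positives, which is exactly the blind spot described above.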

Test Sandboxes to See Whether They Create Significant Noise

Sandboxes provide detailed visibility into malware behavior, which is vital in forensics investigations, when time is of the essence. However, products significantly differ from each other in how much data they present and how concise and relevant the information is to the user’s needs.

Some sandboxes generate reports that include a high level of noise: extraneous and irrelevant details that analysts must wade through, thereby slowing down investigations and making high-value personnel less productive. Noise may be generated not only by malicious samples but by known benign samples, such as office productivity documents and related scripts.

One way to check for this flaw is to submit files that are known to exhibit little or no behavior. Examples include an empty Word or PDF document, or PowerShell commands that are known to be benign. If a sandbox nonetheless reports detected behavior, you can assume it was caused by a source other than the sample itself. Furthermore, the sandbox in question will likely produce irrelevant noise in other reports, which may be reason enough for you to avoid such products.
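The baseline test above can be tallied with a short helper. This sketch assumes you have already extracted, for each known-quiet baseline file, the list of behavior events the sandbox reported; the mapping shape and the `max_expected_events` threshold are hypothetical and should be adapted to your report format.

```python
def noisy_baselines(reports, max_expected_events=0):
    """Return {baseline name: event count} for baselines that exceed the threshold.

    `reports` maps a baseline sample name (for example an empty Word
    document) to the list of behavior events the sandbox reported for it.
    """
    return {
        name: len(events)
        for name, events in reports.items()
        if len(events) > max_expected_events
    }
```

Running the same baseline set through each competing sandbox and comparing the resulting counts gives a simple, repeatable measure of how much noise each product adds.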

Think of Your Vendors as a ‘Team of Rivals’

During a proof of concept, you should view your vendors as a resource that can help accelerate your learning curve for malware detection and analysis. Ask them questions, ask for help and challenge them to substantiate the claims they make about their own products and their competitors. A proof of concept is also a great opportunity to test the quality of customer support and service in areas such as reaction times and flexibility. This approach will help clarify not only which platform you want to adopt, but also which team you want to work with in the long run.

Ralf Hund

Ralf Hund is the CTO and co-founder of VMRay. Ralf earned his Ph.D. in computer science / IT-security at the Ruhr-University of Bochum in 2013. During his studies he focused on new analysis methods for software binaries, with a strong focus on malware. His findings were published and presented at many leading academic IT security conferences and received several awards for outstanding work. He also contributed to the popular academic malware analyzer Anubis during the course of his research. He has more than 10 years of experience in malware research and software development and is an active speaker at various academic and industrial conferences. His special interests lie in virtualization techniques and their application to software analysis.
