JavaScript Deobfuscation: Hiding Intent to Fortify Bot Defenses

In the adversarial game of bot detection and mitigation, obfuscation plays a key role in delivering long-term efficacy. Client-side scripts containing highly sensitive detection methods are a necessary component of modern bot mitigation platforms, but these scripts are delivered and executed inside the attacker’s environment. That makes them easy targets for reverse engineering as a means of bypassing the bot defenses they belong to.

We’ve previously shared examples of how attackers and bot-building communities use open-source and off-the-shelf tools to quickly reverse engineer the obfuscation techniques used by most bot mitigation solutions.

In this blog, we’ll explore these obfuscation techniques further. The aim is to show what it is like to read obfuscated code and how obfuscation affects an attacker’s ability to extract sensitive information, so that you know what you’re up against and can check that your bot defenses are resilient to these methods.

The Prevalence of JavaScript

JavaScript is everywhere. It powers almost every front-end application, and it enables incredible frameworks, such as React, and single-page apps. With this level of prevalence, it is almost impossible to use the Internet without allowing JavaScript to run inside your browser. As more sensitive actions and functions are handed to client-side JavaScript, there is more pressure than ever to obscure what any given script is doing. Failing to do so compromises many aspects of application security, such as passwords that are encrypted on the client side prior to sending and, in our case, the sensors that are designed to detect the presence of malicious automation.

JavaScript Obfuscation to Hide True Intent

The goal of any code obfuscation is to hide the true intent and functionality of that piece of code. In the case of JavaScript obfuscation, someone may attempt to do the following to protect their code:

  • Modify the names of variables. e.g.
    • var passwordVariable = 'password'; might become var xyz789 = 'password';
  • Modify the location of different functions to make it difficult to read the code in order. e.g.
    • Step 1 code => Step 2 code => Step 3 code might become Step 2 code => Step 3 code => Step 1 code
  • Encode variable strings to make them unreadable to a person. e.g.
    • Using the native Base64 encoding / decoding functions, "hello"; would become atob("aGVsbG8=");
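
As a minimal sketch of the renaming and string-encoding techniques above, the snippet below shows a readable pair of variables next to their obfuscated equivalents. The names greeting and abc123 and the decodeBase64 helper are assumptions made for illustration; the helper only exists so the sketch also runs where atob is not a global.

```javascript
// Use the native atob where it exists; otherwise fall back to Buffer (illustrative helper).
const decodeBase64 =
  typeof atob === 'function'
    ? atob
    : (text) => Buffer.from(text, 'base64').toString('binary');

// Readable version: names and values reveal intent.
var passwordVariable = 'password';
var greeting = 'hello';

// Obfuscated version: meaningless names, and the string is hidden behind Base64.
var xyz789 = 'password';
var abc123 = decodeBase64('aGVsbG8=');

// Both versions hold exactly the same values at runtime.
console.log(passwordVariable === xyz789, greeting === abc123);
```

The behavior is unchanged; only the effort needed to read the code has increased.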

A common tool for JavaScript obfuscation is called obfuscator.io, which uses many modern techniques for obfuscating JavaScript (JS). This tool (or variants of it) is used by a large number of organizations looking to protect client-side JS, as well as fraudsters looking to hide their code’s true intent.

What does obfuscated JavaScript look like?

First things first: it’s important to tell the difference between regular JavaScript, minified (optimized / compressed) JavaScript, and obfuscated JavaScript. 

Below we have the same code, designed to add and subtract numbers, in three different states: Regular, Minified, and Obfuscated.

1. Regular

This is the unmodified code that has been written to be readable by others.
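
An illustrative stand-in for readable add-and-subtract code might look like the sketch below. The function names and sample values are assumptions, not the exact sample used for this post.

```javascript
// Adds two numbers and returns the sum.
function addNumbers(firstNumber, secondNumber) {
  return firstNumber + secondNumber;
}

// Subtracts the second number from the first and returns the difference.
function subtractNumbers(firstNumber, secondNumber) {
  return firstNumber - secondNumber;
}

const sum = addNumbers(5, 3);
const difference = subtractNumbers(5, 3);
console.log(sum, difference);
```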

2. Minified

Minified code is not designed to prevent someone from reading the code, but rather to reduce the size of the code being sent to a browser. As you can see, some of the variables have been renamed, and some spaces have been removed. In the end, you can still read the code and figure out what it is doing.
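
A minified form of a small add-and-subtract script could look roughly like this sketch. The names and values are illustrative assumptions; a real minifier may make different choices.

```javascript
function addNumbers(n,r){return n+r}function subtractNumbers(n,r){return n-r}const sum=addNumbers(5,3),difference=subtractNumbers(5,3);console.log(sum,difference);
```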

3. Obfuscated

Obfuscated code is intended to make it hard to read the code and understand what it is doing. In this example, we have renamed all the variables with random looking names. Now it is much harder to immediately understand what the code is doing.
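
A sketch of what renaming-based obfuscation of a small add-and-subtract script could produce is shown below. The identifiers are made up for illustration and are not the output of any particular tool.

```javascript
// Every meaningful name has been replaced with a random-looking identifier.
function _0x3a1f(_0x5b2c, _0x4d7e) {
  return _0x5b2c + _0x4d7e;
}

function _0x9c04(_0x1e6a, _0x2f8b) {
  return _0x1e6a - _0x2f8b;
}

const _0x7d31 = _0x3a1f(5, 3);
const _0x6b92 = _0x9c04(5, 3);
console.log(_0x7d31, _0x6b92);
```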

There are many other techniques for obfuscating JavaScript, but the basic premise is the same. However, these techniques are not all equal in how difficult they make reverse engineering.

Deobfuscating to understand the intent of JavaScript

To understand how this might manifest in the real world, we’ll use a mock application that we put together. Inspecting the HTML of its login page in the DOM, we can see some suspicious-looking obfuscated JavaScript being loaded and executed.

Switching to the Network tab, we can see a fetch request being sent to a suspicious-looking endpoint. Hovering over the initiator shows that line 8 of the main page initiated the request, which matches the location of our obfuscated JavaScript.

Looking at the HTTP POST body, it appears that some sort of JSON string has been encoded with Base64.

Raw Request Payload

To decode this payload, all we need to do is run atob on the string, which gives us the following JSON:

JSON Payload
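
The decoding step can be sketched as follows. The payload fields and values here are hypothetical stand-ins, not the actual request captured from the mock application, and the two helpers are assumptions that fall back to Buffer where atob and btoa are not globals.

```javascript
// Use the native functions where available; otherwise fall back to Buffer (illustrative helpers).
const decodeBase64 =
  typeof atob === 'function'
    ? atob
    : (text) => Buffer.from(text, 'base64').toString('binary');
const encodeBase64 =
  typeof btoa === 'function'
    ? btoa
    : (text) => Buffer.from(text, 'binary').toString('base64');

// Hypothetical fingerprint data, used only to produce a sample request body.
const examplePayload = {
  userAgent: 'ExampleBrowser/1.0',
  screenWidth: 1920,
  screenHeight: 1080,
  javaEnabled: false,
};
const rawRequestBody = encodeBase64(JSON.stringify(examplePayload));

// What the analyst does: Base64-decode the body, then parse the JSON inside it.
const decodedPayload = JSON.parse(decodeBase64(rawRequestBody));
console.log(rawRequestBody);
console.log(decodedPayload);
```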

Now that we can confirm that local browser information is being collected, it’s time to dig into the JavaScript itself.

Formatting

The first step to understanding the intent of this obfuscated JavaScript is to format it, or “beautify” it. This can be done in a number of ways, including an online beautifier tool or the formatter in a local code editor. The resulting formatted code from the process that we followed is:

Formatted script

Decoding the strings

Skimming through this code, we can see the function _0x1f33 referenced at multiple points, each time followed by encoded text. This indicates that _0x1f33 is used to decode strings before execution. First, we’ll rename this function to something more recognizable, such as stringDecoder. Then, if we attach stringDecoder to the global window object, we’ll be able to access the functionality and data stored inside the scope of the function.

Decoder

Looking at the code of the function, we can see the internally scoped variables ['oiRxum','iCBZot','yuLUVi']

Accessing the yuLUVi variable gives us all of the decoded strings currently inside that function.
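
One way such a decoder can expose its decoded strings is by caching them on the function object itself. The sketch below assumes that pattern: the string table, the Base64 encoding, and the caching logic are illustrative guesses, and only the names stringDecoder and yuLUVi come from the walkthrough above.

```javascript
// Illustrative helpers that fall back to Buffer where atob/btoa are not globals.
const decodeBase64 =
  typeof atob === 'function'
    ? atob
    : (text) => Buffer.from(text, 'base64').toString('binary');
const encodeBase64 =
  typeof btoa === 'function'
    ? btoa
    : (text) => Buffer.from(text, 'binary').toString('base64');

// Hypothetical encoded string table that the decoder reads from.
const encodedStrings = ['userAgent', 'width', 'height'].map(encodeBase64);

function stringDecoder(index) {
  // Decoded strings are cached on the function object under the yuLUVi property.
  if (stringDecoder.yuLUVi === undefined) {
    stringDecoder.yuLUVi = {};
  }
  if (!(index in stringDecoder.yuLUVi)) {
    stringDecoder.yuLUVi[index] = decodeBase64(encodedStrings[index]);
  }
  return stringDecoder.yuLUVi[index];
}

// Attach the decoder to the global object so it can be inspected from the console.
globalThis.stringDecoder = stringDecoder;

stringDecoder(0x0);
stringDecoder(0x1);
console.log(globalThis.stringDecoder.yuLUVi);
```

Only strings that have already been requested appear in the cache, which is why the walkthrough refers to the strings "currently" inside the function.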

Body

Looking at the parameters used when calling _0x1f33, we can see that it is usually called with a hex-encoded number (0x1, 0x2, 0x3, and so on). From here, it becomes easy to replace each _0x1f33 call with the string it would return.
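
The replacement step can be automated with a simple text substitution. In this sketch the string table, the sample source line, and the inlineDecodedStrings helper are hypothetical; only the _0x1f33 name and the hex-index calling convention come from the script being analyzed.

```javascript
// Hypothetical string table and decoder standing in for the real _0x1f33.
const decodedStrings = ['userAgent', 'width', 'height'];
function stringDecoder(index) {
  return decodedStrings[index];
}

// Obfuscated source held as text; property names are hidden behind decoder calls.
const obfuscatedSource =
  'var a = navigator[_0x1f33(0x0)]; var b = screen[_0x1f33(0x1)]; var c = screen[_0x1f33(0x2)];';

// Replace every _0x1f33(0x..) call with the quoted string it would return.
function inlineDecodedStrings(source) {
  return source.replace(/_0x1f33\((0x[0-9a-f]+)\)/gi, (match, hexIndex) =>
    JSON.stringify(stringDecoder(Number(hexIndex)))
  );
}

const readableSource = inlineDecodedStrings(obfuscatedSource);
console.log(readableSource);
// var a = navigator["userAgent"]; var b = screen["width"]; var c = screen["height"];
```

A regular expression is enough when every call takes a single literal argument; calls with computed arguments need the AST-based approach instead.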

Renaming the variables

From here, we need to rename the functions and variables whose intent we can understand. We can do this by starting with small, simple functions that are easy to understand and then working backwards through the code that calls them. In this case, it is relatively simple.

Renaming variables

Now that the script is readable, it is clear that this piece of code is collecting information about the screen size, the local user agent and whether or not Java is enabled within the browser.
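
After renaming, a script with this intent might read something like the sketch below. The function name, field names, and stand-in objects are assumptions made for illustration; they let the example run outside a browser and are not the mock application's actual code.

```javascript
// Collects screen size, user agent, and Java support from browser-like objects.
function collectBrowserInfo(browserNavigator, browserScreen) {
  return {
    screenWidth: browserScreen.width,
    screenHeight: browserScreen.height,
    userAgent: browserNavigator.userAgent,
    javaEnabled: browserNavigator.javaEnabled(),
  };
}

// Stand-ins for window.navigator and window.screen so the sketch runs anywhere.
const fakeNavigator = {
  userAgent: 'ExampleBrowser/1.0',
  javaEnabled: () => false,
};
const fakeScreen = { width: 1920, height: 1080 };

const browserInfo = collectBrowserInfo(fakeNavigator, fakeScreen);
console.log(browserInfo);
```

In a browser, the same function could be called with the real navigator and screen objects.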

More Complex Scripts

The example above is a relatively simple one; however, the principles remain the same when deobfuscating more sophisticated scripts that may appear on websites worldwide. Another useful tool for more complex scripts is http://jsnice.org/, which can perform statistical renaming, type inference, and basic deobfuscation on scripts submitted.

When attackers encounter more advanced cases of script obfuscation, they turn to tools that let them reverse engineer scripts programmatically. These techniques rely on Abstract Syntax Trees (ASTs), which are an abstract representation of the structure of a program. Open-source parsers that emit ASTs in the ESTree format, such as Esprima and Acorn, allow attackers to deobfuscate scripts by first converting them to an AST and then modifying the tree to reverse the obfuscation one layer at a time. Not only is this effective against open-source obfuscation tools, but it also allows attackers to adapt easily if the obfuscation changes.
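
A minimal sketch of one such AST transformation is shown below. To stay self-contained it uses a hand-built, ESTree-style tree for the expression navigator[_0x1f33(0x0)] rather than parser output, and the string table and helper names are hypothetical.

```javascript
// Hand-built ESTree-style AST for: navigator[_0x1f33(0x0)]
// (a real workflow would obtain this tree from a parser).
const ast = {
  type: 'MemberExpression',
  computed: true,
  object: { type: 'Identifier', name: 'navigator' },
  property: {
    type: 'CallExpression',
    callee: { type: 'Identifier', name: '_0x1f33' },
    arguments: [{ type: 'Literal', value: 0 }],
  },
};

// Hypothetical table of strings the decoder would return.
const decodedStrings = ['userAgent', 'width', 'height'];

// True when the node is a call to _0x1f33 with a single literal argument.
function isDecoderCall(node) {
  return (
    node.type === 'CallExpression' &&
    node.callee.type === 'Identifier' &&
    node.callee.name === '_0x1f33' &&
    node.arguments.length === 1 &&
    node.arguments[0].type === 'Literal'
  );
}

// Returns a copy of the tree with every decoder call replaced by a string literal.
function inlineDecoderCalls(node) {
  if (Array.isArray(node)) return node.map(inlineDecoderCalls);
  if (node === null || typeof node !== 'object') return node;
  if (isDecoderCall(node)) {
    return { type: 'Literal', value: decodedStrings[node.arguments[0].value] };
  }
  const copy = {};
  for (const [key, value] of Object.entries(node)) {
    copy[key] = inlineDecoderCalls(value);
  }
  return copy;
}

const simplifiedAst = inlineDecoderCalls(ast);
console.log(JSON.stringify(simplifiedAst.property));
// {"type":"Literal","value":"userAgent"}
```

Because the transformation matches on tree structure rather than on source text, it keeps working when whitespace, formatting, or surrounding code changes.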

These deobfuscation techniques are used against the defense platforms that bot mitigation vendors provide their customers. Unfortunately, many of these platforms employ obfuscation that only makes the scripts difficult to read at first glance (i.e. minified). Even the more advanced obfuscated JS scripts are little match for AST-based reverse engineering.

Uncovering Intent to Reverse Engineer Bot Defenses

The overarching theme here is that defenders use obfuscation to hide the true intent of their scripts. Meanwhile, attackers use deobfuscation to reverse the process and reveal that intent. For bot mitigation vendors, long-term efficacy is determined by having the upper hand, which requires building obfuscation systems that cost far more time and money to deobfuscate than attackers are willing to spend.

Kasada has developed an elegant solution to ensure that we have the upper hand. We designed an obfuscation system that cannot be reverse engineered using open-source or off-the-shelf JavaScript deobfuscation tools. How did we do this? By shifting the obfuscation method from JavaScript to bytecode executed by our own proprietary interpreter. This immediately changes the skillset and tools that an adversary needs to successfully reverse engineer a bot mitigation solution. To combat adversarial retooling, we designed our scripts to constantly morph, with a different obfuscation each time they load, so any prior attempts at uncovering the intent of our sensor code are nullified.
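
To give a feel for the general idea of running logic as bytecode rather than as readable JavaScript, here is a toy stack-machine interpreter. It is purely illustrative: the opcodes, the program, and the run function are invented for this sketch and say nothing about how Kasada's interpreter actually works.

```javascript
// Hypothetical opcodes for a tiny stack machine.
const OP = { PUSH: 0, ADD: 1, SUB: 2, HALT: 3 };

// Executes an array of numbers as bytecode and returns the value left on the stack.
function run(bytecode) {
  const stack = [];
  let pc = 0; // program counter
  while (pc < bytecode.length) {
    const op = bytecode[pc++];
    switch (op) {
      case OP.PUSH:
        stack.push(bytecode[pc++]); // the operand follows the opcode
        break;
      case OP.ADD: {
        const right = stack.pop();
        const left = stack.pop();
        stack.push(left + right);
        break;
      }
      case OP.SUB: {
        const right = stack.pop();
        const left = stack.pop();
        stack.push(left - right);
        break;
      }
      case OP.HALT:
        return stack.pop();
      default:
        throw new Error('Unknown opcode: ' + op);
    }
  }
  return stack.pop();
}

// Encodes (7 + 5) - 2. To a reader of the script, this is just a list of numbers.
const program = [0, 7, 0, 5, 1, 0, 2, 2, 3];
const result = run(program);
console.log(result); // 10
```

Someone reading the delivered script sees only the number array and a generic interpreter loop, so JavaScript-level tools such as beautifiers and AST rewriters reveal little about what the program computes.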

World-class obfuscation is absolutely key to long-term efficacy. More than 85% of Kasada customers were using other bot mitigation providers prior to contacting us, and many of them had seen their bot defenses lose effectiveness due to weak obfuscation methods. We’ve taken extensive measures to architect our solution to be just as effective months and years from now as it is on Day 1.

Request a free threat briefing and demo to get customized insights for your specific web and mobile applications – seeing is believing.

*** This is a Security Bloggers Network syndicated blog from Kasada authored by Sam Crowther. Read the original post at: https://www.kasada.io/javascript-deobsfusction-bot-defenses/