How New Headless Chrome & the CDP Signal Are Impacting Bot Detection

Author’s Note: The detection methods and ideas noted in this piece are based on research performed by Eloi Bahuet, a threat researcher at DataDome. Thank you for sharing your expertise!

Headless Chrome’s latest update has brought it dangerously close to achieving a perfect browser fingerprint. In its early days, Headless Chrome was distinctly separate from the “headful” Chrome, filled with its own quirks and bugs. But now that the new Headless Chrome and Chrome share the same codebase, distinguishing between the two has become a Herculean task, as they have almost exactly the same browser fingerprint.

Once an attacker uses page.setUserAgent() to change their user agent and the --disable-blink-features=AutomationControlled argument to get rid of navigator.webdriver, there are very few inconsistencies left in the fingerprint of Headless Chrome. It’s like trying to find a needle in a haystack, but the needle looks just like the hay!
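
To make concrete what those two steps remove, here is a minimal sketch of the two classic checks they defeat. The function name legacyHeadlessSignals and its navigator-like argument are illustrative assumptions, not part of any framework. The sketch relies only on two facts: the default Headless Chrome user agent contains the token HeadlessChrome, and navigator.webdriver is true by default in an automated browser.

```javascript
// Hypothetical helper: takes a navigator-like object so it can run
// in a browser (pass `navigator`) or in a test (pass a plain object).
function legacyHeadlessSignals(nav) {
    const userAgent = String(nav.userAgent || '');
    return {
        // Default Headless Chrome user agents contain "HeadlessChrome".
        headlessUserAgent: userAgent.includes('HeadlessChrome'),
        // Set by automated browsers unless AutomationControlled is disabled.
        webdriverFlag: nav.webdriver === true
    };
}

// An unmodified headless bot trips both checks...
const naive = legacyHeadlessSignals({
    userAgent: 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/120.0.0.0 Safari/537.36',
    webdriver: true
});

// ...while one that spoofed its user agent and disabled the flag trips neither.
const patched = legacyHeadlessSignals({
    userAgent: 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    webdriver: false
});
```

Both checks come back negative for the patched profile, which is why these two signals alone are no longer sufficient.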

This evolution significantly raises the bar for detection, prompting the need to understand and leverage new techniques such as detecting the Chrome DevTools Protocol (CDP) side effects to stay ahead of sophisticated bot frameworks.

What is CDP?

The Chrome DevTools Protocol (CDP) is a set of APIs and tools that enables developers to interact programmatically with Chromium-based browsers. It allows for debugging, profiling, and inspecting web applications by providing access to the browser’s internals. CDP is also the underlying protocol used by the main bot frameworks (such as Puppeteer, Playwright, and Selenium) to instrument Chromium-based browsers. Thus, being able to detect that a browser is instrumented with CDP is key to detecting most modern bot frameworks.

CDP detection targets the underlying technology used for automation rather than specific inconsistencies and side effects added by a particular bot framework. This provides us a more generic fingerprinting detection, even for unknown automation frameworks—including the ones that try to stay under the radar by providing anti-detect features.

In addition to being able to detect all kinds of bot frameworks, CDP detection is effective for both Chrome and Headless Chrome.

How can we detect browsers automated with CDP?

The detection technique we present below leverages the fact that the automated browser and the automation framework need to serialize data when they communicate over WebSocket.

We want to create a JavaScript (JS) function that enables us to observe a situation where some data is serialized only when a browser is automated using CDP. Thus, we need to:

  1. Provoke the serialization of an object, but only when CDP is being used.
  2. Detect that an object has been serialized.

Provoking Object Serialization

To provoke object serialization we leverage the Runtime.consoleAPICalled event, which relays the information that was logged using one of the window.console methods.

Note: Chromium-based browsers dispatch Runtime events only after they have received the Runtime.enable command from the client, which is the case with automation frameworks.
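
As a rough illustration of the exchange described in this note, a CDP client first sends a Runtime.enable command, and the browser then emits Runtime.consoleAPICalled events as JSON messages. The message shapes below follow the public protocol, but the concrete values (id, timestamp, execution context) and the helper name isConsoleEvent are made up for the example.

```javascript
// Command sent by the CDP client (the automation framework) to the browser.
const enableCommand = { id: 1, method: 'Runtime.enable' };

// Example of an event the browser could then push after console.log('hi').
// Values are illustrative; `args` holds serialized descriptions of the logged values.
const sampleEvent = {
    method: 'Runtime.consoleAPICalled',
    params: {
        type: 'log',
        args: [{ type: 'string', value: 'hi' }],
        executionContextId: 1,
        timestamp: 1700000000000
    }
};

// Hypothetical helper: recognizes console events in a raw WebSocket payload.
function isConsoleEvent(rawMessage) {
    const message = JSON.parse(rawMessage);
    return message.method === 'Runtime.consoleAPICalled';
}

// Both messages travel over the WebSocket as JSON strings.
const wireCommand = JSON.stringify(enableCommand);
const wireEvent = JSON.stringify(sampleEvent);
```

The args array is the part that matters for detection: to fill it, the browser has to serialize whatever was passed to the console method.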

To restrict the serialization only to situations where CDP is being used, we leverage the fact that Chrome buffers the console messages when the DevTools (CDP) are not open.

Detecting Object Serialization

To detect that an object has been serialized, we could have defined a JS getter on a random JS object and used a console method on it. However, in this situation, Chrome executes the getter instantly and caches its result—without waiting for a serialization.

There is one exception to this rule: the Error object’s stack property. This is a non-standard property, which means that browser engines are free to implement it how they see fit. V8, the engine used by Chrome, handles it as follows:

Unlike Java where the stack trace of an exception is a structured value that allows inspection of the stack state, the stack property in V8 just holds a flat string containing the formatted stack trace. This is for no other reason than compatibility with other browsers. However, this is not hardcoded but only the default behavior and can be overridden by user scripts.

For efficiency stack traces are not formatted when they are captured but on demand, the first time the stack property is accessed.

In Chrome, the stack property of an Error can be overridden, ensuring its value is not read until it’s needed.
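
The sketch below illustrates that last point outside of any console call: once stack is replaced by a getter, nothing runs until some code actually reads the property. The names probe and reads are just for illustration.

```javascript
// Count how many times the overridden `stack` property is read.
let reads = 0;
const probe = new Error('probe');

Object.defineProperty(probe, 'stack', {
    configurable: true,
    get() {
        reads += 1;
        return 'overridden stack';
    }
});

// Defining the getter does not invoke it.
const readsAfterDefine = reads;

// Only an explicit access (or a serializer walking the object) invokes it.
const value = probe.stack;
const readsAfterAccess = reads;
```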

Creating the Function

Now that both of our conditions are met, we can create a JavaScript function that detects when the stack property of an error is accessed:

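 // Flipped to true by the getter below when something reads e.stack.
 var detected = false;
 var e = new Error();
 // Replace `stack` with a getter so that reading it has an observable side effect.
 Object.defineProperty(e, 'stack', {
     get() {
         detected = true;
     }
 });
 // When a CDP client has sent Runtime.enable, Chrome serializes `e` and reads `stack`.
 console.log(e);

 // store value of `detected`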

The value of “detected” will be true on Chromium browsers that have a connected CDP client that has sent the Runtime.enable command, and false otherwise.
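
For reuse, the snippet can be wrapped in a function. This is only a sketch: the name probeRuntimeEnabled and the optional log parameter are assumptions added here, mainly so the logic can be exercised with a stand-in logger instead of a real browser console.

```javascript
// Hypothetical wrapper around the snippet above.
// `log` defaults to console.log; it is injectable only to ease testing.
function probeRuntimeEnabled(log = console.log.bind(console)) {
    let detected = false;
    const e = new Error();
    Object.defineProperty(e, 'stack', {
        get() {
            detected = true;
            return '';
        }
    });
    log(e);
    return detected;
}

// Stand-in for a console whose messages are serialized for a CDP client:
// serialization reads the error's `stack`.
const serializingLog = (value) => { void value.stack; };

// Stand-in for a console with no client attached: nothing reads `stack`.
const bufferingLog = () => {};

const withClient = probeRuntimeEnabled(serializingLog);
const withoutClient = probeRuntimeEnabled(bufferingLog);
```

In a page, the function would be called with no argument so that the real console is used. Outside a Chromium page the result is not meaningful, since other runtimes may read stack themselves when printing an error.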

While this technique detects automated Chromium-based bots, it also detects users with DevTools open, which may create false positives. In theory, detecting if the DevTools UI is open could have helped handle these edge cases. However, bot developers noticed that certain anti-bot software vendors added a rule to check whether the DevTools UI was open—so they began to automatically start their bots with args: ['--auto-open-devtools-for-tabs'].

[Image: a forum post showing a bot developer automatically opening the DevTools UI]

How has the updated Headless Chrome impacted anti-detect bot frameworks?

Following the release of the new Headless Chrome and the use of CDP detection in the wild, bot developers have started to find ways to bypass this detection.

The CDP detection test is well-documented and is now part of several test suites for anti-detect bot frameworks, as seen in this repository that provides a set of tests to ensure a ChromeDriver-based bot isn’t detected:

[Image: a code snippet showing a test to ensure a ChromeDriver-based bot is not detected]

These changes also led to the creation of new anti-detect bot frameworks, such as nodriver, announced as the successor to undetected ChromeDriver, and Selenium Driverless.

To avoid being detected, these bot development frameworks decided not to rely on ChromeDriver and Selenium. Instead, they implement all the usual automation functions using low-level CDP commands that do not leverage Runtime.enable.
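
As an illustration of what such a session might look like on the wire, the sketch below builds a short command sequence that navigates, evaluates a script, and clicks without ever sending Runtime.enable. The method names are real CDP commands; the sequence itself, the URL, and the helper names are invented for the example and are not taken from any specific framework.

```javascript
let nextId = 0;

// Hypothetical helper: builds one CDP command message.
function command(method, params = {}) {
    nextId += 1;
    return { id: nextId, method, params };
}

// Illustrative session: navigate, read the title, click at (100, 200).
const session = [
    command('Page.navigate', { url: 'https://example.com/' }),
    command('Runtime.evaluate', { expression: 'document.title' }),
    command('Input.dispatchMouseEvent', {
        type: 'mousePressed', x: 100, y: 200, button: 'left', clickCount: 1
    }),
    command('Input.dispatchMouseEvent', {
        type: 'mouseReleased', x: 100, y: 200, button: 'left', clickCount: 1
    })
];

// Hypothetical check: does the session ever enable the Runtime domain?
function enablesRuntimeDomain(commands) {
    return commands.some((c) => c.method === 'Runtime.enable');
}

const usesRuntimeEnable = enablesRuntimeDomain(session);
```

Because Runtime.enable is never sent, the browser does not dispatch Runtime events, so the console call in the detection snippet does not cause the error object to be serialized.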

How to Detect the New “Anti-Detect” Bot Frameworks

The fact that these new frameworks aim to erase the main CDP side effects and fingerprint inconsistencies doesn’t mean there are no inconsistencies left; it just means they’re more difficult to find. Thus, continued research into other fingerprinting signals is still key to detecting these automation frameworks in a single JS execution.

However, this also highlights the need for a multi-layered approach that avoids relying solely on a single category of signals (such as JS fingerprinting) for bot and fraud detection.

In addition to fingerprinting signals, your detection should leverage:

  • Behavioral Analysis Signals: Using both client-side interactions (mouse movements, touch events) and server-side browsing patterns (sequences of requests, browsing graph).
  • Reputational Signals: IP and session reputation, proxy detection.
  • Weak Signals: Such as the time of the day, request origin, and consistency between a user’s IP location, time zone, and languages.
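
One simple way to picture how these layers combine is a weighted score, sketched below. The signal names, weights, and threshold are entirely hypothetical and chosen only for illustration; they do not describe how any real detection engine works.

```javascript
// Hypothetical integer weights per signal category; values are arbitrary.
const WEIGHTS = {
    cdpDetected: 40,        // fingerprinting signal
    noPointerActivity: 20,  // behavioral signal
    badIpReputation: 30,    // reputational signal
    timezoneMismatch: 10    // weak signal: IP location vs. time zone
};

// Sums the weights of the signals that are present (truthy).
function botScore(signals) {
    return Object.keys(WEIGHTS).reduce(
        (score, name) => score + (signals[name] ? WEIGHTS[name] : 0),
        0
    );
}

// Hypothetical decision rule with an arbitrary threshold.
function isLikelyBot(signals, threshold = 50) {
    return botScore(signals) >= threshold;
}

// A bot that evades CDP detection can still be caught by the other layers.
const evasiveBot = {
    cdpDetected: false,
    noPointerActivity: true,
    badIpReputation: true,
    timezoneMismatch: true
};

// A developer with DevTools open trips the CDP signal only.
const developer = { cdpDetected: true };
```

With these made-up numbers, the evasive bot is flagged even though its fingerprint is clean, while the developer with DevTools open stays under the threshold, which is the point of not relying on one category of signal.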

Detect Malicious Traffic with DataDome

DataDome’s powerful, multi-layered bot and online fraud detection platform uses multiple client-side and server-side signals to identify even the most sophisticated and sneaky attackers. Our integrated CAPTCHA and invisible challenge, Device Check, add extra layers of security to keep your business safe. Our Ad Protect and Account Protect features improve the efficacy of our existing protections to fight the latest ad and account fraud threats.

Want to learn more about how DataDome can detect sophisticated fraudsters using the new Headless Chrome or anti-detect bot frameworks? Book a demo today or try DataDome for free.

*** This is a Security Bloggers Network syndicated blog from DataDome Blog – DataDome authored by Antoine Vastel. Read the original post at: https://datadome.co/threat-research/how-new-headless-chrome-the-cdp-signal-are-impacting-bot-detection/