With the escalating race between bot developers and security experts — along with the increasing use of Javascript and HTML5 web technologies — bots have evolved significantly from their origins as simple scripting tools that used command line interfaces.

Bots now leverage full-fledged browsers and are programmed to mimic human behavior in the way they traverse a website or application, move the mouse, tap and swipe on mobile devices and generally try to simulate real visitors to evade security systems.

First Generation

First-generation bots were built with basic scripting tools and make cURL-like requests to websites using a small number of IP addresses (often just one or two). They do not have the ability to store cookies or execute JavaScript, so they do not possess the capabilities of a real web browser.

[You may also like: 5 Simple Bot Management Techniques]

Impact: These bots are generally used to carry out scraping, carding and form spam.

Mitigation: These simple bots generally originate from data centers and use proxy IP addresses and inconsistent UAs. They often make thousands of hits from just one or two IP addresses. They also operate through scraping tools, such as ScreamingFrog and DeepCrawl. They are the easiest to detect since they cannot maintain cookies, which most websites use. In addition, they fail JavaScript challenges because they cannot execute them. First-generation bots can be blocked by blacklisting their IP addresses and UAs, as well as combinations of IPs and UAs.

Second Generation

These bots operate through website development and testing tools known as “headless” browsers (examples: PhantomJS and SimpleBrowser), as well as later versions of Chrome and Firefox, which allow for operation in headless mode. Unlike first-generation bots, they can maintain cookies and execute JavaScript. Botmasters began using headless browsers in response to the growing use of JavaScript challenges in websites and applications.

[You may also like: Good Bots Vs. Bad Bots: What’s The Impact On Your Business?]

Impact: These bots are used for application DDoS attacks, scraping, form spam, skewed analytics and ad fraud.

Mitigation: These bots can be identified through their browser and device characteristics, including the presence of specific JavaScript variables, iframe tampering, sessions and cookies. Once the bot is identified, it can be blocked based on its fingerprints. Another method of detecting these bots is to analyze metrics and typical user journeys and then look for large discrepancies in the traffic across different sections of a website. Those discrepancies can provide telltale signs of bots intending to carry out different types of attacks, such as account takeover and scraping.

Third Generation

These bots use full-fledged browsers — dedicated or hijacked by malware — for their operation. They can simulate basic human-like interactions, such as simple mouse movements and keystrokes. However, they may fail to demonstrate human-like randomness in their behavior.

[You may also like: 5 Things to Consider When Choosing a Bot Management Solution]

Impact: Third-generation bots are used for account takeover, application DDoS, API abuse, carding and ad fraud, among other purposes.

Mitigation: Third-generation bots are difficult to detect based on device and browser characteristics. Interaction-based user behavioral analysis is required to detect such bots, which generally follow a programmatic sequence of URL traversals.

Fourth Generation

The latest generation of bots have advanced human-like interaction characteristics — including moving the mouse pointer in a random, human-like pattern instead of in straight lines. These bots also can change their UAs while rotating through thousands of IP addresses. There is growing evidence that points to bot developers carrying out “behavior hijacking” — recording the way in which real users touch and swipe on hijacked mobile apps to more closely mimic human behavior on a website or app. Behavior hijacking makes them much harder to detect, as their activities cannot easily be differentiated from those of real users. What’s more, their wide distribution is attributable to the large number of users whose browsers and devices have been hijacked.

[You may also like: CISOs, Know Your Enemy: An Industry-Wise Look At Major Bot Threats]

Impact: Fourth-generation bots are used for account takeover, application DDoS, API abuse, carding and ad fraud.

Mitigation: These bots are massively distributed across tens of thousands of IP addresses, often carrying out “low and slow” attacks to slip past security measures. Detecting these bots based on shallow interaction characteristics, such as mouse movement patterns, will result in a high number of false positives. Prevailing techniques are therefore inadequate for mitigating such bots. Machine learning-based technologies, such as intent-based deep behavioral analysis (IDBA) — which are semi-supervised machine learning models to identify the intent of bots with the highest precision — are required to accurately detect fourth-generation bots with zero false positives.

Such analysis spans the visitor’s journey through the entire web property — with a focus on interaction patterns, such as mouse movements, scrolling and taps, along with the sequence of URLs traversed, the referrers used and the time spent at each page. This analysis should also capture additional parameters related to the browser stack, IP reputation, fingerprints and other characteristics.

Read “The Ultimate Guide to Bot Management” to learn more.

Download Now