In the world of web applications, bots are nothing new — in fact, bots are almost as old as the Internet itself. Without bots, search engines like Google wouldn’t exist, especially when you consider that today’s indexed web contains over 6 billion pages.
Bots are programs that run automated tasks (scripts) on websites and APIs, such as scraping content for indexing purposes, collecting inventory and price information to be displayed on partner sites, or testing website security at regular intervals. While there are many “good” bots that are used on the web for legitimate purposes, there are also bots that exist to carry out activities with malicious intent.
The emergence of bad bots
Over time, the Internet has shifted from a place to get academic papers or pictures of cats into an integral part of our daily lives. Today, we do everything online, from banking, shopping, booking travel, and playing games to communication and news. While the modern web brings us immense benefits, it also provides an attractive avenue for criminals.
Take, for example, buying tickets to a concert. Before sites like TicketMaster existed, you had to physically go to the box office and wait in line, or spend hours on hold, until it was your turn to choose your seats. Sometimes, ticket scalpers would buy a large volume of tickets to sell for a profit.
Even though most ticket sales now happen online, scalpers are still at work — but instead of buying tickets manually, they leverage bot software written specifically to purchase tickets in bulk and complete transactions at a rapid pace.
Bots used to bulk-purchase tickets for scalping are one example of a bad bot. Other examples include the following:
- Credential stuffing bots: These bots test large lists of stolen credentials, often gained from website breaches, against other sites, such as banking, commerce, or gaming. Attackers leverage the fact that most users use the same credentials across multiple sites, and once they gain access to an account they will steal sensitive data or use it to conduct fraud.
- Inventory holding bots: These bots perform the steps of purchasing goods without actually completing the transaction. For example, a bot may add a popular item to a shopping cart, causing it to be removed from inventory while the transaction is in progress. Normally, when a transaction is not completed, the item is returned to inventory. However, inventory holding bots repeat this process over and over, causing significant problems for retailers, especially during busy times like Black Friday sales.
- Price scraping bots: These bots crawl a site capturing the prices of goods being sold. This data can then be used by competitors, allowing them to always offer lower prices or create price-comparison websites. In both of these cases, the purpose is the same — to undermine the competitiveness and value of the website being scraped.
Sophisticated bots can outsmart traditional bot protection solutions
As the use of bots, for both legitimate and malicious purposes, has increased, organizations have implemented technologies to try to control them. Popular tools such as CAPTCHA are commonly used within web applications, and origin-based products are frequently installed on servers to detect and deter bot activity.
Origin-based bot mitigation typically uses traffic analysis to generate blacklists of known bot sources. This was effective in the past, since bot operators relied on rented servers within data centers to get enough bandwidth and processing power to run their programs.
In addition to aggregating known bot sources, other detection techniques are often applied to identify non-human traffic, such as looking for irregular patterns like navigating a large number of pages quickly or continuously hitting a single URL.
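The pattern-based heuristics above can be sketched as a sliding-window rate check. This is a minimal illustration, not any vendor's actual implementation; the thresholds and the `RateBasedDetector` name are assumptions chosen for the example.

```python
from collections import defaultdict, deque

# Illustrative thresholds -- a real system tunes these per site and per endpoint.
WINDOW_SECONDS = 60
MAX_REQUESTS_PER_WINDOW = 120   # flags "navigating a large number of pages quickly"
MAX_SAME_URL_PER_WINDOW = 30    # flags "continuously hitting a single URL"

class RateBasedDetector:
    """Sliding-window counter over recent requests, keyed by client IP."""

    def __init__(self):
        self.history = defaultdict(deque)  # ip -> deque of (timestamp, path)

    def is_suspicious(self, ip, path, now):
        window = self.history[ip]
        window.append((now, path))
        # Drop requests that have aged out of the window.
        while window and now - window[0][0] > WINDOW_SECONDS:
            window.popleft()
        total = len(window)
        same_url = sum(1 for _, p in window if p == path)
        return total > MAX_REQUESTS_PER_WINDOW or same_url > MAX_SAME_URL_PER_WINDOW
```

In an origin-based product, IPs flagged this way would feed the blacklist described above; the weakness, as the next section explains, is that sophisticated bots simply stay under such thresholds or rotate source addresses.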
As bot detection and mitigation have improved, bot operators have made their programs more sophisticated, adopting techniques to disguise their activity and avoid being blocked. Sophisticated bots, for example, are able to closely simulate human behavior, navigating naturally and even entering text into search fields with spelling mistakes.
The use of botnets has increased, too. Botnets are collections of compromised end-user computers or IoT devices onto which an attacker has installed command-and-control malware. There are millions of compromised devices on the web, most of them sitting behind consumer ISPs. By utilizing machines within a botnet, bot operators can make their traffic come from many different sources and avoid having automated traffic originate from a blacklisted IP address.
Fingerprinting the browser
While origin-based analysis remains an essential part of bot protection, it is becoming less effective as bots become more sophisticated. For many bots, IP address blacklists no longer help as operators have tuned their bots to circumvent traffic analysis techniques.
Newer and more sophisticated bots are able to avoid origin detection methods by manipulating cookies and headers, or by forging their environmental characteristics, such as the user agent.
Bot fingerprinting collects information from the client, or browser, by directly interrogating the behavior and characteristics of an endpoint. Client-side technology can detect if a bot attempts to tamper with cookies, headers, or the user agent. When this information is combined with origin traffic analysis, it enables businesses to determine with high accuracy whether a visitor is human or automated.
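One way such interrogation is commonly applied is a consistency check: does what the client claims (its User-Agent string) agree with what client-side inspection actually observes? Below is a minimal server-side sketch, assuming the fingerprint has already been collected (e.g. by injected JavaScript) and posted back as a dictionary; the field names and rules here are illustrative assumptions, not any product's real schema.

```python
def classify_fingerprint(fp):
    """Return a list of inconsistencies between claimed and observed traits.

    `fp` is a hypothetical fingerprint payload with fields such as
    user_agent, navigator_webdriver, has_window_chrome, plugin_count,
    and platform -- names invented for this example.
    """
    findings = []
    ua = fp.get("user_agent", "")

    # Automation frameworks often leak a webdriver flag in the browser.
    if fp.get("navigator_webdriver"):
        findings.append("navigator.webdriver is set")

    # A UA claiming Chrome should come with Chrome-specific objects present.
    if "Chrome" in ua and not fp.get("has_window_chrome"):
        findings.append("claims Chrome but window.chrome is missing")

    # Forged user agents often disagree with the JS-visible platform.
    if "Windows" in ua and fp.get("platform", "").startswith("Linux"):
        findings.append("UA says Windows but navigator.platform says Linux")

    return findings
```

An empty result does not prove the visitor is human; it just means this particular set of checks found nothing, which is why such signals are combined with origin traffic analysis rather than used alone.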
Endpoint data is the next generation of bot mitigation
As the cat-and-mouse game continues between bot maker and bot mitigator, the only real way to protect against modern and sophisticated bots is to have a presence on the endpoint.
*** This is a Security Bloggers Network syndicated blog from Instart blog RSS authored by Jon Wallace. Read the original post at: https://instartstage.wpengine.com/blog/leveraging-endpoint-data-for-bot-mitigation