
How DataDome Protected an American Luxury Fashion Website from Aggressive Scrapers
In this article, we cover the details of an aggressive scraping attack that targeted an American luxury fashion website. By the end of the attack, which lasted only one hour, more than 3.5 million malicious requests from scrapers had been stopped by DataDome’s protection.
Key Metrics
For one hour total—6:10 to 7:10 CEST on Apr 11—the product pages of a luxury fashion website were targeted in a scraping attack.
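The attack included:
- More than 3.5 million malicious scraping requests
- 125K different IP addresses
- Roughly 2.8K distinct user-agents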
Scraping Attack Overview
The graph below (Figure 1) shows the bot traffic detected by our detection engine during the one-hour attack. The attack was strongest at the start and slowly lost steam over the course of the hour as attempts were rebuffed. At the start of the attack, between 85K and 95K requests were made per minute; by the end, the number was closer to 50K.
Figure 1: Number of scraping attempts handled by the DataDome bot detection engine over time during the attack.
Distribution of the Attack
Over the course of the attack, the attacker used many different user-agents in an attempt to evade detection. Figure 2 shows the number of scraping attempts made per minute by the user-agents the attacker used.
Figure 2: Number of user-agents used to make scraping attempts over time during the attack.
Attack Indicators of Compromise (IoCs)
The attack was distributed across 125K different IP addresses, and the attacker varied many settings to evade detection:
- The attacker used multiple user-agents—roughly 2.8K distinct ones—based on different versions of Chrome, Firefox, and Safari.
- Bots used different values in headers such as accept-language and accept-encoding.
- The attacker made several requests per IP address, all on product pages.
However, the attacker did not include the DataDome cookie on any request, which indicates that the bots did not execute JavaScript.
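To put these indicators in perspective, the rounded figures reported in this article can be turned into a per-IP request rate. The short Python sketch below is only a back-of-the-envelope illustration: the variable names are ours, and because the article says "more than 3.5 million" requests, the results are approximate lower bounds.

```python
# Rounded figures taken from this article (approximate lower bounds).
TOTAL_REQUESTS = 3_500_000   # "more than 3.5 million malicious requests"
DISTINCT_IPS = 125_000       # "125K different IP addresses"
DURATION_MINUTES = 60        # the attack lasted one hour

# Average number of requests sent from each IP address over the hour.
requests_per_ip = TOTAL_REQUESTS / DISTINCT_IPS

# Average request rate of a single IP address, per minute.
requests_per_ip_per_minute = requests_per_ip / DURATION_MINUTES

print(f"Requests per IP over the hour: {requests_per_ip:.0f}")
print(f"Requests per IP per minute:    {requests_per_ip_per_minute:.2f}")
```

On these numbers, each IP address sent about 28 requests over the hour, or fewer than one request per minute on average. This suggests that a simple per-IP rate limit may not have been enough on its own to stop an attack distributed this widely.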
How was the attack blocked?
Thanks to our multi-layered detection approach, the attack was blocked using several independent categories of signals. Had the attacker changed part of their bot (for example, its fingerprint or behavior), it would likely still have been caught by other signals and approaches.
This attack was distributed and aggressive, but it was blocked thanks to the abnormal behavior exhibited by each IP address:
- Number of user-agents: The bot made requests with multiple user-agents per IP address—which is not likely behavior for a human user.
- Lack of DataDome cookie: The attacker made multiple requests without the DataDome cookie on the product pages. Human users would have had this cookie.
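The two per-IP signals above can be sketched as a simple heuristic. The Python example below is a minimal illustration under stated assumptions, not DataDome's actual detection engine: the `Request` type, the cookie name, and both thresholds are hypothetical values chosen only to show the idea.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical values for illustration only; not DataDome's real configuration.
COOKIE_NAME = "datadome"        # assumed name of the protection cookie
MAX_USER_AGENTS_PER_IP = 2      # assumed limit on distinct user-agents per IP
MAX_COOKIELESS_REQUESTS = 3     # assumed limit on requests without the cookie


@dataclass
class Request:
    """Simplified view of one HTTP request (hypothetical type)."""
    ip: str
    user_agent: str
    cookies: dict = field(default_factory=dict)


def flag_suspicious_ips(requests):
    """Return {ip: [reasons]} for IPs that trip either per-IP signal."""
    user_agents = defaultdict(set)   # ip -> distinct user-agents seen
    cookieless = defaultdict(int)    # ip -> requests sent without the cookie

    for request in requests:
        user_agents[request.ip].add(request.user_agent)
        if COOKIE_NAME not in request.cookies:
            cookieless[request.ip] += 1

    flagged = {}
    for ip, agents in user_agents.items():
        reasons = []
        if len(agents) > MAX_USER_AGENTS_PER_IP:
            reasons.append("many_user_agents")
        if cookieless[ip] > MAX_COOKIELESS_REQUESTS:
            reasons.append("missing_cookie")
        if reasons:
            flagged[ip] = reasons
    return flagged


if __name__ == "__main__":
    # One IP rotating user-agents with no cookie, one IP behaving normally.
    bot_traffic = [Request("203.0.113.7", f"UA-{i}") for i in range(5)]
    human_traffic = [
        Request("198.51.100.4", "UA-browser", {COOKIE_NAME: "token"})
        for _ in range(5)
    ]
    print(flag_suspicious_ips(bot_traffic + human_traffic))
```

Keeping the two checks independent mirrors the multi-layered idea described above: an IP address that fixes one signal (for example, by sticking to a single user-agent) can still be flagged by the other.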
Conclusion
Scraping attacks—especially ones like this, where millions of requests are coming at your website in a short amount of time—cause massive drains on your server resources, and come with the risk of content or data theft that can lead to negative impacts on your business. These attacks are becoming increasingly sophisticated as bot developers have more tools available to them, and basic techniques are no longer enough to stop them.
DataDome’s powerful multi-layered ML detection engine looks at as many signals as possible, from fingerprints to reputation, to detect even the most sophisticated bots. Our additional challenges, DataDome CAPTCHA and Device Check, add an extra layer of security while safeguarding the customer experience. Keeping up with bots’ evolving fingerprints, such as proxy usage, is key to fighting today’s main threats, and DataDome can handle it.
To get a better look at how DataDome can stop scraping attacks, book a demo today.
*** This is a Security Bloggers Network syndicated blog from DataDome authored by Antoine Vastel. Read the original post at: https://datadome.co/threat-research/how-datadome-protected-a-luxury-fashion-website-from-aggressive-scrapers/