Why Management of Good Bots is Crucial for Organizations

In a previous article, I outlined the results of a study conducted by our company that showed the prevalence of bad bots across selected industries in our customer base, along with an examination of their attack techniques and origins. Malicious bots have become extremely sophisticated in their ability to masquerade as human visitors and evade conventional web security measures. These bad bots will continue to pose ever-growing threats to websites, applications and APIs, and will always remain in focus for webmasters looking to mitigate their impact on their organizations.

However, in certain circumstances, such as when good bot traffic approaches or even exceeds available server and bandwidth capacity, good bots can and do cause harm, just like bad bots. This is why a holistic approach that manages both good and bad bots is the most effective one: it allows webmasters and security chiefs to take specific actions on each type based on organizational needs and other factors.

Types of Good Bots

Good bots encompass a wide range of functions and capabilities. Let’s take a brief look at some of the major classes of good bots and what they do.

Search Engine Crawlers

Bots such as Googlebot, Bingbot and Baidu Spider crawl web pages to index them for search engines such as Google and Bing. Website administrators can specify non-binding rules in their ‘robots.txt’ file for crawlers to follow while indexing web pages, such as their crawl rates and pages or sections that they should not index.
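As an illustration, a site might publish rules like the following in its robots.txt (the paths here are hypothetical). Note that these directives are advisory: well-behaved crawlers follow them, but nothing enforces compliance, and support varies by crawler.

```
# Hypothetical robots.txt illustrating common crawler directives
User-agent: Googlebot
Disallow: /admin/

User-agent: Bingbot
# Crawl-delay is non-standard: honored by Bing, ignored by Googlebot
Crawl-delay: 10

# Rules for all other crawlers
User-agent: *
Disallow: /private/
```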

Partner Bots

Partner bots provide essential services and information to websites and their customers. This category includes bots run by vendors that provide transactional/CRM/ERP services, geo-location, inventory and background checks, and other business-related services. Alexa, Slackbot and IBM Watson are examples of this type of bot.

Social Network Bots

Social networks operate bots to provide visibility for their clients’ websites and drive engagement on their platforms; some can even carry out chat conversations with users to provide information and services. Examples of this type include Facebook Bot, Pinterest Bot and Snapchat.

Monitoring Bots

These bots, such as Pingdom, AlertBot and StatusCake, are used to monitor the uptime and system health of websites by periodically checking and reporting on page load times, downtime duration and so on.

Backlink Checker Bots

Backlink checkers analyze the inbound URLs on a website to provide marketers and SEO specialists with insights to help them optimize their content and campaigns. These include bots such as SEMRushBot, UAS Link Checker and AhrefsBot.

Aggregator/Feed Fetcher Bots

These bots, such as Google Feedfetcher, Superfeedr and Feedly, collate information from websites and provide users with customized news, alerts, and other desired content.

Breakdown of Overall Traffic

First, let’s look at the data we gathered on the composition of overall traffic observed in the second half (H2) of 2018 across four industries: online travel agencies, e-commerce, classifieds, and media and publishing. Humans comprised nearly 74 percent of traffic, while good bots accounted for nearly 17 percent and bad bots nearly 10 percent.

Crawler Traffic Distribution

While good bot statistics can significantly vary across time periods and industries, our observation found that search engine crawlers comprise about 55 percent of all good bot traffic, with other good bots making up the rest.

Googlebot made up nearly 68 percent of search engine crawler traffic across our client base; Bingbot came in at nearly 26 percent, and bots from Yandex, Yahoo and Baidu made up the rest.

Good Bot Distribution By Industry

An industry-wise breakdown of good bot traffic shows that classifieds and marketplace businesses get the highest level of aggregator bot traffic, followed by online travel agencies (OTAs). Partner bots are mostly seen on e-commerce sites and OTAs. E-commerce businesses also attract the greatest numbers of social network bots, followed by media and publishing sites, while OTAs come in last, with a tiny 0.1 percent of their good bot traffic consisting of social network bots.

How to Manage Good Bots

Geographic Restrictions

It is often advisable to block good bots that are irrelevant or unnecessary for your business. Geographic restrictions allow webmasters to block every class of bot from certain countries. Let’s say, for example, that your business does not operate in Russia. Blocking Russian bots, including search engine crawlers such as Yandex, makes sense if your SEO and market strategies do not include the Russian market.
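One way to implement such a country-level block is at the web server. The sketch below assumes an nginx server built with the GeoIP module (`ngx_http_geoip_module`) and a MaxMind country database installed at the path shown; both the path and the choice of nginx are assumptions for illustration.

```
http {
    # Assumes ngx_http_geoip_module and a MaxMind country database
    geoip_country /usr/share/GeoIP/GeoIP.dat;

    server {
        listen 80;

        # Reject requests whose source IP geolocates to Russia (RU)
        if ($geoip_country_code = RU) {
            return 403;
        }
    }
}
```

A CDN or bot management layer can apply the same policy without touching the origin server, which is often simpler to operate.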

Detecting Spoofed Search Engine Crawlers

Malicious entities often deploy bots that masquerade as crawlers such as Googlebot to evade basic web security systems and carry out content scraping and competitive intelligence gathering. Measures such as performing a reverse DNS lookup, or comparing the behavior of suspected spoofed crawlers with that of real crawlers, can help counter such attack strategies. A dedicated bot management solution will generally maintain a list of bad bot signatures gathered from across its client base to secure websites and apps against spoofed crawlers and various other types of attacks.
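The reverse DNS check mentioned above can be sketched as follows. Google's published guidance is to reverse-resolve the visiting IP, confirm the hostname falls under googlebot.com or google.com, then forward-resolve that hostname and confirm it maps back to the same IP. This is a minimal Python sketch of that procedure; real deployments would cache results and handle DNS timeouts.

```python
import socket

# Domain suffixes under which genuine Googlebot reverse-DNS names resolve,
# per Google's published verification guidance.
GOOGLEBOT_SUFFIXES = (".googlebot.com", ".google.com")

def hostname_matches(hostname: str, suffixes=GOOGLEBOT_SUFFIXES) -> bool:
    """Check that a reverse-DNS hostname falls under an expected suffix."""
    return hostname.rstrip(".").endswith(suffixes)

def verify_crawler_ip(ip: str, suffixes=GOOGLEBOT_SUFFIXES) -> bool:
    """Reverse-resolve the IP, check the domain suffix, then
    forward-resolve the hostname and confirm it maps back to the IP."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)  # reverse DNS lookup
    except socket.herror:
        return False
    if not hostname_matches(hostname, suffixes):
        return False
    try:
        # Forward-confirm: the hostname must resolve back to the same IP
        forward_ips = {info[4][0] for info in socket.getaddrinfo(hostname, None)}
    except socket.gaierror:
        return False
    return ip in forward_ips
```

A request whose User-Agent claims to be Googlebot but whose IP fails `verify_crawler_ip` is almost certainly a spoofed crawler and can be blocked or challenged.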

Blocking Good Bots Can Also Have a Negative Impact

It’s imperative for webmasters and security specialists to keep an eye on the types of bots being blocked. Non-specialized security systems or in-house anti-bot solutions may sometimes block good bots such as search engine crawlers and lead to a negative impact on SEO. Along the same lines, blocking partner bots or social network bots could also lead to undesirable results for your business and your customers. Make sure you whitelist the essential good bots so that they can function unhindered!

Adopt an Industry-Specific Approach

I recommend webmasters, security experts and marketers take an industry-specific approach when devising bot management strategies. Depending on your business objectives and other factors specific to your organization, it’s generally a good idea to prioritize business-critical good bots such as partner bots or search engine crawlers over other types of bots.

Use a Dedicated Bot Management Solution

A dedicated solution not only provides comprehensive insights into every type of bot traffic, but also gives marketers the analytics they need to optimize their strategies and forecast changing market trends.

The Takeaway

Management of bots (both good and bad) cannot be an isolated activity in today’s complex and interconnected web and app ecosystem. With the currently intensifying focus on bot management for websites, apps and APIs, we see virtually every leading organization going beyond basic security systems and looking at bot management in a holistic way and as a keystone component in their overall security suite.

  • Good bots can (under certain circumstances) have a negative impact on organizations.
  • Webmasters, security chiefs and marketers must adopt industry-specific approaches to deal with every type of bot.
  • Marketers can analyze trends in good bot traffic to obtain deep insights to propel their marketing strategy.
  • A dedicated bot management solution is an essential component in a modern web and app security system, and helps mitigate a vast and growing range of threats that affect organizations of every kind.

Rakesh Thatha


A frequent speaker at various national and international forums, Rakesh Thatha is the CTO of ShieldSquare. He writes and speaks on cybersecurity, automated threats, and prevention measures to combat bad bots.
