Quantifying the impact of the Twitter fake accounts purge – a technical analysis

This post quantifies the impact of Twitter’s 2018 account purge by measuring how it affected 16k of Twitter’s most popular accounts.

Overall, we found that on average, popular accounts lost 2.8 percent of their follower bases due to the purge. In raw numbers, this represents a collective loss of over half a billion followers (546M). Amusingly, @Twitter was the biggest loser in absolute numbers, with 7.8M followers lost, followed by @katyperry, the most popular account on the platform, who lost 2.9M followers.

The purge’s impact is far from even, with some accounts losing a far larger share of their follower bases than others. Some (unlucky?) accounts had over 50 percent of their follower bases vanish, including San Francisco-based coffee brand Pooki’s Mahi, which saw 84 percent of its follower base disappear. As of July 20th, it has 95k followers, down from 596k before the purge.

Some accounts, such as @DRJAMESCABOT, stand out both in terms of raw followers lost (-2.8M) and the percentage of follower base lost (-70%). In contrast, while former president Barack Obama lost 2.2M followers, this only represents two percent of his follower base, a little better than the average 2.8 percent decrease in our dataset.

The most popular Twitter users collectively lost over half a billion followers (546M) due to the purge. While this represents an average decrease of 2.8 percent of their followers bases, some accounts lost more than 70 percent of their followers.

To provide the most comprehensive technical analysis of the account purge, this post covers the following topics in turn:

  1. Dataset overview: What the dataset looks like and how the data was collected and processed.
  2. Overall impact: A high-level overview of the purge’s impact on follower bases.
  3. Impact by category: Which categories (e.g., Music, Brand, e-Celebrity) were most affected by the purge.
  4. Most impacted users: Which users lost the most followers and the largest percentage of their follower base.
  5. Most impacted users per category: Which users in each popular category (Politics, News, Website …) lost the most followers or the largest percentage of their follower base.

Disclaimer: This post and the related analysis were done in our spare time and do not represent our respective employers’ points of view.

With this out of the way, let’s get started!

Dataset overview

Here is a brief overview of the data collected.

Methodology

To measure the impact of the purge, we crawled 16,474 of the most popular accounts on the day of the purge announcement (July 12th, 2018), and took this as our baseline. We then recrawled the same accounts two days later (July 14th, 2018) and again seven days after the baseline (July 19th, 2018).

The reason for the second re-crawl and delaying this post was to ensure that we had captured the entire effect of the purge, as Twitter stated in the announcement that the purge would take place over the next several days. Comparing the two crawls done post-purge also enabled us to detect the top “rebounders,” that is, users who lost a lot of followers and then regained them very quickly.

For each account, we collected the followers count, number of likes, and following count. While the purge is mostly visible through loss of followers, there are some instances, discussed later in the post, where accounts lost a huge number of both followers and followed accounts, which suggests that they were enrolled (willingly or not) in some sort of influence trading.
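The per-account data described above can be pictured as one small record per crawl. The sketch below is only an illustration: the `Snapshot` class, its field names, and the sample numbers are assumptions, not the actual schema used for the study.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Snapshot:
    """One crawl of one account (hypothetical schema, not the authors' format)."""
    handle: str
    crawled_on: date
    followers: int   # accounts following this handle
    following: int   # accounts this handle follows
    likes: int


def delta(before: Snapshot, after: Snapshot) -> dict:
    """Change in each counter between two crawls of the same account."""
    if before.handle != after.handle:
        raise ValueError("snapshots belong to different accounts")
    return {
        "followers": after.followers - before.followers,
        "following": after.following - before.following,
        "likes": after.likes - before.likes,
    }


# Made-up numbers for a hypothetical account.
baseline = Snapshot("example", date(2018, 7, 12), 1_000, 200, 50)
recrawl = Snapshot("example", date(2018, 7, 19), 900, 180, 55)
print(delta(baseline, recrawl))
```

Keeping the following count next to the followers count is what makes it possible to spot accounts that lost on both sides at once.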

To enable reproducibility, the dataset collected for this study and the related Jupyter notebook are available upon request while we work on the full paper and will be made public afterward.

Followers distribution

[Figure: followers distribution]

Overall, the bulk of the accounts in our dataset had between 10k and 1M followers before the purge. The 3,828 most popular accounts (23 percent of the dataset) had over 1M followers, with the top 314 (the top two percent) having over 10M followers. We also have a few accounts (1,147, or about seven percent) with a small follower base of fewer than 1,000 followers.
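A distribution like this one boils down to bucketing each account by its follower count. Here is a minimal sketch, assuming bucket edges that mirror the ranges quoted above; the `BUCKETS` list, the `bucket` helper, and the sample counts are illustrative and not taken from the dataset.

```python
from collections import Counter

# Bucket edges are assumptions chosen to match the ranges quoted in the text.
BUCKETS = [
    (1_000, "under 1k"),
    (10_000, "1k-10k"),
    (1_000_000, "10k-1M"),
    (10_000_000, "1M-10M"),
]


def bucket(followers: int) -> str:
    """Return the label of the first bucket whose upper bound exceeds the count."""
    for upper_bound, label in BUCKETS:
        if followers < upper_bound:
            return label
    return "over 10M"


# Made-up follower counts, not taken from the dataset.
sample = [450, 25_000, 600_000, 3_400_000, 109_600_000]
histogram = Counter(bucket(count) for count in sample)
print(histogram)
```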

Important limitation: It is likely that many of the suspended accounts followed several of the popular accounts that we crawled and are therefore counted multiple times. There is no way for us to deduplicate them, as we only collected high-level statistics for each of the crawled accounts. Accordingly, this post focuses on evaluating the impact of the purge on follower bases and makes no claim as to how many accounts were really suspended.
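To make this limitation concrete, here is a toy example with invented data, where we pretend to know which suspended accounts followed whom (something the real crawl could not observe). Summing per-account losses overstates the number of distinct suspended accounts.

```python
# Hypothetical toy data: which (fake) accounts followed each popular account.
followers_of = {
    "popular_a": {"fake_1", "fake_2", "fake_3"},
    "popular_b": {"fake_2", "fake_3", "fake_4"},
}

# What aggregate counters give: each popular account loses 3 followers.
summed_loss = sum(len(fakes) for fakes in followers_of.values())

# What actually happened: only 4 distinct accounts were suspended.
distinct_suspended = len(set().union(*followers_of.values()))

print(summed_loss, distinct_suspended)
```

With only follower counters available, the first number is the only one that can be measured, which is why the totals in this post describe follower losses and not suspended accounts.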

Category breakdown

To study how various Twitter populations have been impacted by the purge, we combined the categories provided by various ranking sites to bucket accounts into a broad set of categories. While this categorization is far from perfect, it is good enough for our purposes.

[Figure: followers distribution per category]

Each category has between ~50 and ~300 users. As reported in the chart above, the vast majority of the follower base is concentrated in a few categories, including Music (22.8B followers), Movies (1.6B), Sport athletes (1.1B), Politics (987M), and Entertainers (948M).

As noted above, we can assume that the total number of followers by category is grossly overestimated, given that the same follower potentially follows multiple accounts belonging to a given category. Despite this limitation, our ranking allows us to understand where most of the follower base is concentrated and will later help illuminate the difference between the categories that lost the most versus the ones that are truly the most popular (spoiler alert: they are different!).

Categories description

Here is a brief description of some of our major categories for further clarity.

The Music category consists mostly of the accounts of singers, such as Katy Perry, who also has the most popular account in our dataset, with 109M followers before the purge (-3%). The Celebrity category is fairly diverse, with personalities such as Kim Kardashian (60M followers, -3%) and Bill Gates (46M followers, -2%).

The Movie category mainly includes film directors, actors, and TV show personalities, including Jimmy Fallon (51M followers, -2%). As mentioned earlier, our categorization is imperfect and some of the accounts in the Movie category, such as Selena Gomez (57M followers, -2%), would also fit in other categories, such as Music.

Politics groups U.S. political figures, such as former President Obama (103.6M followers, -2%) and President Trump (53.3M followers, less than one percent lost), with international figures such as Narendra Modi (Indian Prime Minister, 43.1M followers, -1%) and political organizations such as @PMOIndia (Office of the Prime Minister of India, 26.6M followers, less than one percent lost).

The “Other” category covers a swath of accounts (200) that have a high follower count but are hard to classify, such as @CuteEmergency (9.6M), self-described as “critiquing the cutest animals.”

Overall impact of the account purge

This section offers an overview of the overall impact of the account purge. The macro level view adopted here aims at distinguishing the expected or normal purge impact from anomalies or disproportionate impact observed for certain accounts or categories, which we will drill into later.

Absolute follower loss

Overall, seven days after the beginning of the purge, our 16k accounts had 18.5B followers left (down from 19.1B). This represents a collective loss of 545M followers, which amounts to a loss of 2.9 percent of their collective followers bases (the standard deviation being 9.7 percent).
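Note that a collective percentage and a per-account average are two different aggregations, and they need not coincide. The sketch below uses three invented accounts to show both computations side by side; the variable names and numbers are illustrative only.

```python
from statistics import mean, pstdev

# (followers before, followers after) for three made-up accounts.
accounts = [(1_000, 900), (100, 50), (10_000, 9_900)]

total_before = sum(before for before, _ in accounts)
total_after = sum(after for _, after in accounts)

# Collective view: every follower weighs the same, so big accounts dominate.
collective_loss = total_before - total_after
collective_pct = collective_loss / total_before

# Per-account view: every account weighs the same, whatever its size.
per_account = [(before - after) / before for before, after in accounts]
average_pct = mean(per_account)
spread = pstdev(per_account)

print(f"collective: {collective_loss} followers ({collective_pct:.1%})")
print(f"per-account mean: {average_pct:.1%}, std dev: {spread:.1%}")
```

In this toy data, one small account that lost half of its followers pulls the per-account mean far above the collective figure, which is also why the standard deviation is worth reporting alongside the mean.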

[Figure: overall follower loss breakdown]

As alluded to in the introduction, the most interesting aspect of this account purge is that it had drastically different effects on various accounts’ followers bases. This uneven impact is highlighted in the graph above, which depicts a bucketed breakdown of the number of followers lost. In particular, we identify 1,095 accounts that lost over 100k followers, including 64 that lost over 1M followers.

Over 1,000 accounts lost over 100k followers due to Twitter’s account purge. Sixty-four lost over 1M followers.

Did any accounts gain followers?

Out of the 16,474 top accounts crawled, only 2,622 accounts gained followers within the span of seven days, while pretty much everyone else lost followers. The most notable gain is for @FortniteGame (4.8M, +4%), the popular game.

The purge coincided with the soccer World Cup final, which unsurprisingly led to an increase in follower counts for many soccer players, including @AntoGriezmann (5.2M, +3%) and @paulpogba (5.3M followers, +2%), sharing the podium with @equipedefrance (3.8M, +2%).

Follower base loss

While looking at the purge’s raw numbers provides a sense of its scale, it is useful to look at the relative loss of followers bases to really understand its impact, as Twitter accounts vary widely in their followers bases to start with.

How to compute the follower base loss and interpret it

Each account’s follower base loss is computed as the share of its pre-purge followers (July 12, 2018) that were gone post-purge (July 19, 2018), that is, one minus the ratio of the post-purge followers count to the pre-purge followers count. For example, if an account had 1,000 followers pre-purge and 900 post-purge, its follower base loss would be: 1 - 900/1000 = 0.1, or 10 percent.
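As a minimal sketch, the relative loss is the number of followers lost divided by the pre-purge count. The function name `follower_base_loss` is ours, chosen for illustration; the second example uses the Pooki’s Mahi counts quoted in the introduction.

```python
def follower_base_loss(pre: int, post: int) -> float:
    """Fraction of the pre-purge follower base that is gone post-purge."""
    if pre <= 0:
        raise ValueError("pre-purge followers count must be positive")
    return (pre - post) / pre


# Example from the text: 1,000 followers before, 900 after.
print(f"{follower_base_loss(1_000, 900):.0%}")

# Pooki's Mahi: 596k followers before the purge, 95k after.
print(f"{follower_base_loss(596_000, 95_000):.0%}")
```

A negative result simply means the account gained followers over the period.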

Given, as reported earlier, that the average follower base loss is 2.9 percent, with a standard deviation of 9.7 percent, any loss over 10 percent should be considered a strong anomaly. A loss above five percent might also be anomalous, but only for accounts with a very large follower base, which are less susceptible to variance.
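These two rules of thumb can be written down as a small classifier. This is a sketch under stated assumptions: the text does not say what counts as a “very large” account, so the 1M cutoff (`large_account`) is our assumption, as are the function name and the labels.

```python
def classify_loss(loss: float, followers_before: int,
                  large_account: int = 1_000_000) -> str:
    """Label a follower base loss using the thresholds discussed above.

    `large_account` is an assumed cutoff for a "very large" follower base.
    """
    if loss > 0.10:
        return "strong anomaly"
    if loss > 0.05 and followers_before >= large_account:
        return "possible anomaly"
    return "expected"


print(classify_loss(0.70, 4_000_000))    # a loss like @DRJAMESCABOT's
print(classify_loss(0.06, 50_000_000))   # hypothetical large account
print(classify_loss(0.06, 20_000))       # hypothetical small account
```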

Follower loss distribution

[Figure: overall follower base loss breakdown]

The chart above summarizes the distribution of the relative loss across all the accounts monitored. As expected, the vast majority of the accounts lost between one and five percent of their followers bases.

On the followers gain side, 433 accounts gained followers, and 3,859 accounts lost less than one percent of their follower bases. Accounts that lost one percent or less include @realDonaldTrump (U.S. president, 53M), @PMOIndia (office of the prime minister of India, 26M), @imVkohli (Indian cricketer, 25M), @ChampionsLeague (European soccer league, 21M), and @Xbox (gaming console, 12M).

On the losing side, 1,549 accounts lost over five percent of their follower bases, including 719 that are considered very anomalous given that they lost over 10 percent of their followers, which is more than the standard deviation. Those anomalous accounts are almost certainly (and unwillingly?) involved in one fake followers scheme or another. Among the users that lost the most, there are quite a few popular accounts with well over 1M followers, including David Copperfield (@D_Copperfield), who lost most of his follower base (-68%, or 2.3M followers lost).

Note that it is clear that at least some of those accounts were unwilling participants, given that Twitter’s own account suffered from a relative loss of 12 percent.

About 4.4 percent of Twitter’s most popular accounts lost over 10 percent of their follower bases during the Twitter purge, suggesting that they were (unwillingly?) involved in a fake accounts scheme.

Measuring influence trading via following loss

[Figure: overall following loss breakdown]

The purge mostly impacted followers counts, while the following counts remained almost the same, with most accounts losing less than 0.1 percent of the accounts they followed. However, as the chart above shows, we did find 358 accounts that each lost over 1,000 of the accounts they followed.

Chief among the biggest losers is @ItzBJJ, which describes itself as “a complete Social Media Solution for businesses and individuals looking to increase their social presence.” All of the 379k accounts it followed before the purge were wiped out (a 100 percent loss). @ItzBJJ also lost 76 percent of its follower base, initially at 1.4M.

Self-proclaimed digital strategist @BurakTorunResmi, from Turkey, is the only account to have lost over 1M followed accounts (-1.2M), which represented 77 percent of his following base. He also lost 1.6M of his 3M followers (-55% of his follower base) but seems to have regained them quickly, as he is now back to 3M+ followers.

Here is the list of the 10 accounts that lost the most followed accounts in absolute numbers, among those that initially had over 100k followers:

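| Handle | Initial Following | Following loss | Following base loss |
|---|---|---|---|
| BurakTorunResmi | 1571K | -1208K | -77% |
| Qu_70 | 644K | -591K | -92% |
| ItzBJJ | 379K | -379K | -100% |
| udon_lov | 1300K | -372K | -29% |
| TheReal1Witness | 739K | -317K | -43% |
| defjam | 180K | -115K | -64% |
| 3GB_1 | 519K | -80K | -15% |
| shwood | 843K | -77K | -9% |
| TalhaYilmaz | 475K | -60K | -13% |
| contraenvidia | 402K | -53K | -13% |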
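The rows above are ordered by absolute following loss. As a quick sketch, the same rows can be re-ranked by the fraction of the following base lost. The values are copied from the table in thousands; the exact handle boundaries for Qu_70, udon_lov, and 3GB_1 are our best-effort reading of the table and should be treated as assumptions.

```python
# Rows from the table above, in thousands: (handle, initial following, following lost).
# Handle boundaries for Qu_70, udon_lov and 3GB_1 are a best-effort reading.
rows = [
    ("BurakTorunResmi", 1571, 1208),
    ("Qu_70", 644, 591),
    ("ItzBJJ", 379, 379),
    ("udon_lov", 1300, 372),
    ("TheReal1Witness", 739, 317),
    ("defjam", 180, 115),
    ("3GB_1", 519, 80),
    ("shwood", 843, 77),
    ("TalhaYilmaz", 475, 60),
    ("contraenvidia", 402, 53),
]

# Fraction of the following base lost, as a whole percentage.
pct_lost = {handle: round(100 * lost / initial) for handle, initial, lost in rows}

# Re-rank by relative loss instead of absolute loss.
top5_by_fraction = sorted(pct_lost, key=pct_lost.get, reverse=True)[:5]
print(top5_by_fraction)
```

The recomputed percentages agree with the last column of the table, which is a handy sanity check on the way the rows were split.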

Purge impact per category

This section looks at the impact of the purge on the most popular categories, to showcase how the purge affected some verticals more than others.

Absolute follower loss

[Figure: followers loss distribution per category (scatter plot)]

Contrasting the number of followers per category with the total number of followers lost reveals several discrepancies. The chart above plots the top categories’ popularity ranks against their loss ranks (ranking by followers count loss). The categories above the diagonal are the least affected by the purge, as their loss rank is lower than their popularity rank. On the other hand, the ones on the other side of the diagonal are the categories that are more affected by followers loss than expected.

Chief among these categories are: Other (-24M followers), Celebrity (-21M), Brand (-11M), Website (-16M) and e-Celebrity (-18.3M), all of which lost many more followers than expected. The categories that exhibit the largest differences in ranking are somewhat expected, as for users in these categories, popularity is key to their success (especially for e-celebrities, aka Internet famous celebrities).
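The rank comparison behind this kind of chart can be sketched as follows. The category names and totals are invented for illustration, and the `rank_desc` helper is ours; this is not the code used to produce the chart.

```python
def rank_desc(values: dict) -> dict:
    """Rank keys from largest value (rank 1) to smallest."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


# Invented totals for three hypothetical categories.
followers = {"cat_a": 900, "cat_b": 500, "cat_c": 100}
followers_lost = {"cat_a": 10, "cat_b": 5, "cat_c": 40}

popularity_rank = rank_desc(followers)
loss_rank = rank_desc(followers_lost)

# Positive shift: the category ranks higher for losses than for popularity.
rank_shift = {c: popularity_rank[c] - loss_rank[c] for c in followers}
print(rank_shift)
```

In this toy data, `cat_c` is the least popular category yet ranks first for losses, which is the pattern the text describes for categories such as e-Celebrity.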

Relative follower loss

[Figure: follower base loss distribution per category]

Looking at the categories ordered by the mean loss of follower base reveals that some categories are clearly anomalous: the median user in each of the top 10 categories in that ranking lost at least six percent of its follower base. The e-Celebrity category dwarfs every other category when it comes to follower base loss, with a whopping 12 percent loss for its median user. This suggests once again that fake followers are mostly used by “Internet famous” users and accounts related to online activities.
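Here is a minimal sketch of computing the median user’s loss per category and ranking categories by it. The loss values are invented for illustration; only the category names come from the post.

```python
from collections import defaultdict
from statistics import median

# (category, follower base loss) pairs; values are invented for illustration.
losses = [
    ("e-Celebrity", 0.12), ("e-Celebrity", 0.30), ("e-Celebrity", 0.04),
    ("Music", 0.03), ("Music", 0.02), ("Music", 0.70),
]

by_category = defaultdict(list)
for category, loss in losses:
    by_category[category].append(loss)

# The median resists a single extreme account, unlike the mean.
median_loss = {cat: median(vals) for cat, vals in by_category.items()}
ranking = sorted(median_loss, key=median_loss.get, reverse=True)
print(ranking, median_loss)
```

Looking at the median user is a useful complement to the mean here: a single account with a 70 percent loss barely moves a category’s median, so a high median indicates that the problem is widespread within the category.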

[Figure: heatmap of follower base loss for the top 100 users of each category]

The heatmap above displays the follower base loss rate for the top 100 users of the categories discussed above: the redder a cell is, the larger the follower base loss for that user. Categories are sorted by their users’ average follower base loss.

Overall, this heatmap clearly highlights how endemic the fake followers issue is for the “Internet famous” categories, which were hit the hardest by the purge in terms of follower base loss. This loss is not due to a single user but rather to a large number of users in those categories who lost far more followers than the average Twitter user.

Most impacted users

This section dives into which users were most affected by the purge through the lens of diverse rankings.

Accounts that lost the most followers

Top 15 accounts by followers lost

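| Handle | Initial Followers | Num Follower Loss | Fract. follower base loss |
|---|---|---|---|
| Twitter | 62.9M | -7.8M | 12% |
| katyperry | 109.6M | -2.9M | 3% |
| DRJAMESCABOT | 4.0M | -2.8M | 70% |
| FilippaLinrooos | 3.5M | -2.8M | 80% |
| justinbieber | 106.7M | -2.7M | 3% |
| ladygaga | 79.0M | -2.6M | 3% |
| AdelAliBinAli | 6.2M | -2.4M | 38% |
| BIGMONEYMIKE6 | 3.0M | -2.3M | 79% |
| patrickabrar | 3.3M | -2.3M | 70% |
| taylorswift13 | 85.6M | -2.3M | 3% |
| rihanna | 89.0M | -2.3M | 3% |
| D_Copperfield | 3.4M | -2.3M | 68% |
| BarackObama | 103.6M | -2.2M | 2% |
| BRUNOIERULLO | 3.4M | -2.2M | 64% |
| britneyspears | 58.3M | -2.1M | 4% |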

Overall, all users in the top 15 group lost over 2M followers. Looking at how much of their follower base this loss represents paints a very polarized picture.

On the one hand, we have a set of accounts including @katyperry, @justinbieber, @ladygaga, and @BarackObama, for which this loss represents only a small fraction of their follower bases (about three percent or less), which is in line with the average.

On the other hand, we have a set of accounts for which this loss represents the majority of their followers bases. This includes @DRJAMESCABOT, @FilippaLinrooos, @BIGMONEYMIKE6, @patrickabrar, and @D_Copperfield. While @AdelAliBinAli appears to have lost only 38 percent of his followers base, he is a special case: He regained over 1M followers between July 14 and July 19, as discussed below.

Nearly half of the 15 accounts that lost the most followers lost over 50 percent of their follower bases as a result of the Twitter purge.

Top 15 of the biggest losers that had fewer than 1M followers

Looking at the accounts that had fewer than 1M followers and those with fewer than 100k reveals a completely different picture than what we see from the overall top 15. For those segments, the entire top 15 of the biggest losers is made up of accounts that lost the vast majority of their follower bases. This suggests, once more, that fake followers mostly affect “Internet famous” people rather than well-established celebrities.

Accounts that lost most of their followers bases

Looking at the various segments of users that lost the largest fraction of followers reveals the existence of anomalous users across the board, whether they belong to the super popular segment (>1M followers), the semi-popular segment (100k–1M), or even the lesser known users segments (under 100k).

The following interactive table showcases the top 15 users for each of the three aforementioned segments. As reported in the “Fract. Follower base loss” columns, all those users lost at least 50 percent of their follower bases.

Users who bounced back during the purge

As discussed earlier, we crawled users very early after the purge began (July 14th) and again a week after it began (July 19th). Contrasting the number of followers between those two crawls gave us the opportunity to detect whether some users were able to get their followers back, and, if so, by what margin.

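| Handle | Initial Followers | 14th Jul. Followers | 19th Jul. Followers | Max loss | Regained Followers |
|---|---|---|---|---|---|
| AdelAliBinAli | 6.2M | 2.8M | 3.8M | 3400K | 1024K |
| AlaattinCAGIL | 3.2M | 1.6M | 2.6M | 1535K | 995K |
| BurakTorunResmi | 3.0M | 1.1M | 1.4M | 1919K | 282K |
| PDChina | 5.0M | 4.3M | 4.3M | 699K | 16K |
| ammr | 0.7M | 0.4M | 0.4M | 293K | 7K |
| Twitter | 62.9M | 55.1M | 55.1M | 7780K | 6K |
| ARTEM_KLYUSHIN | 1.7M | 0.9M | 0.9M | 737K | 2K |
| PaulKagame | 1.8M | 1.2M | 1.2M | 618K | 1K |
| BenjaminEnfield | 1.5M | 1.1M | 1.1M | 392K | 1K |
| NASCARPEAKMX | 0.1M | 0.1M | 0.1M | 17K | 1K |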

As reported in the table above, which showcases the top 10 rebounders by followers regained, the first three were able to regain a very large number of followers within days. The most striking example of those massive rebounds is the account of @AdelAliBinAli. Before the purge, this account had 6.2M followers. On July 14, its follower count was down to 2.8M followers (-3.4M / -55%). Between July 14 and July 19, it regained about 1M followers, and as of July 23rd, this account has 6.6M followers, which is even more than before the purge and represents 3.9M additional followers gained.
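The rebound figures can be derived from the three crawls as sketched below. The definitions are our reading of the table columns (maximum loss measured against the lowest crawl, regained followers as the gain between July 14 and July 19), and the `rebound` helper is hypothetical.

```python
def rebound(initial: int, jul14: int, jul19: int) -> dict:
    """Summarize loss and recovery across the three crawls of one account."""
    lowest = min(jul14, jul19)
    regained = max(jul19 - jul14, 0)
    return {
        "max_loss": initial - lowest,
        "regained": regained,
        "regained_share_of_initial": regained / initial,
    }


# @AdelAliBinAli, using the rounded counts shown in the table above.
adel = rebound(6_200_000, 2_800_000, 3_800_000)
print(adel)
```

Because the inputs here are the rounded counts from the table, the results differ slightly from the exact figures it reports (for instance, 1,024K regained).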

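| Handle | Initial Followers | 14th Jul. Followers | 19th Jul. Followers | Max loss | Perc. Regained Followers |
|---|---|---|---|---|---|
| bugragulcanim | 1.0M | 0.6M | 1.0M | 423K | 39% |
| RdQ88 | 1.2M | 0.8M | 1.2M | 481K | 37% |
| AlaattinCAGIL | 3.2M | 1.6M | 2.6M | 1535K | 32% |
| AdelAliBinAli | 6.2M | 2.8M | 3.8M | 3400K | 17% |
| BurakTorunResmi | 3.0M | 1.1M | 1.4M | 1919K | 9% |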

Ranking the rebounders by the share of their follower base regained shows that all of the top five were able to recover part of their losses. The top three recovered over 30 percent of their follower bases in less than a week.

To put things in perspective, @AntoGriezmann (one of the most popular French soccer players, initially at 5.3M) only gained 150k followers over seven days despite being part of the soccer World Cup winning team. Determining how those accounts were able to make such an impressive recovery is left as an exercise to the reader.

Most impacted users per category

Finally, to wrap up this blog post, let’s look at the accounts that lost the most followers and the accounts that lost most of their follower bases for various categories.

Accounts that lost the most followers

Accounts that lost most of their followers bases


A bientôt!



*** This is a Security Bloggers Network syndicated blog from Elie on Internet Security and Performance authored by Elie Bursztein. Read the original post at: http://feedproxy.google.com/~r/inftoint/~3/9qi6odje3pU/quantifying-the-impact-of-the-twitter-fake-accounts-purge-a-technical-analysis