Domain squatting, typosquatting and homograph detection with Swimlane

Introduction

Domain squatting, typosquatting, and IDN homograph attacks are related techniques that malicious actors use to harvest credentials from an organization, distribute malware, harm an organization’s reputation, or otherwise maliciously impersonate a legitimate domain.

Techniques

These attacks, referred to collectively as “squatting” in this post, are a family of attacks in which a user is fooled into interacting with a legitimate-looking website at a legitimate-looking domain or URL. Any legitimate domain can be “squatted,” with its clone disguised as the legitimate domain in several ways, including:

  • Domain squatting: An actor registers a domain name the target organization is expected to want before the organization has a chance to, then holds on to it for monetary or nefarious purposes.
  • Typosquatting: An attacker registers a domain that resembles the target domain in appearance, is a likely keyboard typo of it, or uses a different TLD, and skims the traffic that people accidentally send its way.
  • IDN homograph attacks: Attackers register a domain that is visually similar or identical to a registered target domain using Internationalized Domain Names (IDNs), which allow Chinese, Arabic, Korean, Amharic, and other non-Latin characters to appear in domain names. Some characters, like the Cyrillic “а,” appear identical to certain Latin letters, meaning “apple.com” (Latin “a”) and “аpple.com” (Cyrillic “а”) can resolve to entirely different servers, with end users none the wiser.
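
To make the homograph case concrete, the short sketch below compares the two look-alike names at the code point level and shows the ASCII form each one takes on the wire. It uses only Python's standard library; the helper names `describe` and `to_ascii` are made up for illustration, and Python's built-in `idna` codec follows the older IDNA 2003 rules, so treat this as a demonstration rather than a production check.

```python
import unicodedata

legit = "apple.com"
spoof = "\u0430pple.com"  # first letter is U+0430, not the Latin "a" (U+0061)

def describe(domain):
    """Return the Unicode name of every character in the first label."""
    label = domain.split(".")[0]
    return [unicodedata.name(ch) for ch in label]

def to_ascii(domain):
    """Encode a domain to the ASCII (Punycode) form that DNS actually uses."""
    return domain.encode("idna")

print(legit == spoof)      # False: the strings differ even if they look alike
print(describe(legit)[0])  # LATIN SMALL LETTER A
print(describe(spoof)[0])  # CYRILLIC SMALL LETTER A
print(to_ascii(legit))     # b'apple.com'
print(to_ascii(spoof))     # an "xn--" Punycode name, not b'apple.com'
```

Because the two names encode to different ASCII strings, they are different domains as far as DNS is concerned, which is why comparing raw characters rather than rendered text is the safer habit.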

Impact

These techniques allow attackers to clone your domains to skim credentials (often redirecting to the real target page afterward to avoid suspicion and detection), to distribute malware alongside legitimate documents from your site, or to otherwise impersonate your organization to your customers or users.

Challenge: Detection

The first step in monitoring potential squatting domains that target the domains you control is finding them. But with hundreds of thousands of new domains being registered on a daily basis, how in the world can we find permutations of “myorganization.com,” given the seemingly infinite number of combinations we’d encounter even before introducing foreign characters, substituting zeros for Os, et cetera? Then, once a potential squatting domain is discovered by name, it must be investigated for similarity to the legitimate website. Clearly, a scripted solution is needed.

Challenge: “Sleeper Cells”

Often, these domains are registered months before an attack goes live. In the meantime they either fail to resolve or, worse, harmlessly redirect to the target domain to build up a false sense of trust in the squatting domain, which may even end up in hard-coded hyperlinks on other sites that persist after the attack component goes live. By harmlessly redirecting, appearing as a benign but unrelated organization’s webpage, or resolving to a hosting service’s domain parking page for months before they activate, these “sleeper cells” evade detection and can be an impossible time sink for analysts to check manually on a periodic basis.

Automation to the rescue!

Swimlane can ingest the list of newly registered domains on a daily basis and compare them against a list of domains you wish to monitor. Three comparisons are made between each newly registered domain and each of the domains you wish to monitor. The comparisons are:

  • CONTAINED_IN: The newly registered domain contains the monitored domain (e.g., “netflixemail.com” contains “netflix” from “netflix.com”).
  • CONFUSABLE: The newly registered domain resembles the monitored domain via an IDN homograph attack or via “confusable” characters, such as a lowercase “L” for a capital “i,” or zeros for Os.
  • LEVENSHTEIN DISTANCE: The newly registered domain is very similar to the monitored domain, save that the text is transformed slightly. The Levenshtein distance is the minimum number of single-character insertions, deletions, or substitutions needed to transform one string of characters into another. If the strings are similar enough, Swimlane will register a hit.
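
The three comparisons above can be sketched in a few lines of Python. This is not Swimlane's implementation, just one way the checks might look: the names `skeleton`, `levenshtein`, and `classify`, the tiny `CONFUSABLES` map, and the `max_distance=2` threshold are all assumptions made for illustration, and the code assumes each domain is a simple "label.tld" pair.

```python
# Illustrative subset only; a real deployment would likely use the full
# Unicode confusables data rather than this hand-picked map.
CONFUSABLES = {
    "0": "o",        # zero for the letter O
    "1": "l",        # one for a lowercase L
    "\u0430": "a",   # Cyrillic a
    "\u0435": "e",   # Cyrillic ie
    "\u043e": "o",   # Cyrillic o
    "\u0440": "p",   # Cyrillic er
}

def skeleton(text):
    """Lowercase the text and fold known look-alike characters to ASCII."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in text.lower())

def levenshtein(a, b):
    """Minimum number of single-character inserts, deletes, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]

def classify(new_domain, monitored_domain, max_distance=2):
    """Return the names of the comparisons that flag new_domain."""
    candidate = new_domain.lower().split(".")[0]
    target = monitored_domain.lower().split(".")[0]
    hits = []
    if target in candidate:
        hits.append("CONTAINED_IN")
    if candidate != target and skeleton(candidate) == skeleton(target):
        hits.append("CONFUSABLE")
    if 0 < levenshtein(candidate, target) <= max_distance:
        hits.append("LEVENSHTEIN")
    return hits

print(classify("netflixemail.com", "netflix.com"))  # ['CONTAINED_IN']
print(classify("netf1ix.com", "netflix.com"))       # ['CONFUSABLE', 'LEVENSHTEIN']
print(classify("sky-pp.com", "skype.com"))          # ['LEVENSHTEIN']
```

A single domain can trip more than one check, as “netf1ix.com” does here, so returning a list of hits rather than a single label keeps that information available to the analyst.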

Once Swimlane has identified potential squatting domains, it begins taking snapshots of those domains. Once a day, the potential squatting site, its SSL certificates, server information, WHOIS information, etc. are ingested into Swimlane, and the contents of the retrieved page are compared against previously stored contents. If the page is sufficiently similar, no additional action is needed. But if a web hosting domain parking page suddenly turns into a full webpage, or the page changes substantially in any other way, your analysts will be alerted to investigate for similarity to the monitored domain.
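
The daily "has this page changed?" check might look something like the sketch below. It uses `difflib.SequenceMatcher` from Python's standard library as a stand-in similarity measure; the post does not say how Swimlane scores similarity, so the functions `page_similarity` and `needs_review`, the 0.8 threshold, and the sample HTML strings are all hypothetical.

```python
from difflib import SequenceMatcher

def page_similarity(previous_html, current_html):
    """Return a similarity ratio between 0.0 (disjoint) and 1.0 (identical)."""
    return SequenceMatcher(None, previous_html, current_html).ratio()

def needs_review(previous_html, current_html, threshold=0.8):
    """Flag the snapshot when it has drifted too far from the stored copy."""
    return page_similarity(previous_html, current_html) < threshold

# Made-up snapshots: a parking page, a trivially edited copy, and a login form.
parked = "<html><body><h1>This domain is parked</h1></body></html>"
parked_again = "<html><body><h1>This domain is parked.</h1></body></html>"
login_clone = (
    "<html><body><form action='/login'>"
    "<input name='user'><input name='pass' type='password'>"
    "</form></body></html>"
)

print(needs_review(parked, parked_again))  # False: nearly identical
print(needs_review(parked, login_clone))   # True: the page changed substantially
```

Character-level diffing gets slow on large pages, so a real pipeline would probably compare something cheaper, such as hashes of page sections or extracted text, but the alerting logic stays the same: store yesterday's snapshot, score today's against it, and raise a flag when the score drops.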

Test example: skype.com

To test out the use case, I began monitoring some domains that are frequently cloned for credential skimming, such as netflix.com and skype.com. The monitored domain skype.com returned a number of results over the next few days as new domains were registered:

Looking at the last entry in the list, sky-pp.com, we can see that it received an HTTP 200 OK, indicating the page loaded successfully. Let’s open the Potential Squatting Domain record for sky-pp.com.

We can see that there is a snapshot for sky-pp.com (top right), as well as whether we are actively monitoring this domain (yes), the most recent status check (2019-08-20), and the search term matching algorithm used (bottom right, Levenshtein distance of 2). Let’s look at the snapshot.

Right off the bat, we can see from the screenshot that it is a clone of the Skype login page. On the left side, we can see that it resolves to the same domain it started at (sky-pp.com). On the top, we can see our investigation options. Let’s look at the Domain/Server Information tab.

The SSL certificate information certainly looks fake, with the certificate registered to the email address “root@ip-172-32-23-217”. Next we can see what resources were extracted from the page: source HTML, any hyperlinks present, and any externally hosted resources that were loaded.

Conclusion

Swimlane can be leveraged in this way to proactively detect squatting attacks against your organization, both by detecting domains as they are registered and by monitoring those domains for changes after they begin resolving.

Happy hunting!


*** This is a Security Bloggers Network syndicated blog from Swimlane authored by Nick Tausek. Read the original post at: https://swimlane.com/blog/domain-squatting-typosquatting-and-homograph-detection-with-swimlane-1/