The European Union’s General Data Protection Regulation (GDPR) is almost upon us and while businesses are scrambling to ensure they are compliant, another discussion is happening within the information security space among analysts: What’s going to happen to WHOIS? Greatly celebrated for its ability to form connections and break open cyberthreat investigations, WHOIS may—or may not—go away entirely due to GDPR. One thing’s for sure, it won’t remain what it is today.
For anyone not following the ICANN news or registrar changes, the concept of losing WHOIS may come as a surprise. Regulators have their sights on WHOIS because GDPR changes what is considered personal or private information. WHOIS, commonly thought of as the phone book of the internet, serves as a registry of personal information for those who’ve registered domains. Because anyone can query it, it is considered a significant privacy leak.
To the casual observer, it makes sense to remove WHOIS from public view or, at the very least, hide data deemed personal. However, these changes make it difficult for cyberthreat analysts to differentiate between legitimate, compromised and malicious domains. Additionally, without point-of-contact information for a domain owner, it’s even more difficult to notify someone when a website may be compromised or infringing on a company’s trademarks or brand.
Some of you may be thinking to yourself, “Well, my domain is privacy protected, doesn’t that already hide contact details?” The answer is yes. Over the past few years, analysts have seen a rise in the use of privacy protection services, which render the WHOIS record less useful for analysis, but this is not the norm for the tens of thousands of domains being registered every day.
One proposal to minimize WHOIS disruption, while still respecting privacy concerns, would require individual email addresses to be hashed using the same cryptographic hash algorithm across databases. The registrant email would be hashed uniformly, allowing analysts to pivot off the hash value while still obscuring the personal email address itself.
As an experiment, we implemented an extreme version of this concept (hashing all the fields) and demonstrated that connections could still be made, though a lot of contextual data is lost. Furthermore, there is no consensus that providing this pivoting mechanism in a public WHOIS directory would be GDPR-compliant, as it may allow connections to be drawn that would identify a person not otherwise identifiable.
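To make the hashing proposal concrete, here is a minimal Python sketch of how uniformly hashed registrant emails could still support pivoting. It is an illustration only, not the experiment described above: the choice of SHA-256, the normalization step, the field names and the sample records are all assumptions, since no particular algorithm or record format has been agreed upon.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Trim and lowercase so the same address always hashes the same way.

    This normalization rule is an assumption; any real scheme would need
    every registrar to agree on one.
    """
    return email.strip().lower()


def hash_field(value: str) -> str:
    """Hash a field with SHA-256 (assumed algorithm, chosen for illustration)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def redact_record(record: dict) -> dict:
    """Return a copy of the record with the registrant email replaced by its hash."""
    redacted = dict(record)
    redacted["registrant_email"] = hash_field(normalize_email(record["registrant_email"]))
    return redacted


def group_by_email_hash(records: list) -> dict:
    """Group domains by hashed registrant email, which is what lets analysts pivot."""
    groups = {}
    for record in records:
        redacted = redact_record(record)
        groups.setdefault(redacted["registrant_email"], []).append(redacted["domain"])
    return groups


# Hypothetical sample records; the domains and addresses are made up.
RECORDS = [
    {"domain": "example-one.test", "registrant_email": "Owner@Example.test"},
    {"domain": "example-two.test", "registrant_email": "owner@example.test "},
    {"domain": "example-three.test", "registrant_email": "someone.else@example.test"},
]

if __name__ == "__main__":
    for email_hash, domains in group_by_email_hash(RECORDS).items():
        print(email_hash[:12], domains)
```

The first two domains land in the same group even though the address itself never appears in the output. Note that an unsalted hash can be confirmed by anyone who already knows or guesses the address, which is one reason a scheme like this may not settle the privacy question.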
Not all hope is lost as we await the fate of WHOIS, whether it’s going away completely, or a new accredited access model will take its place. RiskIQ and many others within the space have recognized the value in having multiple data sets to aid in threat investigations. RiskIQ currently has 11 data sets beyond WHOIS including passive DNS, SSL certificates, subdomains, OSINT, host pairs, trackers and more. While these data sets aren’t a complete substitute for WHOIS, they often surface more information or connections that would have otherwise gone unnoticed.
We believe that doing any work—good or bad—on the internet will result in “signals,” pieces of information generated from performing any action, that can then be used to form analyst connections. Using a process we define as “Infrastructure Analysis,” it’s possible for anyone to use a starting indicator (such as an IP address) and easily pivot around to discover related entities.
In the above image, we define the starting point as a piece of malware. Within that malware, maybe we identify an IP address and an SSL certificate used to encrypt command and control traffic. Maybe that SSL certificate includes a domain for which it was issued and an IP address where it was hosted. Finally, maybe that IP address has a different domain connected to it through passive DNS, or the domain has a unique tracking script within the web page it’s hosting.
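The chain described above can be sketched as a simple graph walk. The Python below is a hedged illustration, not RiskIQ's implementation: the observation tuples, indicator labels and the `pivot` helper are hypothetical, and the addresses come from reserved documentation ranges. Each observation links two indicators seen together in some data set, and a breadth-first search from the starting indicator finds everything reachable within a set number of hops.

```python
from collections import deque

# Hypothetical observations: (indicator, related indicator, data set or relationship).
OBSERVATIONS = [
    ("malware:sample-1", "ip:203.0.113.10", "network traffic"),
    ("malware:sample-1", "cert:abc123", "SSL certificate"),
    ("cert:abc123", "domain:c2.example.test", "SSL certificate"),
    ("cert:abc123", "ip:203.0.113.20", "SSL certificate"),
    ("ip:203.0.113.20", "domain:other.example.test", "passive DNS"),
    ("domain:other.example.test", "tracker:UA-000000-1", "trackers"),
    ("domain:unrelated.example.test", "ip:198.51.100.5", "passive DNS"),
]


def build_graph(observations):
    """Build an undirected adjacency map so pivots work in both directions."""
    graph = {}
    for left, right, _source in observations:
        graph.setdefault(left, set()).add(right)
        graph.setdefault(right, set()).add(left)
    return graph


def pivot(graph, start, max_hops=3):
    """Breadth-first search returning each reachable indicator and its hop count."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] >= max_hops:
            continue  # do not expand past the hop limit
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


if __name__ == "__main__":
    graph = build_graph(OBSERVATIONS)
    for indicator, distance in pivot(graph, "malware:sample-1").items():
        print(distance, indicator)
```

Limiting the number of hops is a design choice in this sketch: shared hosting and common certificates can connect unrelated infrastructure, so each additional hop tends to add noise along with new leads.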
Security teams who subscribe to using more data sets in their investigations know the value of forming chains, like the one described above. More data ultimately results in more connections or more supporting evidence for an analyst hypothesis. If WHOIS continues to go “dark” temporarily—and we hope it doesn’t completely—we are, relatively speaking, still in a great position to enable defenders to protect their organizations and accelerate their investigations.