You know the feeling: that nagging doubt, that itch that won’t quieten down? You know, when you’ve just pulled an all-nighter to finish the project by the deadline, but the itch is still there because something is missing. I’ve experienced that way too much recently and, according to my therapist, I need to share what is causing my discomfort.
Our current focus is dealing with PII (Personally Identifiable Information) for GDPR, CCPA and compliance in general. I’ve been reading up on our competition, analyzing the information they have placed in the murky swampland called the Internet, the place where no one knows you’re a dog*, and I started to itch. As I researched each vendor, the itch got worse. It was so strong that I would not have been surprised to see it grow arms and start to scratch itself. Every single one of the vendors claims to discover unknown PII. Well, that’s great. It means that each of them can connect to a known data source and run some form of analytics against it to find PII. That is a good start, but it was also the start of my itch.
Let’s conduct a short poll. Who here knows, with 100% certainty, where they have stored PII? And our survey says: uh-uh. With the explosion of DevOps, Shadow IT and CI/CD over the last few years, who amongst us is brave enough to say, “Processes and procedures have been uniformly followed, and of course I know where my data is”? You do? Seriously, dude? Liar, liar, pants on fire.
And that was where the itch came from. The GDPR and CCPA requirements are clear: you must be able to identify all PII the organization holds about an individual. But when you don’t even know what you have, how can you comply? The picture is incomplete if you only identify unknown PII in sources you already know about. That’s tunnel vision: you can only see what you can see, and that is nowhere near the requirement.
To complete that picture, and to comply, you must be able to discover all currently unknown data sources. To remain compliant, you must also keep that list up to date (or we have the good old unknown sources all over again). Nowadays, it’s so easy to spin up a system, duplicate an existing one or even migrate one without disrupting the production environment. Who’s going to know if I duplicate the CRM database for testing? (Yes, that one must also be covered for GDPR and CCPA, even though it is a test system.)
Enter some smart people with smart technology. Instead of looking at the existing inventory of data sources, they concluded that the correct approach is to look at network traffic. By analyzing the traffic, we can understand which network elements store PII, which share it and which process it. And when someone spins up or replicates a system, it is automatically discovered, so no more unknown sources of data. And no more unknown PII inside unknown sources. Why? Because there are no more unknown sources. Better yet, it’s plug and play. No need to babysit, just plug it in and let it report back.
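To make the idea a little more concrete, here is a toy sketch of traffic-based discovery in Python. It is emphatically not how any particular product works. It assumes you already have readable flow records as (source host, destination host, payload text) tuples, which real (often encrypted) traffic will not hand you so easily. The two regex patterns, the host names and the `discover_pii_hosts` function are all made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately naive PII patterns; real detection needs far more care.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii(payload):
    """Return the set of PII kinds whose pattern matches the payload text."""
    return {kind for kind, pattern in PII_PATTERNS.items() if pattern.search(payload)}


def discover_pii_hosts(flows, known_sources):
    """Build a per-host report from (src_host, dst_host, payload_text) records.

    A host is listed if it sent or received a payload that looked like PII.
    "known" says whether the host is already in the data source inventory.
    """
    seen = defaultdict(lambda: {"sends": set(), "receives": set()})
    for src, dst, payload in flows:
        kinds = find_pii(payload)
        if kinds:
            seen[src]["sends"] |= kinds
            seen[dst]["receives"] |= kinds
    return {
        host: {
            "sends": sorted(roles["sends"]),
            "receives": sorted(roles["receives"]),
            "known": host in known_sources,
        }
        for host, roles in seen.items()
    }


# Made-up traffic: someone has quietly copied the CRM database for testing.
flows = [
    ("crm-db", "crm-app", "name=Ann; email=ann@example.com"),
    ("crm-db", "crm-db-test-copy", "ssn=123-45-6789; email=bob@example.com"),
    ("build-server", "artifact-store", "build 4211 finished"),
]
known = {"crm-db", "crm-app"}

report = discover_pii_hosts(flows, known)
unknown = sorted(host for host, entry in report.items() if not entry["known"])
print(unknown)  # ['crm-db-test-copy']
```

The point of the sketch is the last two lines: the test copy of the CRM database shows up because PII was seen flowing into it, not because anyone remembered to add it to the inventory.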
Not only will you enjoy a true picture of your PII, you will be compliant, and that itch will simply disappear.
*Peter Steiner, The New Yorker, 1993: “On the Internet, nobody knows you’re a dog”. Isn’t it ironic to quote an iconic anonymity joke in a privacy blog?