5 Ways to Automate Data Privacy Management

by Priyadarshi Prasad on May 9, 2022

In my previous article, I discussed why data privacy management in its current form, which is all about filling out form after form, is unsustainable. In her recent address at the IAPP Global Privacy Summit in Washington, DC, FTC Chair Lina Khan characterized the prevalent approach to data privacy as “procedural protections,” and I couldn’t agree more. Tim Cook, at the same conference, talked about privacy as the “most essential battle of our time.” In other words, we are fighting the most essential battle of our time by asking our privacy and security teams to fill out more and more forms! There is no way we can lose!

Throwing paper at a problem is a very CYA approach. It works as long as the going isn’t rough (read: no breaches, no accidental exposure of sensitive data, no malicious actors within your team or your partners’ teams, and so on). However, in the absence of the right tools, this is the best our privacy and security teams can do. Thankfully, for them and for all of us who care about data privacy, there is a better approach powered by modern solutions. What are some of the characteristics of these modern solutions? How can you best take advantage of them to safeguard your sensitive data? Let’s take a look.

Automate Data Discovery

If you were taken in by the promise of data loss prevention (DLP) solutions only to get bogged down and burned by false positives and false negatives, your experience is, unfortunately, all too common. These pattern-matching solutions of yore cry wolf so many times that when the real wolf shows up, no one has the time or the inclination to take a look. DLP solutions that try to do data mapping suffer from the same challenges. However, just because DLP solutions masquerading as data mapping solutions failed, the answer is not to fall back on interviews to build a data map. Considering the rate at which data keeps changing, a manual data map is out of date before the exercise is even complete.

A good automated discovery solution should be able to handle different file formats, including text, PDF, HTML, JSON and even images. It should be able to scan both your unstructured and structured content and give you a 360-degree view of all the sensitive data you have within your organization. To avoid the pattern-matching issues that plague DLP, your solution should, ideally, look not just at the content but also at the context (natural language understanding) to assess the presence or absence of sensitive data. And because no two customer environments are ever the same, the ability to keep learning and keep getting better is key for a solution you can trust. This is where artificial intelligence (AI) can come in handy.
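To make the difference between content and context concrete, here is a minimal Python sketch. It is a toy stand-in for natural language understanding, not how any particular product works: the pattern, the cue words and the function name find_ssn_candidates are all assumptions made for illustration. It flags SSN-shaped strings and then checks whether nearby words suggest the number really is an SSN.

```python
import re

# Hypothetical pattern and context cues, chosen only for illustration.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CONTEXT_CUES = ("ssn", "social security")


def find_ssn_candidates(text, window=40):
    """Return (match, has_context) pairs for SSN-shaped strings in text.

    has_context is True when a cue word appears within `window`
    characters of the match. A pattern-only scanner would report
    every match; the context flag helps separate likely hits from noise.
    """
    results = []
    for m in SSN_PATTERN.finditer(text):
        start = max(0, m.start() - window)
        end = min(len(text), m.end() + window)
        nearby = text[start:end].lower()
        has_context = any(cue in nearby for cue in CONTEXT_CUES)
        results.append((m.group(), has_context))
    return results


if __name__ == "__main__":
    print(find_ssn_candidates("Employee SSN: 123-45-6789 on file."))
    print(find_ssn_candidates("Part number 987-65-4321 ships Monday."))
```

Both strings match the pattern, but only the first carries a context cue. A real solution would replace the keyword check with a trained model, which is where the learning mentioned above comes in.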

“Hu is coming for dinner.” Who?

An automated solution that focuses only on the what (as in, what sensitive data you are carrying) is a good starting point, but it won’t help you when all the data privacy regulations focus on individuals. The question that will be asked is, “What data are you carrying about me?” and the follow-up: “Please delete all my data.” If you know you have a million Social Security numbers but the solution doesn’t tell you whose SSNs they are, your manual work will continue.

Look for solutions that have the ability to link all the sensitive data they find across your data repositories to the identities they belong to. These identities could be your customers, your employees, your partners and so on. The important thing is when someone asks you what information you have about Jane Doe, you have a single comprehensive view of all data present in your systems about them, be it across structured data repositories, data lakes, file systems, emails, messaging platforms, CRM systems or anything else.
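As a rough sketch of what such a single view could look like in Python, the example below groups discovery findings by the identity they belong to. The Finding record, its field names and the two helper functions are hypothetical, and the sketch assumes the hard part, resolving each finding to an owner, has already been done upstream.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One piece of sensitive data found in one repository (hypothetical shape)."""
    source: str      # e.g. "crm", "slack", "email"
    data_type: str   # e.g. "ssn", "email", "phone"
    value: str
    owner_id: str    # identity the data belongs to, assumed already resolved


def build_identity_index(findings):
    """Group findings by the identity they belong to."""
    index = defaultdict(list)
    for finding in findings:
        index[finding.owner_id].append(finding)
    return dict(index)


def subject_view(index, owner_id):
    """Return everything known about one person, ordered by source and type."""
    return sorted(index.get(owner_id, []), key=lambda f: (f.source, f.data_type))


if __name__ == "__main__":
    findings = [
        Finding("slack", "ssn", "123-45-6789", "jane"),
        Finding("crm", "email", "jane@example.com", "jane"),
        Finding("email", "phone", "555-0100", "john"),
    ]
    index = build_identity_index(findings)
    for f in subject_view(index, "jane"):
        print(f.source, f.data_type, f.value)
```

With an index like this, answering “What data are you carrying about me?” becomes a lookup rather than a search across every repository.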

Automate Enforcement

I know of no organization where the privacy and security teams are not understaffed and overworked. Too many things beyond their direct control keep them on their toes all the time. For such a team, the prospect of implementing a solution that will generate more alerts, creating more issues for them to look at, is not appealing (to put it mildly).

When it comes to data privacy solutions, look for those that allow you to set up policies and then enforce those policies for you. Here is a simple example: Slack. All kinds of sensitive information are exchanged on Slack. The challenge is not just the exchange; it is that once exchanged, that information stays there forever, visible even to internal or external members who are added later.

As a privacy officer, if you deem that certain types of data should never be shared on Slack, then in addition to imparting training, let your data privacy solution enforce your policies automatically. For example, you could set a policy that all SSNs belonging to your customers (and/or employees) are automatically deleted from Slack after an hour.
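The one-hour rule above can be sketched in Python as a small policy check. This is an illustration under stated assumptions, not a Slack integration: the Message record and the expired_sensitive_messages function are hypothetical, messages are plain in-memory objects, and the actual deletion (which would go through the messaging platform's own API) is deliberately left out.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical SSN pattern; a real solution would use richer detection.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class Message:
    """Simplified chat message (hypothetical shape, not a Slack API object)."""
    message_id: str
    text: str
    posted_at: datetime


def expired_sensitive_messages(messages, now, max_age=timedelta(hours=1)):
    """Return IDs of messages that contain an SSN and are older than max_age."""
    return [
        m.message_id
        for m in messages
        if SSN_PATTERN.search(m.text) and now - m.posted_at >= max_age
    ]


if __name__ == "__main__":
    now = datetime(2022, 5, 9, 12, 0, tzinfo=timezone.utc)
    messages = [
        Message("m1", "My SSN is 123-45-6789", now - timedelta(hours=2)),
        Message("m2", "Customer SSN 987-65-4321", now - timedelta(minutes=10)),
        Message("m3", "Lunch at noon?", now - timedelta(hours=3)),
    ]
    print(expired_sensitive_messages(messages, now))
```

Run on a schedule, a check like this produces a list of message IDs to delete rather than another alert for an overworked team to triage.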

Automate Safe Sharing

As a privacy and security professional, perhaps you have at some point thought about locking everyone out of all sensitive data. Unfortunately, as tantalizing as that sounds, you know your team members in sales, marketing, finance, customer success, etc. all need access to that sensitive data to do their jobs effectively.

A solution that focuses on data privacy and not just procedures or patterns should be able to help you provide access to different audiences based on their needs. For example, a file containing a patient name, a medical record number (MRN), a date of birth (DOB), ICD-9/ICD-10 codes and a clinical research report should be shareable with a pharmaceutical company with all PHI (name, MRN, DOB) redacted. Precise attribute-centric redaction can help you mask information based on the audience the data is being shared with.
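Here is a minimal Python sketch of attribute-centric redaction using the patient file from the example above. The audience names, field names, policy table and the redact_for_audience function are assumptions made for illustration; a real policy would be defined by your privacy team, not hard-coded.

```python
# Hypothetical policy: which attributes each audience must NOT see.
REDACTION_POLICY = {
    "pharma_partner": {"patient_name", "mrn", "dob"},
    "treating_clinician": set(),
}


def redact_for_audience(record, audience, policy=REDACTION_POLICY, mask="[REDACTED]"):
    """Return a copy of record with the attributes hidden from `audience` masked.

    Unknown audiences get every attribute masked (fail closed), so a
    missing policy entry never exposes data by accident.
    """
    if audience not in policy:
        return {key: mask for key in record}
    hidden = policy[audience]
    return {key: (mask if key in hidden else value) for key, value in record.items()}


if __name__ == "__main__":
    patient_file = {
        "patient_name": "Jane Doe",
        "mrn": "MRN-0042",
        "dob": "1980-01-01",
        "icd10_codes": ["E11.9"],
        "clinical_report": "Responded well to treatment.",
    }
    print(redact_for_audience(patient_file, "pharma_partner"))
```

The same record can be shared with each audience through one function call, and the original is never modified, so the unredacted copy stays where it already lives.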

Don’t Forget About Customer Volunteered Data

In most organizations, the source of sensitive customer data is often a business or a marketing process; for example, order/payment processing, website traffic, events, etc. It is normal to look at your CRM systems or data lakes as the key places where sensitive data resides. However, customer support and dev tools increasingly contain a lot of sensitive data. For example, it is not uncommon for organizations to use Zendesk to manage their support. Their customers, while opening their cases, can sometimes share sensitive PII/PHI data, which then persists in those systems long after the case has been closed. Similarly, support cases sometimes make their way to engineering teams, and the associated logs often get uploaded to Jira. Again, these logs may contain sensitive information and will remain there forever.

As a privacy professional, you obviously don’t want to get in the way of a superior customer support experience. However, you can still contain unwanted sensitive data sprawl with automation. Consider solutions that will automatically redact sensitive data within systems like Zendesk and Jira after a ticket has been closed. This is good for your organization and your customers will appreciate your thoughtfulness, as well.
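A redact-on-close rule of this kind might look like the following Python sketch. It makes no calls to Zendesk or Jira: tickets are plain dictionaries with assumed "status" and "comments" keys, and the two patterns and function names are hypothetical examples rather than a complete PII/PHI detector.

```python
import re

# Hypothetical patterns for illustration; real PII/PHI detection needs far more.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact_text(text):
    """Replace every match of each pattern with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text


def redact_closed_tickets(tickets):
    """Return tickets with comments redacted on closed tickets only.

    Open tickets are passed through untouched so support staff can
    still see what the customer sent while the case is active.
    """
    cleaned = []
    for ticket in tickets:
        if ticket["status"] == "closed":
            ticket = {**ticket, "comments": [redact_text(c) for c in ticket["comments"]]}
        cleaned.append(ticket)
    return cleaned


if __name__ == "__main__":
    tickets = [
        {"id": 1, "status": "closed",
         "comments": ["Reach me at jane@example.com, SSN 123-45-6789"]},
        {"id": 2, "status": "open", "comments": ["My SSN is 987-65-4321"]},
    ]
    for t in redact_closed_tickets(tickets):
        print(t["id"], t["comments"])
```

Tying redaction to ticket closure keeps the support experience intact while the case is live and removes the sensitive data once it no longer serves a purpose.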

The last two decades were all about organizations grabbing as much data as they could about their customers. That data then went everywhere, unconstrained; it’s like pouring ink into a lake! Thankfully, our collective consciousness is forcing organizations to treat sensitive data properly. If you are a privacy or security professional, you are finally hearing the masses demand of organizations what you have been saying for years. Now it is time to help your organization automate data privacy management, so that your business is not only on the right side of the law but also on the right side of your customers.