Every effective PII protection effort addresses three critical imperatives – data discovery, access governance and risk mitigation. IT teams grappling with privacy mandates need to consider these factors across their unstructured and structured data contexts. And while regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) outline expectations for handling personally identifiable information (PII), they aren’t much help when it comes to the tactics you need to succeed. Let’s take a look at some effective strategies – and how they differ – across structured and unstructured data.
A typical organization manages unstructured data in more than 10 million files containing everything from marketing and sales information to client contracts, to employee insurance and human resources information. Discovering PII in these files remains one of the toughest data security challenges of our time, and it’s easy to understand why. It is, on the other hand, a bit harder to understand why structured data discovery can also be difficult.
Structured databases should provide an easy map to PII – but database designs often predate modern privacy regulations and, as a result, few production databases were designed with privacy in mind. Sensitive information is often scattered across different databases, in different tables and in different fields. Sometimes, PII is duplicated across tables or in unrelated databases. Finding it all can be tougher than you think, but it’s a critical first step. PII protection starts with PII discovery.
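To make the idea of PII scattered across tables and fields concrete, here is a minimal sketch of a rule-based scan over a database. It is an illustration only: the table names, the two regular expressions and the helper names `find_pii_columns` and `build_demo_db` are all assumptions made for this example, and it uses an in-memory SQLite database rather than any production system. A scan like this is far simpler than the automated discovery tools discussed in this article, but it shows how the same PII can surface in unrelated tables.

```python
import re
import sqlite3

# Hypothetical, deliberately simple patterns; real discovery needs far more than this.
PII_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def find_pii_columns(conn, sample_size=100):
    """Return sorted (table, column, pii_kind) tuples for columns whose
    sampled text values match one of the patterns above."""
    hits = set()
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        # Sample a limited number of rows per table to keep the scan cheap.
        cursor = conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (sample_size,))
        columns = [desc[0] for desc in cursor.description]
        for row in cursor:
            for column, value in zip(columns, row):
                if not isinstance(value, str):
                    continue
                for kind, pattern in PII_PATTERNS.items():
                    if pattern.match(value.strip()):
                        hits.add((table, column, kind))
    return sorted(hits)


def build_demo_db():
    """Create a small in-memory database with made-up tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER, contact TEXT, notes TEXT);
        CREATE TABLE payroll (emp_id INTEGER, tax_ref TEXT, backup_contact TEXT);
        INSERT INTO customers VALUES (1, 'ana@example.com', 'prefers phone');
        INSERT INTO payroll VALUES (7, '123-45-6789', 'ana@example.com');
    """)
    return conn


if __name__ == "__main__":
    for table, column, kind in find_pii_columns(build_demo_db()):
        print(f"{table}.{column}: {kind}")
```

Note that none of the column names in the demo hint at PII, and the same email address appears in two unrelated tables. That is why this sketch inspects values rather than relying on the schema alone.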
Fortunately, emerging automated PII discovery tools can help find PII in both structured and unstructured data. In the unstructured data world, rules and end-user classification programs have long been used in an attempt to identify PII – but they haven’t been effective or manageable. Finding PII across an organization’s databases, on the other hand, is a question of determining which databases and tables contain regulated data, identifying duplications and assessing risks. Recent artificial intelligence (AI) innovations show promise in automating discovery for both structured and unstructured data.
A clear and complete understanding of who can access PII, and how they can do it, is the key to understanding risk and implementing mitigation strategies. But these notions of “who and how” differ quite a bit for structured and unstructured data. For example, web applications backed by large-scale databases, such as those handling e-commerce operations, typically connect to those databases via a handful of service accounts, so tracing who has access isn’t usually a problem. Increasingly, however, API connections to databases extend access, sometimes outside the organization itself. Even when it is simple to determine who has access, each connection needs careful oversight.
Cataloging access for unstructured data is far more complicated. Empowered end users make highly consequential access control decisions, and those decisions are dispersed and ungoverned. Inappropriate sharing with external or personal email accounts, link sharing (especially unprotected or non-expiring links), files stored outside of designated locations and unclassified files that slip by data loss prevention (DLP) services are just a few ways data can be lost. Understanding and managing access in this context is an enormous governance challenge.
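The sharing risks listed above can be expressed as simple checks over a catalog of sharing records. The sketch below is hypothetical throughout: the `ShareRecord` fields, the `risk_flags` helper, the flag names and the short list of personal email domains are assumptions for illustration, not the export format of any real file-sharing service.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Illustrative only; a real program would maintain a much longer list.
PERSONAL_DOMAINS = {"gmail.com", "outlook.com", "yahoo.com"}


@dataclass
class ShareRecord:
    """One sharing decision made by an end user (hypothetical shape)."""
    path: str
    recipient: Optional[str]          # None means "anyone with the link"
    expires: Optional[date] = None    # None means the share never expires
    password_protected: bool = False


def risk_flags(record: ShareRecord, internal_domain: str) -> List[str]:
    """Return the risk flags that apply to a single sharing record."""
    flags = []
    if record.recipient is None:
        # Link sharing: check for missing protection and missing expiry.
        if not record.password_protected:
            flags.append("unprotected-link")
        if record.expires is None:
            flags.append("non-expiring-link")
    else:
        # Direct sharing: classify the recipient by email domain.
        domain = record.recipient.rsplit("@", 1)[-1].lower()
        if domain in PERSONAL_DOMAINS:
            flags.append("personal-email")
        elif domain != internal_domain.lower():
            flags.append("external-recipient")
    return flags


if __name__ == "__main__":
    records = [
        ShareRecord("hr/benefits.xlsx", recipient=None),
        ShareRecord("contracts/acme.pdf", recipient="sam@gmail.com"),
        ShareRecord("sales/forecast.docx", recipient="pat@corp.example"),
    ]
    for record in records:
        print(record.path, risk_flags(record, "corp.example"))
```

Checks like these only work once sharing records have been collected in one place. With dispersed, ungoverned sharing decisions, that collection step is usually the hard part.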
As with the data discovery process, recent innovations in AI can clarify who has access and whether PII access is appropriate. Replacing legacy approaches that rely on file locations, pattern-matching rules or end-user document markup, AI can assess risk based on document content and the security practices in use for similar content.
Security professionals, now armed with a clear understanding of what data they have and where the risks are, can develop more effective PII protection strategies. The tactics for protecting structured and unstructured data are, again, quite different. Here are some key tips for structured data risk mitigation:
There are also emerging tactics to consider for unstructured data:
Compliance is a complex topic; each situation is different for a particular data and regulatory environment. Having a clear understanding of how to discover, assess and protect structured and unstructured data, and their differences, provides a foundation for an effective and manageable program to protect critical PII and regulated data.