The Future is Autonomous Data Protection
Behind the scenes of every organization, a constant hum of data access control decisions acts to protect against data loss and breaches. In theory, each decision is simple: compare the data to the requestor’s role, then grant or deny access. In practice, access control is an InfoSec professional’s nightmare; it’s one of the most complex, error-prone security processes, and almost no one is satisfied with the current state of affairs. Emerging purpose-based access control frameworks have the potential to meet these challenges, but only with data security solutions that can provide contextual insights into content and risk.
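To make the contrast concrete, the sketch below shows what that “simple” role-based decision looks like in code. It is a minimal illustration, not any vendor’s implementation; the roles, classification labels, and clearance mapping are invented for the example.

```python
# Minimal sketch of a traditional role-based access decision (illustrative only).
# The roles, classification labels, and clearance mapping are hypothetical.
ROLE_CLEARANCE = {
    "analyst": {"public", "internal"},
    "finance": {"public", "internal", "confidential"},
    "admin":   {"public", "internal", "confidential", "restricted"},
}

def grant_access(role: str, data_classification: str) -> bool:
    """Grant access only if the requestor's role is cleared for the data's classification."""
    return data_classification in ROLE_CLEARANCE.get(role, set())

print(grant_access("analyst", "confidential"))  # False
print(grant_access("finance", "confidential"))  # True
```

The hard part, of course, is everything the sketch takes for granted: knowing what the data actually is and how sensitive it really is.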
Existing Approaches
Data loss prevention (DLP) tools, once the de facto standard for data security, rely on rules-based filtering to protect data as it moves. Effective before the rise of remote work and the cloud, these tools have lost much of their value amid prolific sharing by end users, the spread of data across on-premises and cloud storage, and the growing cost of maintaining DLP rules and policies.
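As a rough sketch of how that rules-based filtering works, consider the toy example below; the patterns are deliberately simplified stand-ins for the hundreds of rules a real DLP deployment accumulates, which is exactly why those rule sets are so costly to maintain.

```python
import re

# Toy sketch of rules-based DLP filtering: scan outbound content against
# hand-maintained patterns and report any matches. Patterns are simplified examples.
DLP_RULES = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def check_outbound(message: str) -> list[str]:
    """Return the names of any DLP rules the outbound message violates."""
    return [name for name, pattern in DLP_RULES.items() if pattern.search(message)]

violations = check_outbound("Customer SSN is 123-45-6789")
if violations:
    print("Blocked:", violations)  # Blocked: ['ssn']
```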
Another approach, manual data classification by users, faces its own insurmountable challenges. Users struggle to consistently and accurately classify the data they own and use, and unowned or inactive content is in an even worse predicament. Stale content is often ignored by users with better things to do, yet protecting it remains vitally important because it frequently contains sensitive or regulated data.
Folder-based controls are another popular data security option. But like user-driven classification schemes, folder-based controls have two key weaknesses: they rely on end-user vigilance, and they provide only the coarsest insight into risk context and severity. Now that remote work is the norm, content is everywhere and links are the de facto way to share data. Semantic context is critical to understanding risk. The time for folder-based risk management has come and gone.
Firewalls, intrusion detection systems (IDS), and other perimeter-based controls are similarly inadequate for data protection in today’s complex environments. To a great extent, there is no longer such a thing as a perimeter. The recent verdict in the Capital One data breach case is an excellent reminder that data goes where it will go, no matter the official policy. In that 2019 breach, a former employee of the bank’s cloud provider exploited a misconfigured firewall to access and exfiltrate roughly 100 million Capital One customer records stored in the cloud.
In theory, access controls and zero-trust policies should close the gap, and most organizations already limit access using approaches based on user roles. But limited insights into the content undermine even the most stringent data protection effort. Unfortunately, the accurate, granular content and risk insights needed for adequate data protection are hard to come by.
Another problem? The burden of data protection falls on InfoSec team members, who are under immense pressure to prevent data loss and breaches. There are not enough skilled professionals to handle the workload. To make matters worse, they end up fighting their tools, forced to solve fine-grained problems with blunt instruments.
Solving the Data Protection Problem
The evolution toward purpose-based access control has the potential to resolve a myriad of data management problems while remaining agile enough to meet changing regulatory demands. But purpose-based control requires increasingly sophisticated data and risk insights. Simple classification frameworks, which sort data into a few broad buckets, can’t deliver the dynamic, contextual picture needed for purpose-based control. Producing that picture by hand is too overwhelming for today’s stretched-thin IT teams. Automation is essential.
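A purpose-based decision adds the requestor’s declared purpose to the equation: policy binds each purpose to the data categories it may legitimately touch. The sketch below is a minimal illustration of that idea under assumed purposes, categories, and policy entries, none of which come from a specific framework or product.

```python
from dataclasses import dataclass

# Minimal sketch of a purpose-based access decision. The purposes, data
# categories, and policy table below are hypothetical examples.
PURPOSE_POLICY = {
    "fraud-investigation": {"transaction-history", "customer-pii"},
    "marketing-analytics": {"aggregated-usage"},
}

@dataclass
class AccessRequest:
    user: str
    purpose: str        # the reason the requestor declares for needing the data
    data_category: str  # a content-derived label, ideally produced automatically

def decide(request: AccessRequest) -> bool:
    """Grant access only when the declared purpose covers the requested data category."""
    return request.data_category in PURPOSE_POLICY.get(request.purpose, set())

print(decide(AccessRequest("jlee", "marketing-analytics", "customer-pii")))  # False
print(decide(AccessRequest("jlee", "fraud-investigation", "customer-pii")))  # True
```

Notice that the decision is only as good as the data_category label, which is where automated, content-aware classification comes in.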
Future-ready access control solutions use natural language processing (NLP) to identify content and illuminate its meaning. In contrast to traditional approaches, which use predefined rules to sort data into preexisting buckets, an AI-based system generates insights from the data itself and creates new categories as needs and requirements change. Just as important, NLP-based analysis can run autonomously; automation is a must-have for forward-thinking data security approaches.
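The toy example below illustrates the shift from predefined buckets to data-derived categories, using classical TF-IDF features and k-means clustering from scikit-learn purely as a stand-in for the richer NLP models such a system would actually employ; the documents are invented examples.

```python
# Toy sketch of content-driven categorization: derive groupings from the documents
# themselves rather than sorting them into predefined buckets. Classical TF-IDF plus
# k-means is used here only as a stand-in for richer NLP models.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

documents = [
    "Q3 revenue forecast and budget spreadsheet",
    "Quarterly budget variance analysis",
    "Employee onboarding checklist and HR policy",
    "Customer payment card dispute records",
]

features = TfidfVectorizer(stop_words="english").fit_transform(documents)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

for label, doc in zip(labels, documents):
    print(label, doc)  # documents with similar content tend to land in the same cluster
```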
Any solution, including those based on AI, must be broadly integrated. It must reach every system and data storage location, including legacy systems, cloud volumes, shared drives, and more. It must also span both structured data and unstructured content such as PDFs, documents, and spreadsheets.
This approach supports comprehensive data access governance as it evolves toward purpose-based control. It generates the detailed content insights needed to uphold the principle of least privilege. It can also dynamically adapt access rights as user roles and other conditions change (such as updates to regulatory frameworks).
AI-based data access governance also benefits end users by giving them confidence in how they interact with and share data. Where employees were once confused about data security policies, the relevant sharing policies are now defined and interpreted for them.
Data protection, never easy, has become more challenging as data grows more voluminous, diverse, and widely distributed. Existing practices are proving inadequate. What’s needed is an autonomous data access control approach that leverages advanced technologies like AI and NLP to provide dynamic content insights and deliver risk assessments informed by semantic context. That way, InfoSec teams are not overstretched, and end users feel empowered to share data without undue concern about security.