Title: Protecting What Matters: Defining Data Guardrails and Behavioral Analytics
This is the second post in our series on Protecting What Matters: Introducing Data Guardrails and Behavioral Analytics. In our first post, Introducing Data Guardrails and Behavioral Analytics: Understand the Mission, we introduced the concepts and outlined the major categories of insider risk. In this post we define the concepts.
Data security has long been the most challenging domain of information security, despite being the charter of our entire practice. We only call it “data security” because “information security” was already taken. Data security cannot impede the use of the data itself. By contrast, it’s easy to protect archival data (encrypt it and lock up the keys in a safe). But protecting unstructured data in active use by our organizations? Not so easy. That’s why we started this research by focusing on insider risks, including external attackers leveraging insider access. Recognizing when someone is performing an authorized action with malicious intent is a nuance lost on most security tools.
How Data Guardrails and Data Behavioral Analytics are Different
Both data guardrails and data behavioral analytics strive to improve data security by combining content knowledge (classification) with context and usage. Data guardrails leverage this knowledge in deterministic models and processes to minimize the friction of security while still improving defenses. For example, if a user attempts to make a file in a sensitive repository public, a guardrail could require them to record a justification and then send a notification to security to approve the request. Guardrails are rule sets that keep users “within the lines” of authorized activity, based on what they are doing.
Data behavioral analytics extends the analysis to include current and historical activity, and uses tools like artificial intelligence/machine learning and social graphs to identify unusual patterns that bypass other data security controls. It narrows the gaps those controls leave by looking not only at content and simple context (as DLP might), but also at the history of how that data, and data like it, has been used within the current context. A simple example is a user accessing an unusual volume of data in a short period, which could indicate malicious intent or a compromised account. A more complicated case would be identifying sensitive intellectual property on an accounting team device, even though the accounting team does not need to collaborate with the engineering team. This higher-order decision making requires an understanding of data usage and connections within your environment.
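To make the simple case concrete, here is a minimal sketch of flagging an unusual access volume against a historical baseline. It is only an illustration: the function name, the z-score threshold of 3.0, and the sample counts are our own assumptions, and real analytics products use far richer models than a single statistic.

```python
from statistics import mean, stdev

def is_unusual_volume(history, current, z_threshold=3.0):
    """Return True if `current` is far above the historical baseline.

    history: files accessed per hour in past windows (hypothetical data)
    current: files accessed in the window being evaluated
    z_threshold: assumed cutoff, in standard deviations above the mean
    """
    if len(history) < 2:
        return False  # not enough history to establish a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat baseline: any increase stands out
        return current > baseline
    return (current - baseline) / spread > z_threshold

# Hypothetical hourly access counts for one data repository
hourly_counts = [12, 9, 15, 11, 10, 14, 13]
print(is_unusual_volume(hourly_counts, 14))   # within the normal range
print(is_unusual_volume(hourly_counts, 400))  # a sudden bulk access
```

A flagged window is a prompt for a closer look rather than proof of intent; the same spike could come from a legitimate migration or backup job.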
Central to these concepts is the reality that data is distributed and actively used by many employees. Security can’t effectively lock everything down with strict rules that cover every use case without fundamentally breaking business processes. But with integrated views of data and its intersection with users, we can build data guardrails and informed data behavioral analytical models to identify and reduce misuse without negatively impacting legitimate activities. Data guardrails enforce predictable rules aligned with authorized business processes, while data behavioral analytics look for edge cases and less predictable anomalies.
How Data Guardrails and Data Behavioral Analytics Work
The easiest way to understand the difference between data guardrails and data behavioral analytics is that guardrails rely on pre-built deterministic rules (which can be as simple as “if this then that”), while analytics relies on AI, machine learning, and other heuristic-based technologies that look at patterns and deviations.
To be effective, both rely on the following foundational capabilities:
* A centralized view of the data. Both approaches assume a broad understanding of data and usage; without a central view, you can’t build the rules or models.
* Access to data context. Context includes multiple characteristics, including location, size, data type (if available), tags, who has access, who created the data, and all available metadata.
* Access to user context, including privileges (entitlements), groups, roles, business unit, etc.
* The ability to monitor activity and enforce rules. Guardrails, by nature, are preventative controls and require enforcement capabilities. Data behavioral analytics can technically be used for detection only, but are far more effective at preventing loss if they can block actions.
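One way to picture these foundational capabilities is as the record a monitoring layer would hand to a rule or a model. The sketch below is purely illustrative: the class names, field names, and sample values are our own assumptions, not the schema of any particular product.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

@dataclass
class DataContext:
    """Context about the data itself (hypothetical fields)."""
    location: str
    size_bytes: int
    owner: str                       # who created the data
    data_type: Optional[str] = None  # only if available
    tags: Set[str] = field(default_factory=set)
    readers: Set[str] = field(default_factory=set)  # who has access

@dataclass
class UserContext:
    """Context about the user performing the action (hypothetical fields)."""
    user_id: str
    business_unit: str
    entitlements: Set[str] = field(default_factory=set)
    groups: Set[str] = field(default_factory=set)
    roles: Set[str] = field(default_factory=set)

@dataclass
class ActivityEvent:
    """One monitored action, joining user context and data context."""
    action: str
    user: UserContext
    data: DataContext

# Hypothetical event: an engineer asks to make a planning file public
event = ActivityEvent(
    action="make_public",
    user=UserContext(user_id="alice", business_unit="engineering",
                     entitlements={"share_public"}, groups={"eng-plans"}),
    data=DataContext(location="/engineering/new-plans/roadmap.docx",
                     size_bytes=48_213, owner="alice"),
)
print(event.action, event.user.business_unit, event.data.location)
```

Both guardrails and analytics would consume events like this one; the difference lies in what they do with them.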
The two technologies then work differently while reinforcing each other:
- Data guardrails are sets of rules that look for specific deviations from policy, then take action to restore compliance with the policy. To expand our earlier example:
  - A user shares a file located in cloud storage publicly. Let’s assume the user has the proper privileges to make files public. Since the file is in a cloud service, we also assume centralized monitoring/visibility, as well as the capability to enforce rules on that file.
  - The file is located in an engineering team’s repository (directory) for new plans and projects. Even without tagging, this location alone indicates a potentially sensitive file.
  - The system sees the request to make the file public, but because of the context (location or tag), it prompts the user to enter a justification to allow the action, which gets logged for the security team to review. Alternatively, the guardrail could require approval from a manager before allowing the file action.
Guardrails are not blockers because the user can still share the file. Prompting for user justification both prevents mistakes and loops in security review for accountability, allowing the business to move fast while still minimizing risk. You could also look for large file movements based on pre-determined thresholds. A guardrail would only kick in if the policy thresholds are violated, and then use enforcement actions aligned with the business process (like approvals and notifications) rather than just blocking activity and calling in the security goons.
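The public-sharing example can be sketched as a small deterministic rule. This is a simplified illustration under our own assumptions: the sensitive path, the tag name, the `Outcome` values, and the `evaluate_public_share` function are all hypothetical, and a real guardrail would hook into the storage service’s own enforcement points.

```python
from enum import Enum

class Outcome(Enum):
    ALLOW = "allow"
    REQUIRE_JUSTIFICATION = "require_justification"

# Hypothetical policy values: locations and tags treated as sensitive
SENSITIVE_LOCATIONS = ("/engineering/new-plans/",)
SENSITIVE_TAGS = {"confidential"}

def evaluate_public_share(path, tags, justification=None, audit_log=None):
    """Deterministic guardrail for a 'make public' request.

    Non-sensitive files pass straight through. Sensitive files are
    allowed only once the user supplies a justification, which is
    recorded for the security team to review.
    """
    sensitive = (path.startswith(SENSITIVE_LOCATIONS)
                 or bool(SENSITIVE_TAGS & set(tags)))
    if not sensitive:
        return Outcome.ALLOW
    if not justification:
        return Outcome.REQUIRE_JUSTIFICATION
    if audit_log is not None:
        audit_log.append({"path": path, "justification": justification})
    return Outcome.ALLOW

log = []
plan = "/engineering/new-plans/roadmap.docx"
print(evaluate_public_share("/marketing/brochure.pdf", [], audit_log=log))
print(evaluate_public_share(plan, [], audit_log=log))
print(evaluate_public_share(plan, [], "Sharing with launch partner", log))
print(log)
```

Note that the rule never returns a hard block. The user can still complete the action, but only after leaving a record that security can review.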
- Data behavioral analytics use historical information and activity (typically with training sets of known-good and known-bad activity) to build artificial intelligence models that identify anomalies. We don’t want to be too narrow in our description here, since there is a wide variety of approaches to building models.
  - Historical activity, ongoing monitoring, and ongoing modeling are essential no matter the mathematical details.
  - By definition we focus on the behavior of the data as the core of the models, not user activity. This is a subtle but critical distinction from user behavioral analytics (UBA). UBA tracks activity on a per-user basis. Data behavioral analytics (since the acronym DBA is already taken, we’ll skip making up a new TLA) instead looks at activity at the source of the data. How has that data been used? By which user populations? What types of activity happen using the data? When? It’s not that we ignore user activity, but we are tracking usage of the data.
  - For example, we are answering the question “has a file of this type ever been made public by a user in this group?” UBA would ask “has this particular user ever made a file public?” We believe focusing on the data has the potential to catch a broader range of data usage anomalies.
  - To state the obvious, the better the data, the better the model. As with most security-related data science, don’t assume more data results in better models. It’s about the quality of the data. For example, social graphs of communications patterns among users could be a valuable feed to detect situations like files moving between teams that do not usually collaborate. That’s worth a look, even if you don’t want to block the activity outright.
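The difference between the two questions can be shown with a toy lookup over an activity history. This is a sketch for illustration, not a trained model: the history records, names, and both functions are hypothetical, and a real system would score likelihoods rather than check for exact matches.

```python
# Hypothetical activity history: (user, group, file_type, action)
HISTORY = [
    ("alice", "engineering", "design_doc", "share_internal"),
    ("bob", "marketing", "brochure", "make_public"),
    ("carol", "marketing", "brochure", "make_public"),
    ("dave", "engineering", "design_doc", "share_internal"),
]

def seen_for_data(history, file_type, group, action):
    """Data-centric: has a file of this type ever had this action
    performed on it by a user in this group?"""
    return any(ft == file_type and g == group and a == action
               for _, g, ft, a in history)

def seen_for_user(history, user, action):
    """User-centric (UBA-style): has this user ever performed this action?"""
    return any(u == user and a == action for u, _, _, a in history)

# Bob routinely publishes brochures, then makes a design doc public.
print(seen_for_user(HISTORY, "bob", "make_public"))                      # True
print(seen_for_data(HISTORY, "design_doc", "marketing", "make_public"))  # False

# Erin is a new marketing hire publishing her first brochure.
print(seen_for_user(HISTORY, "erin", "make_public"))                     # False
print(seen_for_data(HISTORY, "brochure", "marketing", "make_public"))    # True
```

In this toy history, the per-user view sees nothing new in Bob’s action and treats Erin’s routine one as novel, while the data view flags the design doc and passes the brochure.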
Data guardrails handle known risks and are especially effective at reducing user error and identifying account abuse that results from tricking authorized users into unauthorized actions. Guardrails may even help limit the damage from account takeovers, since attackers wouldn’t be able to misuse the data if the action violated a guardrail. Data behavioral analytics then supplement guardrails in unpredictable situations, or where a bad actor tries to circumvent the guardrails, including malicious misuse and account takeovers.
Now you have a better understanding of the requirements and capabilities of data guardrails and data behavioral analytics. In our next post, we will focus on some quick wins to justify including these capabilities in your data security strategy.
*** This is a Security Bloggers Network syndicated blog from Securosis Blog authored by Securosis. Read the original post at: http://securosis.com/blog/protecting-what-matters-defining-data-guardrails-and-behavioral-analytics