Data Classification: Protecting Sacred Data in the Cloud

Before you can secure your data, you have to know your data. It sounds trivial, but it is far from it. Data classification not only enables you to understand, categorize, label and protect data today, but it also future-proofs your business by preparing you for new datasets, data structures, regulations and compliance frameworks, as well as any new security technologies you may implement down the road. Without proper classification, there can’t be proper protection.

“Sacred data,” also known as an organization’s crown jewel data, is the set of information assets that are of greatest value to a business and would cause the most damage if compromised. This data needs the heaviest and most restrictive controls applied to it: maximum protection with minimal access.

McAfee’s latest report, “Enterprise Supernova: The Data Dispersion Cloud Adoption and Risk Report,” shows that 26% of files in the cloud contain sensitive data, an increase of 23% year over year. Organizations understand the need to protect their sacred data, and each organization has its own broad definition of what that is. Yet within an organization, business units often hold conflicting views about what qualifies as sacred data. This can easily lead to a scattershot approach to security: oversecuring all data (that can be found) as if all data were equal. That unnecessarily hinders user access and network performance and is a symptom of larger security misconfiguration issues.

“Companies have been sold this idea that all their data is sacred, and if they don’t yet know why, then they should hold on to it, or hoard it until they figure out its use,” said Rich Mason, president and CSO of Critical Infrastructure LLC. This approach, he said, works for some time, until a business gets a huge bill for cloud storage services or finds that it has done such a poor job categorizing its data that it can’t figure out how to get value from it.

So how do you build a plan of action to secure your sacred data and improve your overall security posture?

Identify your sacred data: According to Mason, “Too many in the security industry let the regulators and auditors define what sacred data is.” Intellectual property, strategic plans, customer data, partner contracts, pricing strategies: these are unique assets that may not fall under a baseline compliance framework. Because your business is built on these assets, their compromise could lead to an extinction-level event. Think about what would happen to KFC if its secret “11 herbs and spices” recipe were leaked. Global chaos. Just because you are able to prove compliance doesn’t mean that all of your most valuable data assets are secure.

Discover the location of your sacred data: “Sacred data is tribal,” Mason said. “You’re likely going to find that no two executives are aligned on the top 10 crown jewels.” This means you need to set about discovering sacred data systematically, not tribally. “A common mistake that many new CISOs make is thinking that they are going to be handed a playbook of all the company’s secrets on day one.” You have to find out what data lives where and who owns the data and know what data is flowing in and out of your business at all times. This requires stepping away from the computer screen and getting in front of key stakeholders to find what data is sacred to them. Then you can connect their tribal knowledge and treasures and map them to a holistic security framework that can be managed centrally.

Classify data according to its value: Every organization should create classification categories that make sense for its needs. The simplest and most common method I’ve seen in use is a green, yellow and red color model. Security controls on data are color-coded and scaled depending upon the value and criticality of that data; it’s quick, flexible to a point and easy enough to understand. An example could be:

  • Green data: A quick reputation hit—likely related to publicly available data or confidential company records.
  • Yellow data: Issuing a breach notification because sensitive customer data has been exposed.
  • Red data: Major news cycle, extreme fines, loss of customer confidence and trust, potential loss of more than 50% of revenue, all the way to a company extinction-level event.
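
The color model above can be sketched in a few lines of Python. This is only an illustration: the two consequence questions, the function names and the control lists are assumptions made for the example, not part of any standard or of Mason's guidance, and each organization would substitute its own.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Color tiers, ordered so that a higher value means a more severe breach."""
    GREEN = 1
    YELLOW = 2
    RED = 3


def classify(breach_notification_required: bool, threatens_survival: bool) -> Tier:
    """Assign a tier from the worst consequence a breach would cause.

    Both questions are hypothetical stand-ins for the consequences
    described in the list above.
    """
    if threatens_survival:
        return Tier.RED
    if breach_notification_required:
        return Tier.YELLOW
    return Tier.GREEN


# Illustrative controls only (an assumption for this sketch):
# each tier keeps the controls of the tier below it and adds more.
CONTROLS = {
    Tier.GREEN: ["access logging"],
    Tier.YELLOW: ["access logging", "encryption at rest"],
    Tier.RED: ["access logging", "encryption at rest", "named-user access only"],
}


def controls_for(tier: Tier) -> list[str]:
    """Look up the controls that scale with the given tier."""
    return CONTROLS[tier]


if __name__ == "__main__":
    tier = classify(breach_notification_required=True, threatens_survival=False)
    print(tier.name, controls_for(tier))
```

Ordering the tiers as integers is a design choice that makes comparisons such as "is this data more sensitive than this system is rated for" a simple greater-than check.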

Secure data according to its classification: By looking at the consequences a breach of each classification would cause, you can work backward to employ security controls that directly reduce the probability of those consequences. Starting with consequences also makes it easier to ask the right questions to classify the data and discover security gaps. For example:

  • What systems are processing red data?
  • Are there automation processes that inappropriately move red data over to systems designed to only handle yellow or green data?
  • Are employees able to access systems that store red data when they shouldn’t have the permission to do so?
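
Questions like these can be turned into repeatable checks once you have an inventory of systems, data flows and access grants. The sketch below assumes such an inventory already exists as simple in-memory records; the `DataFlow` type, the function names and the sample systems and users are all hypothetical and stand in for whatever your asset inventory and identity tooling actually provide.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    GREEN = 1
    YELLOW = 2
    RED = 3


@dataclass(frozen=True)
class DataFlow:
    source: str       # system sending the data
    destination: str  # system receiving the data
    tier: Tier        # classification of the data being moved


def systems_processing(tier, flows):
    """Every system that sends or receives data of the given tier."""
    return {name for f in flows if f.tier == tier
            for name in (f.source, f.destination)}


def misrouted_flows(flows, system_rating):
    """Flows whose data tier exceeds what the destination is rated to handle."""
    return [f for f in flows if f.tier > system_rating[f.destination]]


def unauthorized_access(access, cleared_for_red, red_systems):
    """(user, system) pairs where a user without red clearance can reach a red system."""
    return sorted(
        (user, system)
        for user, systems in access.items()
        if user not in cleared_for_red
        for system in systems & red_systems
    )


if __name__ == "__main__":
    # Hypothetical sample inventory.
    system_rating = {"vault": Tier.RED, "crm": Tier.YELLOW, "wiki": Tier.GREEN}
    flows = [
        DataFlow("vault", "crm", Tier.RED),    # red data moved to a yellow system
        DataFlow("crm", "wiki", Tier.GREEN),
    ]
    access = {"alice": {"vault"}, "bob": {"vault", "wiki"}}
    cleared_for_red = {"alice"}
    red_systems = {name for name, r in system_rating.items() if r == Tier.RED}

    print(systems_processing(Tier.RED, flows))
    print(misrouted_flows(flows, system_rating))
    print(unauthorized_access(access, cleared_for_red, red_systems))
```

Each function maps to one of the questions above, so the output doubles as a list of gaps to raise with the data owners.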

Though data classification tends to start with the goal of reaching compliance, that’s just the beginning. “Security through and through is a people process. There needs to be a constant, ongoing and collaborative dialogue taking place between all parties involved and across the business,” Mason said. Understanding who accesses sacred data, and in what context, is as important as knowing what data is being secured. The value of data changes over time, and your data classification system should continuously evolve with it.

Grant Wernick

Grant Wernick is the co-founder & CEO of Insight Engines. Insight Engines is a leader in natural language search technologies. The company builds products to augment human intelligence with machine intelligence via its patented NLP and ML technology. Insight Engines' flagship product, Insight Investigator, enables people, no matter how technical, to ask questions of their log data and get answers in seconds. Utilized by the Fortune 500, as well as some of the largest government organizations, Insight Engines is backed by August Capital, Splunk, Google Ventures, and DCVC.
