Security Leaders Voice Concerns Over Dark Data

Dark data, the data that organizations are unaware of but which can still be highly sensitive or critical, is a major worry, with 84% of organizations “extremely concerned” about it, according to a BigID survey of 400 enterprise technology leaders.

The survey also revealed that eight out of 10 organizations consider unstructured data the hardest to manage and secure, and that more than 90% of organizations struggle to enforce security policies around sensitive or critical data.

Meanwhile, interoperability continues to be a critical factor for most when considering future security investments, and more than a quarter of organizations think their data loss prevention (DLP) tools fail to fulfill their data protection needs.

Dark Data a Glaring Concern

Dimitri Sirota, BigID CEO, said the simple reason dark data is such a major concern for organizations is that you can’t protect what you don’t know.

“Data has exploded in sheer growth over the years, and traditional classification techniques mean you have to know what you’re looking for and what pattern it follows in the first place,” he said. “And with dark data, if you don’t know it’s there or what it is, you can’t secure it.”

He added that as data migrates to the cloud and as data in motion, data lakes and more grow in popularity and use, it becomes more critical than ever to be able to automatically identify all data—especially dark data—and then take steps to secure it.

Unstructured data has long been a challenge: IDC predicted that there will be 163 zettabytes of data in the world by 2025 and estimated that 80% of that data will be unstructured.

Adding to that volume challenge, unstructured data often contains all sorts of confidential or critical data—like intellectual property, business and financial data, customer IDs and more.

“On top of that, it often takes a long time to scan unstructured data, and it can be difficult to know where to start,” Sirota said. “It’s particularly challenging not just because of data sprawl, but because to enforce security policies, they need to be able to know their critical data in the first place.”

That means being able to scan, classify and inventory sensitive data of all types. The definition of what’s critical, sensitive and regulated continuously evolves; data volume continues to grow and it’s typically spread out across data stores, on-premises and in the cloud.

“In order to enforce security policies, organizations need a comprehensive and accurate data inventory as a foundation and, from there, be able to take action on it,” Sirota explained. 
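To make the scan, classify and inventory steps concrete, here is a minimal Python sketch of pattern-based classification. The two patterns, the document names and the `build_inventory` helper are hypothetical and chosen only for illustration; this is not BigID's method. It also shows the limitation Sirota describes: anything that does not match a known pattern never appears in the inventory.

```python
import re
from collections import defaultdict

# Hypothetical patterns for illustration; real classifiers use far richer rules.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def classify(text):
    """Return the set of pattern labels that match somewhere in the text."""
    return {label for label, pattern in PATTERNS.items() if pattern.search(text)}


def build_inventory(documents):
    """Map each label to the sorted names of documents where it was found."""
    inventory = defaultdict(list)
    for name, text in documents.items():
        for label in classify(text):
            inventory[label].append(name)
    return {label: sorted(names) for label, names in inventory.items()}


# Made-up sample documents standing in for scanned files.
documents = {
    "notes.txt": "Contact jane.doe@example.com about the renewal.",
    "hr_export.csv": "name,ssn\nJ. Doe,123-45-6789",
    "readme.md": "Nothing sensitive here.",
}
inventory = build_inventory(documents)
```

In this sketch `readme.md` is absent from the inventory because it matched nothing, which is also what would happen to a sensitive file whose contents follow a pattern nobody wrote a rule for.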

Fasten Your Seatbelts

Mohit Tiwari, co-founder and CEO at Symmetry Systems, added that dormant data, which could also be referred to as “dark” or “shadow” data that organizations don’t know about, is a critical theme for organizations. Regardless of terminology, it comprises a mountain of known data that is not being used.

“Dormant data is a risk if any path to it is compromised; this could be via identity, application or even cloud provider breaches,” he said. “Hence, it is worth finding and placing detection ‘seatbelts’ around dormant data, while setting up a longer-term process to tighten down permissions to dormant data.”
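As a rough illustration of finding dormant data, the Python sketch below flags data objects whose last recorded access is older than a threshold. The object names, the 180-day threshold and the assumption that last-access timestamps are available (for example from storage access logs) are all hypothetical, not something Tiwari or Symmetry Systems specified.

```python
from datetime import datetime, timedelta


def find_dormant(last_access, now, max_idle_days=180):
    """Return names of data objects not accessed within max_idle_days, sorted."""
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(name for name, seen in last_access.items() if seen < cutoff)


# Hypothetical last-access records; real values would come from access logs.
now = datetime(2022, 1, 1)
last_access = {
    "s3://bucket/q3-report.parquet": datetime(2021, 12, 15),
    "s3://bucket/old-crm-dump.csv": datetime(2019, 5, 2),
    "s3://bucket/legacy-logs.tar": datetime(2020, 11, 30),
}
dormant = find_dormant(last_access, now)
```

The resulting list would be a starting point for the detection “seatbelts” and permission tightening described above, not a replacement for them.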

Unstructured data, when addressed in isolation, is hard to manage and secure. But structured data stores are copied over, analyzed and linked to unstructured data, so treating the problem of data security holistically across structured and unstructured data is key to reducing data risk.

Tiwari noted that because data is created by product teams, analysts/BI teams and across every business unit, it is hard for a security or a data governance team (CISO or CDO offices, respectively) to locate sensitive data and fasten those seatbelts around it.

“The biggest risks and opportunities in data security are to map out data objects across an organization and to bring data object and identity governance into a tight loop,” he said. “Access policies around data and identities are more durable and can be used to drive permissions on the cloud and workloads.”

Looking ahead to 2022 and beyond, Sirota said there is more data than ever and businesses run on that data. He added that what data is considered sensitive, valuable and regulated continues to evolve. This is both the challenge and the opportunity.

“The adoption of the work-from-everywhere model, for example, creates an avalanche of even more sensitive data than ever before—sensitive data that needs to be managed and protected,” he said. “And it’s hard to do this across silos: Organizations need to be able to address the fundamental ability to know their data at scale, and across their entire tech stack.”

Nathan Eddy

Nathan Eddy is a Berlin-based filmmaker and freelance journalist specializing in enterprise IT and security issues, health care IT and architecture.
