
Are Cybersecurity Intelligence and Security Metrics Statistically Significant?

It is customary to begin
an article on cybersecurity with some statement about the exponential growth of
threats, attacks, vulnerabilities, etc. I’m no different. It seems like a
reasonable, generally accepted thing to do. So, I was somewhat surprised when someone
pushed back on such a statement of mine, requesting support of that claim—effectively
saying “Prove it!” “Okay,” I responded, “I’ll go to the usual sources.” These include the highly respected and frequently referenced Verizon DBIR (Data Breach Investigations Report).

The DBIR (2019 version available
at https://enterprise.verizon.com/resources/reports/dbir/)
has been published annually for the past 12 years. The 2019 edition was based
on information from 73 data sources, with 66 of them external to Verizon. The
analysis was based on 41,686 security incidents that included 2,013 confirmed data
breaches, although 50,000 botnet-related breaches were “removed from the
general corpus.”  Pretty impressive, huh?
Well yes, compared to what else is out there. But what percentage of total
security incidents were covered in the survey … 50%, 10%, 1% or less? I
suspect that the actual percentage is on the very low side, since I am aware of a number of medium-to-large breaches that were never announced publicly. It’s not that anyone was
hiding anything—it’s just that the breaches were never reported by the press or
the various statistics reporting outfits. Then, add to that the breaches that
have never been discovered. That has to be a huge number, too. I base that presumption
on reading so many reports in which victim organizations didn’t discover that
they had been breached until alerted by a third party, such as the FBI,
customers, business partners, industry regulators, etc.

If you add it all up, it
is very likely that reported samples are actually too small to yield statistically
significant results. If that is the case, are conclusions drawn from the
results realistic? There are guidelines as to what percentage of a population
needs to be sampled in order to come up with meaningful results. Whether or not
the various reports on size and cost of threats, data breaches,
vulnerabilities, and the like, have used appropriate sample sizes cannot be
determined if you don’t know the size of the total population—and we usually don’t!
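
To put rough numbers on the sample-size question, here is a minimal sketch using the classic sample-size formula with a finite-population correction. The function name and the population figures are mine and purely illustrative, and the formula assumes simple random sampling, which self-selected breach surveys are not:

    import math

    def required_sample_size(population, margin=0.05, z=1.96, p=0.5):
        # Classic formula n0 = z^2 * p * (1 - p) / margin^2, followed by
        # a finite-population correction. Assumes simple random sampling.
        n0 = (z ** 2) * p * (1 - p) / margin ** 2
        return math.ceil(n0 / (1 + (n0 - 1) / population))

    # Whether the true population of incidents is 100,000 or 10,000,000,
    # the required random sample stays in the high 300s. The hard part is
    # not the arithmetic; it is that survey samples are not random and the
    # population size is unknown.
    for pop in (100_000, 1_000_000, 10_000_000):
        print(pop, required_sample_size(pop))

In other words, the arithmetic is the easy part; it is the sampling assumptions, like those flagged in the Ponemon list below, that fail first.
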
The “2018 Cost of a Data Breach Study,” performed by the Ponemon Institute for IBM, lays it out pretty clearly when it lists the limitations of the study, as follows:

Non-statistical results—The
data were not collected in a scientific manner and therefore cannot be used for
statistical inferences [Then how can you trust the results?]

Non-response—The
data were collected on a small sample without testing for non-response bias
[Perhaps the sample was too small for the results to be significant]

Sampling-frame bias—The
sampling-frame was believed to be biased towards companies with more mature
privacy and security programs [This is the self-selection issue]

Company-specific information—Since the information collected was sensitive and confidential, company-identifying data were not collected [Is that a limitation?]

Unmeasured factors—To
keep the interviews simple and concise, other important variables, such as
leading trends and organizational characteristics, were omitted with the
consequence that significant variables may have been missed [There is always a
risk of collecting inappropriate data if a hypothesis to be tested isn’t developed
and presented ahead of time]

Extrapolated cost results—It was possible that the respondents did not provide accurate and truthful responses and that the cost extrapolation methods may have introduced biases and inaccuracies [If the data may be untrue and inaccurate, what’s the value of the research?]

Once you factor in the
above disclaimers, there is a question as to whether the report has any value
whatsoever. Even worse, executive decisions are presumably based, at least in part, on the results in these reports. But all is not lost. On page B1 of the October 19-20, 2019 Wall Street Journal, John D. Stoll wrote an article, “‘Feel the Force’: Gut Instinct, Not Data, Is the Thing.” Stoll describes how many senior executives make decisions based on intuition, with minimal use of data analytics. Perhaps the value of these surveys is that the reports help decision-makers form the instinctive, intuitive picture that leads to better decisions.

However, even if researchers
were to follow standard sampling and statistical decision-making norms, there
are those questioning the very foundation of statistical analysis, as in an
article in the October 2019 issue of Scientific American by Lydia
Denworth with the title “A Significant Problem: Standard scientific methods are
under fire. Will anything change?” After describing the weaknesses of various
approaches to determine statistical significance, Denworth concludes with the
following quotation on statistical analysis from Jerzy Neyman and Egon Pearson:
“The tests themselves give no final verdict but as tools help the worker who is
using them to form his final decision.”
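
Neyman and Pearson’s caveat is easy to illustrate. Here is a minimal sketch of a pooled two-proportion z-test; the helper function is mine, the first pair of figures echoes the 2019 DBIR numbers quoted above, and the second pair is hypothetical. Whatever p-value it prints, the verdict still belongs to the analyst:

    import math

    def two_proportion_z(x1, n1, x2, n2):
        # Pooled two-sided two-proportion z-test. Assumes two independent
        # random samples -- an assumption incident data rarely satisfy.
        p1, p2 = x1 / n1, x2 / n2
        pooled = (x1 + x2) / (n1 + n2)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
        z = (p1 - p2) / se
        p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
        return z, p_value

    # 2,013 breaches out of 41,686 incidents (2019 DBIR, quoted above)
    # versus a hypothetical prior year of 2,200 out of 53,000. A tiny
    # p-value flags the difference as unlikely to be chance alone; it does
    # not say whether the underlying samples were representative at all.
    z, p = two_proportion_z(2013, 41686, 2200, 53000)
    print(f"z = {z:.2f}, p = {p:.3g}")
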

This raises questions as
to whether the sample size is big enough and the results are significant enough
to justify decisions made by cybersecurity and risk professionals. I think that
these reports from Verizon, Ponemon, etc., provide useful intelligence to guide
decisions as long as the decision-makers realize the limitations of the
research.

And what about security metrics?
At the macro level, sharing one’s security metrics with others is minimal since
most organizations don’t want to disclose what is happening security-wise
within their networks and systems. Some entities, such as MSSPs (managed security service providers) running SOCs (security operations centers) and ISPs (Internet service providers), get to see malicious activities across their
population of clients and, although they do not publish this information, they
are able to use it to help protect other clients if one or more are attacked.

At the micro level,
organizations are confronted with the same issues for their security metrics as
with any data gathering and analysis approach. Organizations are already drowning in the volumes of logged data spewed out by all the monitoring devices and software they have deployed. But how representative are the data, and can
those data be analyzed quickly and accurately enough to assist real-time
decision-making? It’s just not possible or practical to collect all the data, so we must ask whether the data we do have are sufficient and useful. Here again,
metrics do provide some measure of situation awareness, which, in turn, can
make for decisions that will at least protect against identified attacks and
vulnerabilities. However, it’s what you don’t know—and what the metrics don’t
tell you—that can hurt the most.

Perhaps these issues help
explain our difficulty in stemming the tide of security incidents and data
breaches since, if we don’t know enough about what is happening, then it
becomes virtually impossible to come up with comprehensive, effective protective
measures. Granted, current intelligence and metrics do assist in situation
awareness and incident response, but can we improve the current state of
affairs by increasing sample size in order to get a better picture of what is
significant? Or should we be gathering different data altogether, as I suggested
in my article “Accounting for Value and Uncertainty in Security Metrics,” in the ISACA Information Systems Control Journal, November 2008?

All in all, the
limitations and biases of cybersecurity intelligence and security metrics give
one pause as to whether these reports have any value other than shifting the
responsibility for determining the basis for cybersecurity decisions to these
third parties. Come to think of it, that’s a win-win for both the creators and
devourers of the reports … the creators are compensated handsomely for their
work, and the decision-makers get to blame someone else for their decision mistakes.


*** This is a Security Bloggers Network syndicated blog from BlogInfoSec.com authored by C. Warren Axelrod. Read the original post at: https://www.bloginfosec.com/2019/11/11/are-cybersecurity-intelligence-and-security-metrics-statistically-significant/