Credit: Google News
Cybersecurity isn’t easy.
Every organization now has to deal with so many vulnerabilities that most IT teams have no realistic chance of fixing them all. An average machine may have 150-200 vulnerabilities open at any given moment. The work gets even more complicated once you factor in attackers' motivations, the systems they might target and the way those factors combine to create unexpected pathways to an intrusion. All the while, the long list of vulnerabilities keeps stacking up — to the tune of 20-40 new vulnerabilities disclosed daily — as your security team rushes to fix and patch at random.
Organizations simply cannot reduce their risk and improve their security posture without having some way to predict, ahead of time, which threats and vulnerabilities will actually lead to an attack.
I’ve been thinking a lot about this recently as I reflect on some of the developments the industry has made in machine learning. We’ve seen quite a few vendors lean on machine learning in their security solutions. In reality, I think they’ve fallen short of building systems that actually solve the problems a business faces, so I want to explain what machine learning is and is not.
I want to detail three key lessons I’ve learned over the years as I’ve developed algorithms and machine learning systems for cybersecurity.
Lesson No. 1: Accuracy Isn’t Necessarily Useful
An accurate model reflects the underlying data. But is this really useful? Imagine you are responsible for 100 machines. One of them has an exploited vulnerability. You don’t know which one. If every day you tell the board “nothing is at risk,” you’d be right 99% of the time. Your guess is very accurate, but it adds no value when it comes to cybersecurity. Extend that example to an enterprise. Exploits are relatively rare, so a model can appear highly accurate simply because it predicts “no exploit” most of the time; but is it helping to predict a potential attack before it occurs? A thoughtful machine learning approach needs to reflect this.
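The 100-machine example above can be sketched in a few lines. This is a minimal illustration (not any vendor's model): a "predictor" that always reports "safe" scores 99% accuracy yet catches none of the real risk.

```python
# The 100-machine example: ground truth is 99 safe machines and 1 exploited.
machines = ["safe"] * 99 + ["exploited"]

# The do-nothing "model": report "nothing is at risk" every day.
predictions = ["safe"] * 100

# Accuracy: fraction of machines where the prediction matches the truth.
correct = sum(p == t for p, t in zip(predictions, machines))
accuracy = correct / len(machines)

# Recall on the "exploited" class: fraction of at-risk machines we caught.
caught = sum(p == "exploited" == t for p, t in zip(predictions, machines))
recall = caught / machines.count("exploited")

print(f"accuracy: {accuracy:.0%}")  # high accuracy...
print(f"recall:   {recall:.0%}")    # ...zero detection of actual risk
```

The 99% accuracy figure is exactly the board-report scenario from the text; the 0% recall is why that accuracy is worthless here.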
Lesson No. 2: Coverage And Efficiency Are More Useful Metrics
Coverage (or recall for those versed in machine learning) measures the completeness of remediation. Efficiency measures the precision of remediation. Those definitions are perhaps a bit airy. So it’s important to understand what they mean in practice.
If your organization has 500 vulnerabilities, it isn’t necessarily the case that there is an exploit tied to each of them. There may be an exploit for just 250 of them, and hackers may deploy just 75 of those. You only have the manpower to fix 250, so you need to decide which ones to fix. If you had 100% efficiency, you would patch precisely those 75 that are actually posing a risk to your organization.
Now, consider the fact that nobody is 100% precise. Some of your predictions will be wrong. In fact, your efficiency may sit closer to 35%, so you make more than 75 predictions. If you patch all 250 vulnerabilities, you increase the likelihood that the ones attackers actually exploit are among them, so your coverage rises; but much of that effort is wasted on vulnerabilities that are never exploited, so your efficiency declines. When it comes to predicting security events, there is a tradeoff between coverage and efficiency. That tradeoff exists in all machine learning models, from legal document review to cancer screening to search engine results.
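The coverage/efficiency arithmetic above is just recall and precision over sets of vulnerability IDs. Here is a minimal sketch using the article's numbers (500 open vulnerabilities, 75 actually exploited, capacity to patch 250); the specific ID ranges are assumptions for illustration only.

```python
# 500 open vulnerabilities, identified 0..499 (hypothetical IDs).
all_vulns = set(range(500))
exploited = set(range(75))  # the 75 that attackers actually deploy

# A patching plan at full capacity: all 75 real threats happen to be
# included, plus 175 that are never exploited (assumed for illustration).
patched = exploited | set(range(100, 275))  # 250 vulnerabilities total

true_positives = patched & exploited
coverage = len(true_positives) / len(exploited)  # recall: completeness
efficiency = len(true_positives) / len(patched)  # precision: wasted effort

print(f"patched:    {len(patched)}")
print(f"coverage:   {coverage:.0%}")    # every real threat is covered
print(f"efficiency: {efficiency:.0%}")  # but most of the work was wasted
```

In this sketch coverage is 100% while efficiency lands around 30%, close to the 35% figure above: broad patching buys completeness at the cost of precision, which is the tradeoff the lesson describes.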
Lesson No. 3: Subject Matter Expertise Is Critical
I could make a terrible model from the same dataset that I can make a great one from. It might take months to tell the difference. Put another way: If I predict a vulnerability will be exploited, it might take a long time for me to be proven right. That’s where working with cybersecurity experts plays a role. Leaning on industry expertise has helped refine models immensely by being able to look at a prediction and evaluate whether it makes sense contextually. When it doesn’t, we can work together to figure out how to adjust the algorithm to correct it.
There has been a lot of discussion recently about predictive models. But it is often unclear, and may take too long to discover, whether their predictions are actually effective and whether they actually save time.
Imagine, for example, that a company designs a model to raise a red flag for every vulnerability with a published exploit. It’s true that the existence of an exploit increases the probability that it will be used in an attack, but our research shows that countless exploits are never used. A model like this would fall short in the long term because it would likely generate far too many false positives, forcing security teams to spend hours and hours on work that might not result in real security improvements.
Some exploits are never used in the wild because a previously released patch already addresses the vulnerability, or because the exploit targets systems that tend not to contain high-value data. A truly predictive system understands the conditions under which an exploit is unlikely to be used.
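To make the contrast concrete, here is a toy sketch of context-aware scoring. The field names, weights and CVE labels are purely illustrative assumptions, not any real scoring system: the point is only that a published exploit is one signal among several, discounted when a patch is already in place and boosted when high-value assets are exposed.

```python
# Hypothetical vulnerability records (fields and values are assumptions).
vulns = [
    {"id": "CVE-A", "exploit_published": True,  "patch_applied": True,  "asset_value": "low"},
    {"id": "CVE-B", "exploit_published": True,  "patch_applied": False, "asset_value": "high"},
    {"id": "CVE-C", "exploit_published": False, "patch_applied": False, "asset_value": "high"},
]

def risk_score(v):
    """Toy scoring: exploit existence matters only in context."""
    score = 0.0
    if v["exploit_published"]:
        score += 0.5
    if v["patch_applied"]:
        score -= 0.6  # a deployed patch largely neutralizes the exploit
    if v["asset_value"] == "high":
        score += 0.3
    return max(score, 0.0)

ranked = sorted(vulns, key=risk_score, reverse=True)
for v in ranked:
    print(v["id"], risk_score(v))
```

A naive "flag every published exploit" rule would rank CVE-A and CVE-B together; the contextual score instead puts CVE-B (live exploit, no patch, high-value target) first and discounts CVE-A to zero because a patch already neutralizes it, which is exactly the kind of judgment the paragraph above calls for.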
The proof of a machine learning system’s predictive power is in its coverage and efficiency. Machine learning involves working with variables that are dynamic. It takes a long time — years — to fully develop the algorithm and understand the way these variables interact with one another. It may take longer still to find out if the system actually works.