Managers and operators use these models and predictions to make decisions that enhance value. Examples include minimising blasting costs, reducing fuel consumption in haulage vehicles, increasing equipment uptime and maximising throughput and recovery rates.
However, for every machine learning success story, there are many more failures. These include proofs of concept and pilot projects shelved because they didn’t deliver as expected. And in many cases, managers can’t explain why.
Why do so many machine learning projects fail? BCG’s work with clients suggests three reasons:
 Flawed technology: The algorithms developed simply don’t work correctly.
 Insufficient value: Initiatives are focused on solving problems that are interesting but that don’t help a company unlock new value. Unable to justify their cost, managers deprioritise the initiatives.
 Inadequate change management: Users of machine learning tools aren’t brought along the change journey. Consequently, they don’t trust the tools and are reluctant to use them.
Most mining executives and practitioners understand value-creation and change-management issues well. But the culprits behind flawed technology are less understood.
Therefore, we focus here on the technical reasons behind machine learning failures and describe an emerging technique – Bayesian Learning – that could help miners get more from this technology.
Limitations of classical approaches to machine learning in mining
Despite machine learning’s usefulness, this technology also has some inherent limitations. Most notably, machine learning algorithms developed through classical approaches, such as neural networks and random forests, typically require huge volumes of data, often generated over long periods of time, to discern relationships in data and to make useful predictions.
In mining, managers face numerous situations where there’s not enough relevant data available to enable an algorithm to learn about relationships in the data. To illustrate, perhaps a piece of equipment hasn’t been running for very long – such as a newly installed fragmentation camera at a SAG (semi-autogenous grinding) mill – and thus has generated little historical data. Or maybe a key element in a long-running operation has changed in ways that make the historical data irrelevant; for example, a sensor has been recalibrated.
What’s more, it is hard to understand how a machine learning model makes its predictions, so models are often criticised as ‘black boxes.’ Managers can ‘shine some light into the box’ by experimenting with the model; for example, observing how a prediction changes as the model’s inputs are changed. But getting a complete picture of how a model works is extremely difficult. This opacity can make it hard to reconcile the model with first principles; that is, to prove that the model behaves in a manner consistent with how the process in question actually works in the real world, including its physics. Without the ability to make this reconciliation, trust in the model erodes, along with adoption.
This inability to ensure that models adhere to first principles also explains why algorithms developed through classical approaches require so much data: they need to learn everything, including equations from physics that are already known to be true. The upshot? It’s an inefficient way to build a model.
Lacking enough relevant data for the classical approaches to work, many mining companies have launched machine learning initiatives that have failed to deliver value. These failures come with a high price tag. Large-scale machine learning programs can cost in the multiple millions, and failures may spawn skepticism about the technology’s usefulness.
Is there a way for managers to exploit their knowledge of first principles and only use the data required for algorithms to learn what managers don’t know about a process, and thus build better models? Yes, and the technique for doing so is Bayesian Learning.
Enter Bayesian Learning
Bayesian Learning combines useful concepts from artificial intelligence (such as empirical modelling, or learning from data) with classical engineering techniques (applied physics). In simplest terms, Bayesian Learning lets data scientists encode into an algorithm prior understanding about how a process works in the actual world. This prior understanding generally comes from operators and engineers, who typically have a reasonable, but not precise, understanding of the process that a company wants to model.
Because prior understanding is used, Bayesian Learning requires much less data than classical machine learning does. It is thus far more efficient at extracting insights from the data, as the algorithm does not need to learn things that are known to be true from physics. Bayesian Learning is also completely ‘white box’: predictions can be explained and users don’t have to reconcile the model with first principles.
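To make the data-efficiency argument concrete, here is a minimal sketch of one of the simplest Bayesian updates: estimating an unknown process parameter from a handful of noisy readings, starting from a prior supplied by engineers. It is an illustration only, not the method used in any project described here. The function name, the numbers and the assumption of normally distributed noise with a known spread are all hypothetical.

```python
import math


def posterior_normal(prior_mean, prior_sd, observations, noise_sd):
    """Conjugate normal update for an unknown mean.

    Assumes (for illustration) that measurement noise is normal with a
    known standard deviation `noise_sd`, and that prior belief about the
    mean is normal with `prior_mean` and `prior_sd`.
    Returns the posterior mean and posterior standard deviation.
    """
    n = len(observations)
    prior_precision = 1.0 / prior_sd ** 2   # how confident the prior is
    data_precision = n / noise_sd ** 2      # how informative the data are
    post_precision = prior_precision + data_precision
    post_mean = (
        prior_mean * prior_precision + sum(observations) / noise_sd ** 2
    ) / post_precision
    post_sd = math.sqrt(1.0 / post_precision)
    return post_mean, post_sd


# Hypothetical numbers: engineers believe the parameter is about 10 (+/- 2),
# and only three noisy readings are available so far.
readings = [11.0, 12.5, 11.5]

informed_mean, informed_sd = posterior_normal(10.0, 2.0, readings, noise_sd=3.0)
vague_mean, vague_sd = posterior_normal(0.0, 1000.0, readings, noise_sd=3.0)

print(f"with engineering prior: {informed_mean:.2f} +/- {informed_sd:.2f}")
print(f"with no real prior:     {vague_mean:.2f} +/- {vague_sd:.2f}")
```

With the same three readings, the estimate that starts from the engineering prior is less uncertain than the one that starts from almost no prior knowledge. Put differently, the vague-prior model would need more data to reach the same confidence, which is the efficiency gain the text describes.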
The concept of Bayesian Learning has been around for some time. However, it is extremely computing-intensive. With the increased availability of low-cost cloud computing, implementing this technique at scale has only recently become more practical. Indeed, advances in computing power have enabled use of many machine learning applications, including neural networks, which were theorised many decades before the necessary computing power was available.
Bayesian Learning in action

Consider a mining company that installs a new SAG mill, bringing its SAG count to three. The three SAGs are similar, but not identical. Key differences include the mill diameter, liner design and ore feed. To develop a machine learning algorithm for improving throughput in the new SAG – let’s call it SAG No.1 – data scientists build a model that behaves according to things known to be true, such as the conservation of mass. (That is, the rate of change of material in the mill is equal to the inflow rate minus the outflow rate.) Other prior understanding used in the model comes from data from SAGs No.2 and No.3, as these are expected to operate in a similar way to SAG No.1.
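The conservation-of-mass constraint described above can be written as a simple balance equation. The symbols are notation assumed here for illustration: M(t) is the mass of material held in the mill at time t, and F_in(t) and F_out(t) are the mass flow rates into and out of the mill.

```latex
\frac{dM(t)}{dt} = F_{\text{in}}(t) - F_{\text{out}}(t)
```

A model built around this equation never has to learn the balance from data; the data are needed only for the quantities that are not known in advance.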
A major mining company has recently done just this. It used Bayesian Learning to quantify the link between (1) the amount of grinding media in a SAG (the ‘ball charge’) plus rotational speed and (2) crushing efficiency (rate at which large rocks are broken into smaller ones). The model showed that increasing the ball charge and reducing the speed would increase throughput, adding significant value.

Looking to the future
Bayesian Learning is becoming more feasible and attracting greater interest in mining. But adopting it also comes with some challenges. For one thing, this is a highly specialised branch of statistics requiring skills and expertise beyond general data science. People with this capability are hard to find and recruit. And as with other data science categories, competition for such talent has intensified. What’s more, because the technique is so specialised, there’s real danger of making missteps.
To set the stage for successful use of Bayesian Learning, miners will need to start educating themselves now about this approach. Additionally, they should ask themselves which operational challenges can best be overcome through Bayesian techniques.
Is all this a lot of work? Decidedly yes. But mining companies can’t afford to shy away from it if they intend to be at the forefront of operational innovation in the fast-arriving future.
*Rohin Wood (wood.rohin@bcg.com) is a managing director and partner; Ric Porteous (porteous.ric@bcg.com) is a lead data scientist; Agustin Costa (costa.agustin@bcg.com) is a managing director and partner; and JT Clark (clark.jt@bcg.com) is a managing director and partner, at BCG.