
A Reminder That Machine Learning Is About Correlations Not Causation

January 15, 2019
in Machine Learning

Credit: Google News

Lost amongst the hype and hyperbole surrounding machine learning today, especially deep learning, is the critical distinction between correlation and causation. Developers and data scientists increasingly treat their creations as silicon lifeforms “learning” concrete facts about the world, rather than what they truly are: piles of numbers detached from what they represent, mere statistical patterns encoded into software. We must recognize that those patterns are merely correlations amongst vast reams of data, rather than causative truths or natural laws governing our world.

As machine learning has expanded beyond its roots in the worlds of computer science and statistics into nearly every conceivable field, the data scientists and programmers building those models are increasingly detached from an understanding of how and why the models they are creating work. To them, machine learning is akin to a black box in which you blindly feed different mixes of training data in one side, twirl some knobs and dials and repeat until you get results that seem to work well enough to throw into production.

Beyond the obvious issue that such models are extraordinarily brittle, the larger issue is the way in which these models are being deployed.

It is entirely reasonable to use machine learning algorithms to sift out extraordinarily nuanced patterns in large datasets. Indeed, a very powerful application of machine learning can be around identifying all of the unexpected patterns underlying phenomena of interest in a dataset or to verify that expected patterns exist.

Where things go wrong is when we reach beyond these correlations towards implying causation.

Pattern verification is an especially powerful way of using machine learning models, both to confirm that they are picking up on theoretically suggested signals and, perhaps even more importantly, to understand the biases and nuances of the underlying data. Seemingly unrelated variables moving together can reveal a powerful and undiscovered new connection with strong predictive or explanatory power. On the other hand, they could just as easily represent spurious statistical noise or a previously undetected bias in the data.
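How easily spurious statistical noise can masquerade as signal can be illustrated with a minimal sketch. It assumes a small dataset (30 rows) and 500 candidate features that are pure random noise; the function names and sizes are hypothetical choices for illustration, not taken from any real system.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_noise_correlation(n_rows=30, n_features=500, seed=0):
    """Return the largest |r| between a random target and pure-noise features."""
    rng = random.Random(seed)
    # The target is random noise too, so no feature can truly explain it.
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_rows)]
        best = max(best, abs(pearson(feature, target)))
    return best


if __name__ == "__main__":
    r = strongest_noise_correlation()
    print(f"strongest |r| among noise features: {r:.2f}")
```

With few rows and many candidate features, at least one noise feature will typically show a sizable correlation with the target purely by chance, which is why a pattern surfaced by a model needs to be checked before it is trusted.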

Bias detection is all the more critical as we deploy machine learning systems in applications with real world impact using datasets we understand little about.

Perhaps the biggest issue with current machine learning trends, however, is our flawed tendency to interpret or describe the patterns captured in models as causative rather than correlations of unknown veracity, accuracy or impact.

One of the most basic tenets of statistics is that correlation does not imply causation. In turn, a signal’s predictive power does not necessarily imply in any way that the signal is actually related to or explains the phenomenon being predicted.

This distinction matters when it comes to machine learning because many of the strongest signals these algorithms pick up in their training data are not actually related to the thing being measured.
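A minimal simulation can make this concrete. It assumes a hidden common cause `z` that drives both an observed signal `x` and an outcome `y`, while `x` itself has no effect on `y`; the variable names and coefficients are hypothetical and chosen only for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def signal_outcome_correlation(n=5000, intervene=False, seed=1):
    """Correlation between x and y when a hidden z drives both.

    With intervene=True, x is set at random (as in an experiment),
    which cuts its link to z. y never depends on x in either case.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)  # hidden common cause
        x = rng.gauss(0, 1) if intervene else z + rng.gauss(0, 0.5)
        y = 2 * z + rng.gauss(0, 0.5)  # outcome depends on z only
        xs.append(x)
        ys.append(y)
    return pearson(xs, ys)


if __name__ == "__main__":
    print(f"observed:   r = {signal_outcome_correlation():.2f}")
    print(f"intervened: r = {signal_outcome_correlation(intervene=True):.2f}")
```

In the observed data `x` is strongly predictive of `y`, yet once `x` is set independently of `z` the correlation should all but vanish: the signal was predictive without ever being a cause.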

Partly, this is because the thing we are most interested in often cannot be directly observed through any single variable. Predicting most events, from the likelihood a user will buy a given product to the likelihood a given country will collapse into civil war tomorrow, relies on a patchwork of signals, none of which directly measures the actual thing we are interested in.

In essence, in most machine learning, the actual thing we hope to have our model learn cannot be learned directly from the data we are giving it.

This may be because the medium available (such as photographs) does not fully capture the phenomenon we hope to have it recognize (such as identifying dogs). A pile of dog photographs cannot build a model to recognize their barks.

More often, it is because the thing we hope to measure (like the conversion of a website visitor into a customer) cannot be directly assessed through any single variable. Instead, we must proxy it through all sorts of unrelated variables that capture bits and pieces of the intangible “thing” we are trying to predict.

These bits and pieces, however predictive they may be, are merely genuinely or spuriously correlated with the thing we’re trying to predict. They do not necessarily cause or even explain how and why that thing occurs.

It is entirely possible to learn that a certain shade of color in a purchase button on a website makes it more likely that users will complete a sales transaction. That pattern may be strong enough that it holds true across demographics and, when implemented, meaningfully increases sales. The problem is that it is unlikely that the specific color is the triggering factor. The sales increase is instead likely related either to the context in which the button appears on the page or to what that color means to the site’s customer demographic.

Only by moving from correlation to causation and understanding why that pattern is so predictive can we begin to trust that the pattern will continue to function as expected over time.

Moving from correlation to causation is especially important when it comes to understanding the conditions under which a machine learning model may fail, how long we can expect it to continue being predictive and how widely applicable it may be.
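The purchase-button example can be sketched as a toy simulation of such a failure. It assumes, purely for illustration, that a "clean layout" is what actually drives conversions and that the green button merely tends to appear on clean-layout pages; all names and probabilities here are hypothetical.

```python
import random


def make_sessions(n, color_follows_layout, seed):
    """Simulate sessions as (green_button, converted) pairs.

    Assumed setup: the page layout causes conversions; the button
    color is only associated with the layout, never a cause itself.
    """
    rng = random.Random(seed)
    sessions = []
    for _ in range(n):
        clean_layout = rng.random() < 0.5
        if color_follows_layout:
            # Green appears on clean-layout pages 90% of the time.
            green = clean_layout if rng.random() < 0.9 else not clean_layout
        else:
            # After a redesign, color no longer tracks the layout.
            green = rng.random() < 0.5
        converted = rng.random() < (0.8 if clean_layout else 0.2)
        sessions.append((green, converted))
    return sessions


def rule_accuracy(sessions):
    """Accuracy of the rule 'predict a conversion whenever the button is green'."""
    hits = sum(green == converted for green, converted in sessions)
    return hits / len(sessions)


if __name__ == "__main__":
    before = make_sessions(10_000, color_follows_layout=True, seed=2)
    after = make_sessions(10_000, color_follows_layout=False, seed=3)
    print(f"accuracy while color tracks layout: {rule_accuracy(before):.2f}")
    print(f"accuracy after the link is broken:  {rule_accuracy(after):.2f}")
```

Under these assumptions the color-based rule looks useful only for as long as color happens to track the layout; once that link breaks, its accuracy should fall to roughly a coin flip, even though nothing about the true cause of conversions has changed.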

Using machine learning to identify correlative patterns in data is an extremely powerful approach to understanding both the nuances and biases of our data and the unexpected but very real patterns that our current theoretical understandings failed to point us towards.

On the other hand, when we attempt to reach past this usage towards treating our models as “discovering” or “learning” causative new “natural laws” or concrete “facts” about the world, we tread upon dangerous ground.

Putting this all together, the ease with which modern machine learning pipelines can transform a pile of data into a predictive model without requiring an understanding of statistics or even programming has been a key driving force in machine learning's rapid expansion into industry. At the same time, it has eroded the distinction between correlation and causation, as the new generation of data scientists building and deploying these models conflates their predictive prowess with explanatory power.

In the end, as technology places ever more powerful tools in the hands of those who do not understand how they work, we create great business and societal risk unless we find ways of building interfaces to these models that communicate these distinctions, and issues like data bias, to a growing user community that lacks an awareness of those concerns.
