Predicting Heart Disease using Machine Learning? Don’t!

November 15, 2020
in Data Science
Image Credits: Unsplash

I was recently invited to judge a data science competition. The students were given the ‘heart disease prediction’ dataset, perhaps an improvised version of the one available on Kaggle. I had seen this dataset before, and I often come across self-proclaimed data science gurus teaching naïve people how to predict heart disease through machine learning.

I believe “Predicting Heart Disease using Machine Learning” is a classic example of how not to apply machine learning to a problem, especially one where a lot of domain experience is required.

Let me unpack the various problems in applying machine learning to this data set.

Dive straight into the problem syndrome

Well, this is the first mistake many people make: jumping straight into the problem and asking which machine learning algorithm to apply. Doing EDA as part of this process is not *thinking* about the problem. Rather, it is a sign that you have already accepted the notion that the problem needs a data science solution. Instead, one of the pertinent questions to ask before starting any analysis is, “Is this problem even predictable through the application of machine learning?”

Blind faith in Data

This is an extension of the first point. Diving straight into the problem means you have blind faith in the data. People assume the data to be true and make no effort to scrutinize it. For example, the dataset provided only systolic blood pressure. If you spoke to any doctor or even a paramedic, they would tell you that systolic blood pressure alone does not give the full picture; the diastolic level needs to be reported too. Many don’t even ask the question, “Are the features enough to predict the outcome, or are more features needed?”

Not enough data per patient

Let’s take a look at the data set. If you notice, there is only one data point under each feature for each patient. The fundamental problem here is that features like blood pressure, cholesterol, and heart rate are not static. They vary. A person’s blood pressure varies from hour to hour and from day to day, and so does heart rate. So when it comes to the prediction problem, there is no telling whether a blood pressure of 135 mm Hg was one of the factors behind the heart disease, or whether it was 140, all while the data set might be reporting 130 mm Hg. Ideally, multiple measurements should be taken for each feature for each patient.
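
To make the point concrete, here is a minimal sketch in Python. The readings are invented for illustration and do not come from the competition dataset, and the 140 mm Hg cut-off is an assumption used only to show how a derived feature can flip depending on which single reading happens to be recorded.

```python
import statistics

# Hypothetical systolic readings (mm Hg) for ONE patient over several days.
# These numbers are invented for illustration; they are not from the dataset.
readings = [128, 141, 135, 130, 146, 133, 138, 129]

mean_reading = statistics.mean(readings)
spread = statistics.stdev(readings)
lowest, highest = min(readings), max(readings)

# Illustrative cut-off (an assumption), used only to show how a
# "high blood pressure" flag depends on which reading was recorded.
CUTOFF = 140
above = sum(r >= CUTOFF for r in readings)

print(f"mean: {mean_reading:.1f}, std dev: {spread:.1f}, range: {lowest}-{highest}")
print(f"{above} of {len(readings)} single readings are at or above {CUTOFF} mm Hg")
```

A data set that stores just one of these readings per patient cannot tell you whether that patient looks like a 128 or a 146.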

Now let’s come to the crux of the matter

Applying algorithm without domain experience

One reason for the high failure rate of data science applications in healthcare is that the data scientists applying the algorithms do not have adequate medical knowledge.

Secondly, in healthcare, causality is taken very seriously. Many rigorous clinical and statistical tests are conducted to infer causality.

In the case study, any machine learning algorithm is just trying to map the input to the output while reducing some error metric. Also, the machine learning algorithms by themselves are not classifiers. We turn them into classifiers by setting some cut-off or threshold. Again, this cut-off is not chosen to deduce causality but to get “favorable metrics.”
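
The role of the cut-off can be sketched in a few lines of Python. The scores and labels below are hypothetical, and the helper names `confusion_at` and `precision_recall` are made up for this illustration. The same scores give very different precision and recall depending only on where the threshold is placed.

```python
# Hypothetical model scores (predicted probability of disease) and true labels.
# Both lists are invented for illustration.
scores = [0.95, 0.80, 0.70, 0.60, 0.55, 0.40, 0.35, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]


def confusion_at(threshold):
    """Turn scores into 0/1 predictions at a cut-off and count the outcomes."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        pred = 1 if score >= threshold else 0
        if pred == 1 and label == 1:
            tp += 1
        elif pred == 1 and label == 0:
            fp += 1
        elif pred == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(threshold):
    """Precision and recall of the thresholded predictions."""
    tp, fp, fn, _ = confusion_at(threshold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


for t in (0.3, 0.5, 0.75):
    p, r = precision_recall(t)
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```

Nothing about the patients changes between the three rows; only the cut-off does. That is why a threshold tuned for good-looking metrics says nothing about causation.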

Aggravating this problem is the use of low-code libraries. This case study is a case in point for why low-code libraries can be dangerous. Low-code libraries fit a dozen or more algorithms. Most users are not even aware of how some of these algorithms work! They pick the ‘best’ algorithm based on metrics like F1, precision, recall, and accuracy.

The low-code libraries that fixate on accuracy metrics lead to ‘Goodhart’s law’: ‘When a measure becomes a target, it ceases to be a good measure.’
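
A small, hypothetical illustration of how this plays out: the sketch below assumes a screening set with 95 healthy patients and 5 sick ones, and two made-up "models". It does not reproduce any particular low-code library; it only mimics the step of selecting the candidate with the best accuracy.

```python
# Hypothetical screening set: 95 healthy patients (0) and 5 with disease (1).
labels = [0] * 95 + [1] * 5

# Candidate A never flags anyone.
# Candidate B catches all 5 sick patients but raises 10 false alarms.
preds_a = [0] * 100
preds_b = [1] * 10 + [0] * 85 + [1] * 5


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of truly sick patients that were flagged."""
    flagged = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return flagged / sum(y_true)


candidates = {"never_flag": preds_a, "flag_with_false_alarms": preds_b}

# Mimic automated model selection that targets accuracy alone.
best = max(candidates, key=lambda name: accuracy(labels, candidates[name]))

for name, preds in candidates.items():
    print(f"{name}: accuracy={accuracy(labels, preds):.2f}, recall={recall(labels, preds):.2f}")
print("selected by accuracy:", best)
```

Selecting on accuracy alone picks the candidate that misses every sick patient, which is exactly the situation Goodhart’s law warns about.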

Image Credits: Sketchplanations

If you are predicting, you are implying causation. In healthcare, a mere prediction is not enough. One needs to prove causation. Machine learning classifier algorithms do not answer the ‘causation’ part.

Believing they have solved a real healthcare problem

Last but not least, many believe that by fitting an ML algorithm to a *healthcare* data set and getting some accuracy metrics, they have solved a real healthcare problem. Nothing could be further from the truth, especially in the healthcare domain.

In conclusion:

There are perhaps thousands of business problems that genuinely warrant data science/machine learning solutions. But at the same time, one should not fall into the trap of “To a person with a hammer, everything looks like a nail.” Seeing every problem as a nail (a data science problem) to be struck with the hammer (machine learning algorithms) can be very counterproductive. Much of the 80% failure rate in data science applied to business problems could be attributed to this mindset.

Good data scientists are like good doctors. Good doctors suggest conservative treatments first before prescribing heavy doses of medicine or surgery. Similarly, a good data scientist should ask certain pertinent questions first before blindly applying a dozen ML algorithms to the problem.

Doctor: Surgery :: Data Scientist : Machine learning

Your comments and opinions are welcome.

You can reach out to me on LinkedIn.


Credit: Data Science Central By: Venkat Raman
