NikolaNews

The difference between Statistics and Data Science: Big Data and Inferential Statistics

February 24, 2020
in Data Science

This post is based on two insightful threads I read online (see the references below).

Based on these, I address the question of ‘The difference between Statistics and Data Science’. Traditionally, most people, including me, would say that ‘statistics came first and data science builds upon statistics’. This line of thought is valid but, as you will see below, it misses a much bigger picture: that of emphasis. Note that we discuss a purist approach here for the sake of learning. In practice, the domains and the tools are converging.

The two main differences between a purist statistical approach and a data scientist approach are:

  1. The use of big data (common in data science), and
  2. The use of inferential statistics (common in statistics).

 

So, with this background, here are some ways in which a purist statistical approach differs from the typical data science approach:

  • Small data: We are so used to the world of big data that we do not fully appreciate that another world exists: that of ‘small data’. In some domains, such as medicine and clinical trials, small data is very common because the procedures are risky and expensive. So, you end up with only 20 or 30 samples (small data). This leads to a greater reliance on inferential statistics.
  • The use of inferential statistics: Inferential statistics use a random sample of data taken from a population to describe and make inferences about the population. Inferential statistics are valuable when examination of each member of an entire population is not convenient or possible. For example, measuring the diameter of each nail manufactured in a mill is impractical. Instead, you can measure the diameters of a representative random sample of nails and use the information from the sample to make generalizations about the diameters of all of the nails (source: Minitab). Statistics makes more use of the inferential / frequentist approach because of small data sizes (as above).
  • Increased reliance on domain knowledge: The first two points also lead to a greater reliance on domain knowledge in statistics, for example in the choice of features.
  • Confirmatory data analysis: Exploratory data analysis is complemented by confirmatory data analysis, in which hypotheses are formally tested.
  • Increased reliance on statistical tests, many of which are domain specific.
  • Interpretable models: Statistics needs interpretable models as opposed to black box models.
  • Automation: Data science emphasises automation, in contrast to statistics, which involves greater manual intervention due to the above factors (such as the increased use of domain knowledge).
  • Handling outliers and imputation: There is a much greater emphasis on manual correction of outliers and on imputation (filling in missing values).
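
To make the nail example above concrete, here is a minimal sketch in Python of what inference from a small sample can look like. It is only an illustration under stated assumptions: the population of nail diameters is simulated, the numbers (a 3.00 mm mean, a 0.05 mm spread, a sample of 25) are invented, and the helper name `mean_confidence_interval` is hypothetical. The critical value 2.064 is the standard two-sided 95% table value of Student's t for 24 degrees of freedom, so it only fits a sample of exactly 25.

```python
import math
import random
import statistics

# Two-sided 95% critical value of Student's t for 24 degrees of freedom
# (standard table value); it only applies to samples of size 25.
T_CRIT_95_DF24 = 2.064


def mean_confidence_interval(sample, t_crit):
    """Return (mean, lower, upper) for a t-based confidence interval."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean, using the sample standard deviation (n - 1).
    std_err = statistics.stdev(sample) / math.sqrt(n)
    margin = t_crit * std_err
    return mean, mean - margin, mean + margin


# Hypothetical population of nail diameters in mm; all numbers are invented.
rng = random.Random(42)
population = [rng.gauss(3.00, 0.05) for _ in range(100_000)]

# Measure only a small random sample, as in the "small data" setting.
sample = rng.sample(population, 25)
mean, lower, upper = mean_confidence_interval(sample, T_CRIT_95_DF24)

print(f"sample mean: {mean:.3f} mm")
print(f"95% CI for the population mean: [{lower:.3f}, {upper:.3f}] mm")
```

The point of the sketch is the emphasis: with 25 measurements you do not just report the sample mean, you also state how uncertain it is as an estimate for all of the nails. A library such as SciPy could supply the critical value for other sample sizes; it is hard-coded here only to keep the example dependency-free.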

 

To conclude, the difference in approaches originates from the use of small data. The above describes a purist approach; in practice, tools and techniques across the domains are more fluid. References are below (including the comments on these threads). Image source: the pioneering statistician George Box and his book An Accidental Statistician, which made me think that we are all accidental statisticians!

References

Isaac Faber on linkedin – If I had to guess, I would say that curre…

Adrian-Olszewski on Quora – Why do so many statisticians not want t…

 


Credit: Data Science Central. By: Ajit Jaokar
