Johns Hopkins Covid-19 Data and R, Part I — data.table handling.

May 7, 2020
in Data Science

Summary: This post showcases the handling of the daily Covid-19 case and death data for the U.S. published by the Center for Systems Science and Engineering at Johns Hopkins University. The data is managed and explored in R with its splendid data.table package. Analysts with several months of R experience should benefit from the notebook.

It’s pretty hard to consume any analytics media these days without seeing explorations of Covid-19 data. I was late to Covid EDA, but am now all in, hoping I can make even a small contribution to the pandemic response. A good starting point for Covid data is the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University, my alma mater. The CSSE maintains a Covid-19 dashboard and posts confirmed case and fatality files daily for the U.S. and the world.

I started looking at that data about a week ago using R, planning later to examine the same data with Python and Julia. The downloadable case and death files are reminiscent of spreadsheets: each file carries an ever-expanding repeating group of date columns holding the cumulative case/death counts. The granularity of the data is county or other jurisdiction within state, so ultimately a normalized relational structure would key on the combination of state, jurisdiction, and date. A problem with the data, noted on the website, is that “The time series tables are subject to be updated if inaccuracies are identified in our historical data. The daily reports will not be adjusted in these instances to maintain a record of raw data.” In other words, there are some anomalies in the data that must be accounted for. I try to manage around them as best I can with summarization and moving averages.
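The reshaping described above can be sketched independently of any particular tool. The article itself works in R with data.table; what follows is only a minimal, standard-library Python illustration, not the author's code. The column names (`state`, `county`), the date-header format, and all of the numbers are made up for the example, and the first date is assumed to be a zero baseline. It melts the wide date columns into one record per state, county, and date, differences the cumulative counts into daily counts, and smooths them with a trailing moving average.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical miniature of a wide file: one row per (state, county),
# one column per date holding the CUMULATIVE count. All values are invented.
wide_rows = [
    {"state": "Maryland", "county": "Baltimore",
     "4/1/20": 0, "4/2/20": 4, "4/3/20": 3, "4/4/20": 10},
    {"state": "Maryland", "county": "Howard",
     "4/1/20": 0, "4/2/20": 2, "4/3/20": 5, "4/4/20": 5},
]
ID_COLS = ("state", "county")


def melt(rows, id_cols=ID_COLS):
    """Turn every date column into its own (state, county, date) record."""
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col in id_cols:
                continue
            day = datetime.strptime(col, "%m/%d/%y").date()
            long_rows.append({"state": row["state"], "county": row["county"],
                              "date": day, "cumulative": value})
    # Order by the natural key so differencing sees dates in sequence.
    long_rows.sort(key=lambda r: (r["state"], r["county"], r["date"]))
    return long_rows


def add_daily_and_average(long_rows, window=3):
    """Add per-county daily counts and a trailing moving average in place."""
    previous = {}                 # key -> last cumulative value seen
    history = defaultdict(list)   # key -> daily values seen so far
    for r in long_rows:
        key = (r["state"], r["county"])
        # Assumption: the series starts from a zero baseline.
        r["daily"] = r["cumulative"] - previous.get(key, 0)
        previous[key] = r["cumulative"]
        history[key].append(r["daily"])
        recent = history[key][-window:]
        r["moving_avg"] = sum(recent) / len(recent)
    return long_rows


long_rows = add_daily_and_average(melt(wide_rows))
for r in long_rows:
    print(r["county"], r["date"], r["cumulative"], r["daily"],
          round(r["moving_avg"], 2))
```

In this toy data the Baltimore cumulative count is revised downward from one day to the next, so the differenced daily value goes negative. That is the kind of anomaly the quoted caveat warns about; the moving average dampens it but does not remove it.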

Any data management work I do in R is built on the nonpareil data.table package, which adds immeasurable functionality to R’s native data.frame. A newbie serious about learning R for analytics should make an investment in data.table. It’ll take some time, but the rewards are well worth the effort. Python programmers are starting to see the Python datatable package, modeled on R’s data.table, as a competitor to the venerable Pandas.

This is the first of a two-part series on R with the CSSE case/fatality data. Part I here details the loading/shaping/grouping of the data, while Part II will explore the data using ggplot. My hope is that readers will find some of the code useful in their own work.

The supporting platform is a Wintel (Windows 10) notebook with 128 GB RAM, running JupyterLab 1.2.4 and R 3.6.2. The R data.table, tidyverse, pryr, plyr, fst, and knitr packages are featured, as well as functions from my personal stash.


Credit: Data Science Central. By: Steve Miller
