Learning Data Science Using Covid-19 Pandemic Data
Published with bookdown
D Appendix D: Learning R
There are many tutorials on basic R commands.
Pending: add some material, refer to tutorials
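As a rough illustration of what "basic R commands" covers, here is a minimal sketch in base R. The numbers and the names `counts`, `toy`, and `later_days` are made up for this example and do not come from the book's data.

```r
# Assign values with <- and combine them into a vector with c()
counts <- c(12, 30, 45, 80)   # made-up daily counts, for illustration only

# Basic summaries of a numeric vector
total <- sum(counts)
average <- mean(counts)

# Build a small data frame (R's table-like structure)
toy <- data.frame(day = 1:4, count = counts)

# Add a derived column, then subset rows with a logical condition
toy$cumulative <- cumsum(toy$count)
later_days <- toy[toy$day > 2, ]

# Inspect the structure and the first rows of the data frame
str(toy)
head(toy)

# Built-in help is available for any function, e.g. ?mean
```

Typing `?` followed by a function name at the R console opens its help page, which is often the quickest way to check arguments before turning to a longer tutorial.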