SI 544 Introduction to Statistics and Data Analysis


Readings, assignments, etc. will be posted to the course ctools website

Lada Adamic



Fall 2010:

Lectures will be Tuesdays and Thursdays from 8:30 to 10:00 am.
Location: NQ1255

Office hours:
TuTh 10:00-10:30 and Fri 2-3pm in NQ4360




This course teaches the fundamentals of statistics, that is, the ability to describe data samples and draw inferences about the populations from which they were drawn. It should also sharpen individual intuition about how to read data, interpret data, and judge others' claims about data.

Learning objectives. At the end of this course students should be able to:
  • construct a data sample appropriate for a given question/hypothesis and understand biases that can be introduced through sampling
  • select appropriate methods to analyze such samples to determine whether the hypothesized effects are statistically significant
  • critically analyze the sampling methods and analysis of others (e.g. don't take at face value what the popular press tries to feed you about the latest health-related finding -- be able to read the source study yourself)
  • stop worrying and love the data

Prerequisites: none

Instructor: Lada Adamic

Reading: There are two required textbooks:

  • Se5 (5th edition) or Se6 (6th edition): Introductory Statistics for the Behavioral Sciences by Welkowitz, Ewen, and Cohen.
  • Re1 (1st edition) or Re2 (2nd edition): Introductory Statistics with R by Dalgaard -- can be downloaded through the UofM library; search for "Introductory Statistics with R"

Accommodations for students with disabilities

Academic integrity policy

We will be using R in class. R is an open-source statistical programming language. You should bring a laptop to every class for hands-on in-class exercises. If you don't have one, please contact the instructor to arrange for a loaner laptop during class time.

Assignments and grading (students will complete a small group project)

see finished projects from Winter '09

Course Syllabus (click on PDF/PPT icon to download lab notes)

# | date | subject | reading | assignment due
1 | Tue 9/7 | intro | S.ch1: Introduction |
2 | Thu 9/9 | descriptive statistics | S.ch2-5 (descriptive statistics); Re1.ch1: Basics or Re2.ch1: Basics and Re2.ch2: the R environment |
3 | Tue 9/14 | probability intro | McClave & Sincich Ch 3 (available on cTools) | PS 1 due 9/15
4 | Thu 9/16 | discrete distributions: the binomial and hypergeometric | Re1.ch2/Re2.ch3: probability and distributions; McClave & Sincich Ch 4.1-4.4, 4.6 (available on cTools) | PS 2 due 9/20
5 | Tue 9/21 | practice with discrete distributions | |
6 | Thu 9/23 | Poisson distribution, transformed scores and the normal distribution | McClave & Sincich Ch 4.5: the Poisson; Se5.ch9/Se6.ch8: Normal distribution; Se5.ch6/Se6.ch7: Z and T scores | PS 3 due 9/27
7 | Tue 9/28 | sampling | A1, A2, A3, A5* |
8 | Thu 9/30 | graphical descriptions of data | Se5.ch9/Se6.ch8: Additional techniques for describing batches of data; Re1.ch3/Re2.ch4: descriptive statistics and graphics |
9 | Tue 10/5 | concepts of statistical inference | Se5.ch8&ch9/Se6.ch9 | PS 4
10 | Thu 10/7 | outliers, confidence intervals, significance testing | Se5/Se6.ch10 |
11 | Tue 10/12 | one sample tests | | PS 5
12 | Thu 10/14 | two sample tests | Se5/Se6.ch11 |
  | Tue 10/19 | -- fall study break -- | |
13 | Thu 10/21 | midterm review | | PS 6 due; form group & select topic by 10/21
14 | Tue 10/26 | midterm (in class, open book) | |
15 | Thu 10/28 | simple linear regression | |
16 | Tue 11/2 | more regression and correlation | Re1.ch5, Re2.ch6 | article review due
17 | Thu 11/4 | analysis of variance | Se6.ch15 & ch17; Se5.ch15 & ch16 |
18 | Tue 11/9 | more analysis of variance | Re1.ch6, Re2.ch7 | PS 7 due
19 | Thu 11/11 | tabular data, chi-squared | Se5.ch17, Se6.ch20 | project progress report due
20 | Tue 11/16 | more tabular data | Re1.ch7, Re2.ch8 | PS 8 due
21 | Thu 11/18 | power, multiple regression | Se5&Se6: ch14 |
22 | Tue 11/23 | logistic regression | Re1: ch9&ch11, Re2: ch10&ch12 | PS 9 due
  | Thu 11/25 | -- Thanksgiving break -- | |
23 | Tue 11/30 | statistical communication (I) | A4* | PS 10 due
24 | Thu 12/2 | catch-up | |
25 | Tue 12/7 | student project presentations | |
26 | Thu 12/9 | student project presentations | |
27 | Tue 12/14 | review session | | project report due 12/13
28 | Thu 12/15 | take-home final given out | | due 12/17

*The following can be obtained from cTools:

  • A1: Freakonomics Introduction: the hidden side of everything
  • A2: Freakonomics 1. What do schoolteachers and sumo wrestlers have in common?
  • A3: Freakonomics 5. What makes a perfect parent?
  • A4: Fairness and the Assumptions of Economics
    Daniel Kahneman; Jack L. Knetsch; Richard H. Thaler
    The Journal of Business, Vol. 59, No. 4, Part 2, 1986
  • A5: Joel Best. 2004. “Chapter 1: Missing Numbers.” in More Damned Lies and Statistics. Berkeley and Los Angeles: University of California Press.

Here are some practice exams:

2006: midterm (solution), final (solution) (tennisdata.txt, tennisballweights.txt, you need to email me for Pew Survey)
2008: midterm (solution), final (solution) (MovieGenresInAsia.txt, MoviesCountryGenre.txt, BoxBudgetRating.txt)

2010: midterm (solution)