SI 544 Introduction to Statistics and Data Analysis
Resources
• SI 544 home
• cTools (readings, assignments, etc. will be posted to the course cTools website)
• problem sets
• software tools for the class
• other resources

Instructor: Lada Adamic

Schedule
Winter 2008: Lectures will be Tuesdays and Thursdays from 9:00 to 10:30 am. On Thursdays we will usually meet in 409 West Hall; on Tuesdays we will be at the DIAD lab.
Office hours: Mon 4-5pm, Tues/Thurs 10:30-11:00am

Description

This course teaches the fundamentals of statistics, that is, the ability to describe data samples and draw inferences about the populations from which they were drawn. It should also sharpen individual intuition about how to read data, interpret data, and judge others' claims about data.

Specifically, at the end of this course students should be able to:
• characterize population data intuitively for themselves and others;
• draw conclusions and inferences from population data;
• check assumptions of others' claims and debug their putative "facts";
• look for correlations while controlling for confounding effects.
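
As a rough illustration of the first two goals (describing a sample and drawing an inference about its population), here is a minimal sketch in Python. The course itself works in R. The data values, the function names `describe` and `normal_ci`, and the use of a normal approximation with z = 1.96 are all assumptions made for this example. For a sample this small, a t-based interval would usually be more appropriate.

```python
import math
import statistics

# Hypothetical sample (made-up numbers, not course data).
sample = [4.0, 5.0, 6.0, 7.0, 8.0]


def describe(values):
    """Return the sample mean and the sample standard deviation (n - 1)."""
    return statistics.mean(values), statistics.stdev(values)


def normal_ci(values, z=1.96):
    """Approximate 95% confidence interval for the population mean.

    Uses the normal approximation: mean +/- z * s / sqrt(n).
    """
    mean, sd = describe(values)
    half_width = z * sd / math.sqrt(len(values))
    return mean - half_width, mean + half_width


mean, sd = describe(sample)
low, high = normal_ci(sample)
print(f"mean={mean:.2f}, sd={sd:.2f}, approx. 95% CI=({low:.2f}, {high:.2f})")
```

The descriptive step summarizes only the sample at hand. The interval is the inferential step: it makes a hedged statement about the population mean, and its width shrinks as the sample size grows.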

Prerequisites: none

Reading: We will be using two textbooks:

• Introductory Statistics for the Behavioral Sciences (5th or 6th Edition) by Welkowitz, Ewen, and Cohen.
• Using R for Introductory Statistics by John Verzani

Both books are required and will be available at Ulrich's.

Assignments and grading (students will complete a small group project)

| # | date | subject | reading | assignment due |
|---|------|---------|---------|----------------|
| 1 | Thu 1/3 | intro | S.ch1: Introduction | |
| 2 | Tue 1/8 | lab: descriptive statistics | S.ch2-5 (descriptive statistics); R.ch1: data; R.ch2: univariate data | |
| 3 | Thu 1/10 | probability intro | McClave & Sincich Ch 3 (available on cTools) | |
| 4 | Tue 1/15 | lab: discrete distributions | R.ch5: describing populations | PS 1 |
| 5 | Thu 1/17 | continuous distributions | S.ch9 | |
| 6 | Tue 1/22 | lab: scatter plots and transformed scores | McClave & Sincich Ch 4 & 5 (available on cTools) | PS 2 |
| 7 | Thu 1/24 | sampling | A1, A2, A3* | |
| 8 | Tue 1/29 | multivariate data | R.ch4: multivariate data | PS 3 |
| 9 | Thu 1/31 | concepts of statistical inference | S.ch8 | |
| 10 | Tue 2/5 | lab: outliers, confidence intervals | R 7.1-7.4: Confidence intervals | PS 4 |
| 11 | Thu 2/7 | significance testing | S.ch10, S.ch11 | |
| 12 | Tue 2/12 | lab: one and two sample tests | R.ch8 | PS 5 |
| 13 | Thu 2/14 | simple linear regression | S.ch12, S.ch13 | |
| 13 | Tue 2/19 | review for midterm | catch-up on reading | PS 6 |
| 14 | Thu 2/21 | midterm | | |
| | Tue 2/26 | -- spring break -- | | |
| | Thu 2/28 | -- spring break -- | | |
| 15 | Tue 3/4 | lab: simple linear regression and correlation | R 3.3-3.4 and R 10.1-10.2 | project progress report |
| 16 | Thu 3/6 | analysis of variance | S.ch15 & ch16 | |
| 17 | Tue 3/11 | lab: analysis of variance | R.ch11 | PS 7 |
| 18 | Thu 3/13 | statistical communication (I) | A4*, A5* | |
| 19 | Tue 3/18 | article review | | |
| 20 | Thu 3/20 | tabular data | S.ch17 | |
| 21 | Tue 3/25 | lab: tabular data | R 8, 9.1-9.2 | PS 8 |
| 22 | Thu 3/27 | power, multiple regression | S.ch14 | |
| 23 | Tue 4/1 | guest lecture | | PS 9 |
| 24 | Thu 4/3 | logistic regression | | |
| 25 | Tue 4/8 | lab: multiple & logistic regression | R 10.3, R 12.1 | project report |
| 26 | Thu 4/10 | student project presentations | | |
| 27 | Tue 4/15 | review (leftovers in R: ) | | take home final given out, due 4/18 |

*The following can be obtained from cTools:

• A1: Freakonomics Introduction: the hidden side of everything
• A2: Freakonomics 1. What do schoolteachers and sumo wrestlers have in common?
• A3: Freakonomics 5. What makes a perfect parent?
• A4: Fairness and the Assumptions of Economics
Daniel Kahneman; Jack L. Knetsch; Richard H. Thaler
The Journal of Business, Vol. 59, No. 4, Part 2, 1986
• A5: Joel Best. 2004. “Chapter 1: Missing Numbers.” in More Damned Lies and Statistics. Berkeley and Los Angeles: University of California Press.

Here are some practice exams:

2006: midterm (solution), final (solution) (tennis data set, you need to email me for Pew Survey)
2008: midterm (solution), final (solution) (MovieGenresInAsia.txt, MoviesCountryGenre.txt, BoxBudgetRating.txt)