SI 544 Introduction to Statistics and Data Analysis


Readings, assignments, etc. will be posted to the course ctools website

Lada Adamic


Winter 2008:

Lectures will be on Tuesdays and Thursdays from 9:00 to 10:30 am.
On Thursdays we will usually meet in 409 West Hall; on Tuesdays we will be at the DIAD lab.

Office hours:
Mon 4-5pm

Tues/Thurs 10:30-11:00am



SI 544 project

You will be working in groups, and will share the same grade with your group members. This project will account for 20 points of your final grade.

The idea is to collect some data and analyze it. You can collect new data (e.g. conduct a survey or an experiment) or get a hold of a previously collected data set (e.g. web server logs, data from a previous study). The hypothesis you test should not have been examined before with regard to that particular data set.

Past projects have included showing that basketball players with a history of arrests for violent offences score more often per minute of play, correlating temperatures and beer consumption in Ann Arbor, etc. This year, Hung Truong, Sameer Halai, and James Laing created a Facebook app and survey to see whether people's impressions of the sociability of instruments correspond to sociability on Facebook.

project timeline

turn in on 2/19: select a project topic and formulate the question you will answer with the data

following week: schedule 20 minutes when all members of your group can discuss your topic with the instructor

collect a preliminary set of data

submit a progress report including
  • 1 page explaining your question and a description of how you are collecting the data
  • a copy of a sample of your preliminary data

collect data

perform the data analysis

submit a project report including

  • a plot or a group of plots on a single page showing the most important features of your data
  • a summary statistic answering the question you were posing
  • a discussion of your results, including possible sources of bias, hidden variables, etc.
  • if your results were not statistically significant, estimate how large a data sample you would have needed to get significant results
15 min project presentation
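
The sample-size estimate asked for in the project report can be sketched in a few lines. The following is only an illustration, not the method required for the class. It assumes you are comparing the means of two groups, uses the normal approximation, and adopts the conventional 5% significance level and 80% power, neither of which is specified above. The function name required_n_per_group is made up for this example.

```python
from math import ceil
from statistics import NormalDist


def required_n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample
    comparison of means (normal approximation).

    effect_size is the standardized difference: the difference in group
    means divided by the common standard deviation, e.g. as estimated
    from your own data. alpha and power are assumed conventions.
    """
    if effect_size == 0:
        raise ValueError("effect_size must be non-zero")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    # n per group = 2 * ((z_alpha + z_power) / d)^2, rounded up
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Example: an observed standardized difference of 0.5
n = required_n_per_group(0.5)
print(n)
```

Because this uses the normal rather than the t distribution, it slightly underestimates the sample size needed for small groups. Treat the result as a rough lower bound.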