UP504
(Prof. Campbell)
Web-Based
Data Bases
last updated January 15, 2002
Sections of this document:
Overview | definitions | US Census | census forms | census geography | census 2000 | other sources | mapping | other issues
Other UP504 class pages of interest:
other useful statistical sites | overview of US Census sources
When you are to gather or construct a data table, there are several dimensions to consider:
1. time (single point in time, comparative statics, time-series)
2. space (geographic location: e.g., city, county, MSA, state, country)
3. unit of analysis (e.g., person, household)
4. variables (e.g., annual income, age, occupation)
exploratory-inductive: But sometimes serendipity leads to unexpected data.
Sample vs. Full Count (Census)
sample size = N
population size = M
sampling fraction = N/M
Normally we assume that N/M -> 0 (that is, one is sampling a very small fraction of the population).
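As a quick illustration of the notation above, the sketch below computes the sampling fraction N/M. The function name and the figures (50,000 sampled households out of a hypothetical population of 100 million) are illustrative assumptions, not values from any census product.

```python
def sampling_fraction(sample_size: int, population_size: int) -> float:
    """Return N/M, the share of the population that is sampled."""
    if population_size <= 0:
        raise ValueError("population size M must be positive")
    if not 0 <= sample_size <= population_size:
        raise ValueError("sample size N must lie between 0 and M")
    return sample_size / population_size


# Hypothetical survey: N = 50,000 households drawn from M = 100,000,000.
small = sampling_fraction(50_000, 100_000_000)

# A full count (census) is the limiting case N = M.
full = sampling_fraction(100_000_000, 100_000_000)

print(f"survey: {small:.4%}   full count: {full:.0%}")
```

A typical survey sits near the N/M -> 0 end of the range, while a full count sits at N/M = 1.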
Data Sources (and Citations)
1. paper
2. electronic based on a paper published version
3. electronic with no paper published source
(also: data tapes)
1. Netscape Navigator (to view this document) -- or use any browser
2. Netscape Composer (to create this document) -- or any other web page authoring application.
3. FTP (to download and upload this page to my ifs space so that it is available on the web). One Mac version is Fetch.
4. Excel -- to analyze downloaded data (or use SPSS, SAS, Systat, etc.)
5. Adobe Acrobat (to read formatted .pdf files)
census
OED, 2nd ed.
census se.nss, sb. [L. census registering of Roman citizens and their property, registered property, wealth, f. censere to rate, assess, estimate. ]
1. The registration of citizens and their property in ancient Rome for purposes of taxation.
2. Applied to certain taxes, esp. a capitation or poll-tax. Obs.
3. a. An official enumeration of the population of a country or district, with various statistics relating to them. Also attrib.
A census of the population has been taken every tenth year since 1790 in the United States of America, since 1791 in France, and since 1801 in Great Britain. In Ireland the earliest census was in 1813, since which it has been taken simultaneously with that of Great Britain.
b. attrib., as in census return, -table, -taker; census-paper, a paper left at each house, to be filled up with the names, ages, etc., of the inmates, and returned to the enumerators on the day of taking the census.
-----
ENCYCLOPAEDIA BRITANNICA
http://www.britannica.com
census
an enumeration of people, houses, firms, or other important items in a country or region at a particular time. Used alone, the term usually refers to a population census--the type to be described in this article. However, many countries take censuses of housing, manufacturing, and agriculture.
-----
statistic
OED, 2nd ed.
statistic stati.stik, a. and sb. [ad. G. statistik sb.
statistisch adj., Fr. statistique adj. and fem. sb., ad. mod.L. statisticus,
f. *statista (Ital. statista) statist. Cf. Ital. statistico adj.,
statistica sb., Sp., Pg. estadístico adj., estadística
sb. The earliest known occurrence of the word seems to be in the title
of the satirical work Microscopium Statisticum, by `Helenus Politanus',
Frankfort (?), 1672. Here the sense is prob. `pertaining
to statists or to statecraft' (cf. statistical a. 1). The earliest use
of the adj. in anything resembling its present meaning is found in mod.L.
statisticum collegium, said to have been used by Martin
Schmeizel (professor at Jena, died 1747) for a course of lectures on the
constitutions, resources, and policy of the various States of the
world. The G. statistik was used as a name for this department
of knowledge by G. Achenwall in his Vorbereitung zur Staatswissenschaft
(1748); the context shows that he did not regard the term
as novel. The Fr. statistique sb. is cited by Littré
from Bachaumont (died 1771); Fr. writers of the 18th c. refer to Achenwall
as having brought the word into use. The sense-development of the
word may have been influenced by the notion that it was
a direct derivative of L. status state sb. ]
B. sb.
1.
a. = statistics 1. rare.
b. A quantitative fact or statement.
c. Statistics. Any of the numerical characteristics of
a sample (as opposed to one of the population from which it is drawn).
Cf. parameter 2 f.
2. = statistician.
-------
sample
sample s.mp'l, sb. Forms: 4 sampel, saumpel, -pul, -ple,
saunpil, 4-5 saumpil, 4-6 sampill, saumple, 5 sampil(le, sampull, saumpyl,
4- sample. [ME. sample, aphetic f. essample: see
example sb. ]
1. A fact, incident, story, or suppositious case, which serves to illustrate, confirm, or render credible some proposition or statement. (Cf. example sb. 1.) Obs.
2.
a. A relatively small quantity of material, or an individual
object, from which the quality of the mass, group, species, etc. which
it represents may be inferred; a specimen. Now chiefly Comm., a
small quantity of some commodity, presented or shown
to customers as a specimen of the goods offered for sale. (An individual
article offered as a specimen of goods sold by number and not by
weight or measure is now more commonly called a pattern.)
b. of immaterial things.
c. A specimen taken for scientific testing or analysis.
d. Statistics. A portion drawn from a population, the
study of which is intended to lead to statistical estimates of the attributes
of the whole population.
The term "census" has at least three common uses:
1. as a type of count: a full count (at least in theory) rather than a sample
2. as a data set: the actual count of the U.S. population every ten years. Hence Decennial censuses (every 10 years - 1980, 1990, 2000, etc.)
3. as a government agency: the government agency that administers this count (the Bureau of the Census, which is under the Department of Commerce).
Note: the decennial census is but one of MANY sets of data that the agency collects.
The U.S. Constitution provides for a census of the population every 10 years, primarily to establish a basis for apportionment of members of the House of Representatives among the States. For over a century after the first census in 1790, the census organization was a temporary one, created only for each decennial census. In 1902, the Bureau of the Census was established as a permanent Federal agency, responsible for enumerating the population and also for compiling statistics on other subjects. Historically the census of population has been a complete count. That is, an attempt is made to account for every person, for each person's residence, and for other characteristics (sex, age, family relationships, etc.). Since the 1940 census, in addition to the complete count information, some data have been obtained from representative samples of the population. In the 1990 census, variable sampling rates were employed. For most of the country, 1 in every 6 households (about 17 percent) received the long form or sample questionnaire; in governmental units estimated to have fewer than 2,500 inhabitants, every other household (50 percent) received the sample questionnaire to enhance the reliability of sample data for small areas. Exact agreement is not to be expected between sample data and the complete census count. Sample data may be used with confidence where large numbers are involved and assumed to indicate trends and relationships where small numbers are involved.
Census data presented here have not been adjusted for underenumeration. Results from the evaluation program for the 1990 census indicate that the overall national undercount was between 1 and 2 percent: the estimate from the Post Enumeration Survey (PES) was 1.6 percent and the estimate from Demographic Analysis (DA) was 1.8 percent. Both the PES and DA estimates show disproportionately high undercounts for some demographic groups. For example, the PES estimates of percent net undercount for Blacks (4.4 percent), Hispanics (5.0 percent), and American Indians (4.5 percent) were higher than the estimated undercount of non-Hispanic whites (0.7 percent). Historical DA estimates demonstrate that the overall undercount rate in the census has declined significantly over the past 50 years (from an estimated 5.4 percent in 1940 to 1.8 percent in 1990), yet the undercount of Blacks has remained disproportionately high.
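To see what these undercount rates imply, here is a minimal sketch. It assumes the percent net undercount is defined as (true population - census count) / true population, so the implied true population is count / (1 - rate). The function name and the 100,000 census count are hypothetical; only the PES percentages come from the paragraph above.

```python
def adjusted_count(census_count: float, net_undercount_pct: float) -> float:
    """Estimate the population implied by a net undercount rate.

    Assumes rate = (true - counted) / true, so true = counted / (1 - rate).
    """
    rate = net_undercount_pct / 100.0
    if rate >= 1:
        raise ValueError("undercount rate must be below 100 percent")
    return census_count / (1.0 - rate)


# PES net undercount estimates for 1990, as quoted in the text.
pes_rates = [
    ("overall", 1.6),
    ("Blacks", 4.4),
    ("Hispanics", 5.0),
    ("American Indians", 4.5),
    ("non-Hispanic whites", 0.7),
]

# Hypothetical: suppose the census counted 100,000 people in each group.
for group, pct in pes_rates:
    implied = adjusted_count(100_000, pct)
    print(f"{group:20s} {pct:4.1f}%  implied population: {implied:,.0f}")
```

The same census count implies a noticeably larger true population for the groups with the higher undercount rates.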
link: The 2000 U.S. Census
Where is each person counted?
The 2000 Census Residence Rules, including for foreigners
For the 1990 Census:
"Each person included in the census was to be counted at his or her usual residence--the place where he or she lives and sleeps most of the time or the place where the person considers to be his or her usual home. If a person had no usual residence, the person was to be counted where he or she was staying on April 1, 1990.
Persons temporarily away from their usual residence, whether in the United States or overseas, on a vacation or on a business trip, were counted at their usual residence. Persons who occupied more than one residence during the year were counted at the one they considered to be their usual residence. Persons who moved on or near Census Day were counted at the place they considered to be their usual residence."
How about students?
"Persons Away at School-- College students were counted as residents of the area in which they were living while attending college, as they have been since the 1950 census. Children in boarding schools below the college level were counted at their parental home"
APPENDIX D. Collection and Processing Procedures
questionnaire type | who received the questionnaire | Format of Compiled Census Data (Summary Tape File) |
long form | a sample (1/6, 1/2, or 1/8 of households receive this form, depending on the population size of the location; overall: 1-in-6; see documentation on sampling rates) | STF3 |
short form | full count (every household receives this form) | STF1 |
In between the 10 Year Census -- How are population estimates made?
Current Population Survey (CPS)
This is a monthly nationwide survey
of a scientifically selected sample representing the noninstitutional civilian
population. The sample is located in 754 areas comprising 2,121 counties,
independent cities, and minor civil divisions with coverage in every State
and the District of Columbia and is subject to sampling error. At the present
time, about 50,000 occupied households are eligible for interview every
month; of these between 4 and 5 percent are, for various reasons, unavailable
for interview.
While the primary purpose of the CPS
is to obtain monthly statistics on the labor force, it also serves as a
vehicle for inquiries on other subjects. Using CPS data, the Bureau issues
a series of publications under the general title of Current Population
Reports, which cover population characteristics (P20), consumer income
(P60), special studies (P23), and other topics.
Urban and rural
Hispanic (may be of any race, so do not add Hispanic counts to the racial categories: Hispanic origin cuts across them)
see US Census definition
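The warning about Hispanic origin can be made concrete with a small sketch. All counts below are hypothetical (one made-up tract of 1,000 people), and the category labels are simplified for illustration.

```python
# Hypothetical counts for one tract (not real census figures).
total_population = 1_000
by_race = {"White": 700, "Black": 200, "Asian": 60, "Other": 40}
hispanic_origin = {"Hispanic": 150, "Not Hispanic": 850}


def shares(counts: dict, total: int) -> dict:
    """Convert counts to percentages of the total."""
    return {name: 100.0 * n / total for name, n in counts.items()}


race_pct = shares(by_race, total_population)
origin_pct = shares(hispanic_origin, total_population)

# Each classification accounts for everyone on its own.
race_sum = sum(race_pct.values())
origin_sum = sum(origin_pct.values())

# Wrong: treating "Hispanic" as one more racial category double counts
# the Hispanic residents, who already appear under some race.
wrong_sum = race_sum + origin_pct["Hispanic"]

print(f"race: {race_sum:.1f}%  origin: {origin_sum:.1f}%  mixed: {wrong_sum:.1f}%")
```

If a table of categories sums to more than 100 percent, mixing the two classifications in this way is one of the first things to check.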
A Hierarchy of Census Areas (from the 1990 Census): from BIG to small
1 | Nation (US) |
4 | Regions (e.g., Midwest) |
9 | Divisions (e.g., East North Central) |
57 | States and Statistically Equivalent Entities (e.g., Michigan) |
3,248 | Counties and Statistically Equivalent Entities (e.g., Washtenaw) |
60,228 | County Subdivisions and Places (e.g., Ann Arbor) |
576 | American Indian and Alaska Native Areas |
62,276 | Census Tracts and Block Numbering Areas (BNAs) |
229,192 | Block Groups (BGs) |
7,017,427 | Blocks |
What are blocks?
"Census blocks are small areas bounded on all sides by visible features such as streets, roads, streams, and railroad tracks, and by invisible boundaries such as city, town, township, and county limits, property lines, and short, imaginary extensions of streets and roads."
source: technical documentation
Geographic Areas: MSAs, CMSAs, etc.
Metropolitan Areas: Detroit as an example
35 | Detroit-Ann Arbor-Flint, MI CMSA |
35 0440 | Ann Arbor, MI PMSA |
35 0440 26091 | Lenawee County |
35 0440 26093 | Livingston County |
35 0440 26161 | Washtenaw County |
35 2160 | Detroit, MI PMSA |
35 2160 26087 | Lapeer County |
35 2160 26099 | Macomb County |
35 2160 26115 | Monroe County |
35 2160 26125 | Oakland County |
35 2160 26147 | St. Clair County |
35 2160 26163 | Wayne County |
35 2640 | Flint, MI PMSA |
35 2640 26049 | Genesee County |
Population in the Detroit-Ann Arbor-Flint, MI CMSA and its three component PMSAs, 1980 - 1994 (in thousands)
METROPOLITAN AREA | 1980 | 1990 | 1991 | 1992 | 1993 | 1994 | % change 1980-90 | % change 1990-94 |
Detroit-Ann Arbor-Flint, MI CMSA | 5,293 | 5,187 | 5,215 | 5,236 | 5,246 | 5,256 | -2.0 | 1.3 |
Ann Arbor, MI PMSA | 455 | 490 | 498 | 504 | 509 | 515 | 7.7 | 5.1 |
Detroit, MI PMSA | 4,388 | 4,267 | 4,285 | 4,299 | 4,304 | 4,307 | -2.8 | 0.9 |
Flint, MI PMSA | 450 | 430 | 432 | 432 | 433 | 433 | -4.4 | 0.7 |
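The last two columns of the table are consistent with percent changes in population over each period. The sketch below recomputes them from the 1980, 1990, and 1994 figures in the table; the dictionary layout and function name are just one convenient way to set this up.

```python
# Population in thousands, taken from the table above.
population = {
    "Detroit-Ann Arbor-Flint, MI CMSA": {1980: 5293, 1990: 5187, 1994: 5256},
    "Ann Arbor, MI PMSA": {1980: 455, 1990: 490, 1994: 515},
    "Detroit, MI PMSA": {1980: 4388, 1990: 4267, 1994: 4307},
    "Flint, MI PMSA": {1980: 450, 1990: 430, 1994: 433},
}


def pct_change(start: float, end: float) -> float:
    """Percent change from start to end, rounded to one decimal place."""
    return round(100.0 * (end - start) / start, 1)


for area, pop in population.items():
    change_80_90 = pct_change(pop[1980], pop[1990])
    change_90_94 = pct_change(pop[1990], pop[1994])
    print(f"{area:34s} 1980-90: {change_80_90:5.1f}   1990-94: {change_90_94:5.1f}")
```

Note that the base of each percentage is the starting year, and that the two periods differ in length (ten years versus four), so the two columns are not directly comparable as growth rates.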
GUIDE TO FIPS CODES:
MSA= Metropolitan Statistical Area
CMSA= Consolidated Metropolitan Statistical Area
PMSA= Primary Metropolitan Statistical Area
SS= State
CCC= County
PPPPP= Place (city/town)
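The codes in the Detroit listing above can be split into their parts with a few lines of code. This is a minimal sketch that assumes the space-separated layout seen in that listing: a metro (CMSA) code, then an optional PMSA code, then an optional five-digit SSCCC field (state plus county). The class and function names are hypothetical.

```python
from typing import NamedTuple, Optional


class MetroCode(NamedTuple):
    metro: str              # CMSA (or MSA) code, e.g. "35"
    pmsa: Optional[str]     # PMSA code, e.g. "0440"
    state: Optional[str]    # SS, e.g. "26"
    county: Optional[str]   # CCC, e.g. "161"


def parse_metro_code(code: str) -> MetroCode:
    """Split a code such as '35 0440 26161' into its components."""
    parts = code.split()
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected 1 to 3 fields, got {len(parts)}: {code!r}")
    pmsa = parts[1] if len(parts) >= 2 else None
    state = county = None
    if len(parts) == 3:
        sscc = parts[2]
        if len(sscc) != 5 or not sscc.isdigit():
            raise ValueError(f"expected a 5-digit SSCCC field, got {sscc!r}")
        # First two digits are the state, last three the county.
        state, county = sscc[:2], sscc[2:]
    return MetroCode(parts[0], pmsa, state, county)


print(parse_metro_code("35 0440 26161"))  # Washtenaw County row
print(parse_metro_code("35 2160"))        # Detroit, MI PMSA row
```

Keeping the codes as strings rather than integers preserves leading zeros such as the one in "0440".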
Type of Metropolitan Area | Description | Number | Example |
MSA (metropolitan statistical area) | stand-alone metro area (a county or counties) | 268 | (e.g., Lansing-East Lansing, MI MSA) |
CMSA (consolidated MSA) | a very large metro area, consisting of a collection of PMSAs | 21 | (e.g., Detroit-Ann Arbor-Flint, MI CMSA) |
PMSA (primary MSA) | a subset of CMSAs | 73 | (e.g., Ann Arbor, MI PMSA) |
New York CMSA has 15 PMSAs
LA CMSA has four (albeit big ones)
Detroit CMSA has three: Ann Arbor, Detroit, and Flint.
MA (Metropolitan Area) The MA classification
is a statistical standard developed for use by Federal agencies in the
production, analysis, and publication of data on MAs. The MAs are designated
by the Office of Management and Budget. Metropolitan Areas can be classified
as a Metropolitan Statistical Area (MSA) or as a Consolidated Metropolitan
Statistical Area (CMSA), that is, an MA divided into Primary Metropolitan
Statistical Areas (PMSAs). See also MSA/CMSA/PMSA.
PMSA (Primary Metropolitan Statistical
Area) An area defined by the Office of Management and Budget as a Federal
statistical standard, comprised of one or more counties (county subdivisions
in New England), within a metropolitan area, having a population of 1,000,000
or more. When PMSAs are established, the larger area of which they are
component parts is designated a Consolidated Metropolitan Statistical Area.
CMSA (Consolidated Metropolitan
Statistical Area) An area defined by the Office of Management and Budget
as a Federal statistical standard. In metropolitan areas where Primary
Metropolitan Statistical Areas (PMSAs) are defined, the larger area of
which the PMSAs are components is designated a CMSA.
MSA (Metropolitan Statistical
Area) An area defined by the Office of Management and Budget as a Federal
statistical standard. An area qualifies for recognition as an MSA if it
includes a city of at least 50,000 population or an urbanized area of at
least 50,000 with a total metropolitan area population of at least 100,000.
See also MA.
NECMA (New England County Metropolitan
Area) A county-based equivalent to the official metropolitan areas in the
six New England States, where the standard components are county subdivisions
(cities and towns) instead of counties as in other states.
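The MSA threshold quoted in the glossary above can be written as a simple check. This sketch assumes one reading of that sentence, in which the 100,000 total applies only when the area qualifies through an urbanized area rather than a city. The function and parameter names are hypothetical, and the official OMB standards contain further criteria not reproduced here.

```python
def qualifies_as_msa(largest_city_pop: int,
                     urbanized_area_pop: int,
                     total_metro_pop: int) -> bool:
    """Apply the population thresholds from the glossary definition of MSA."""
    # Route 1: the area includes a city of at least 50,000.
    has_large_city = largest_city_pop >= 50_000
    # Route 2 (assumed reading): an urbanized area of at least 50,000,
    # with a total metropolitan population of at least 100,000.
    has_urbanized_area = (urbanized_area_pop >= 50_000
                          and total_metro_pop >= 100_000)
    return has_large_city or has_urbanized_area


# Hypothetical areas.
print(qualifies_as_msa(60_000, 60_000, 80_000))    # city route
print(qualifies_as_msa(30_000, 55_000, 120_000))   # urbanized-area route
print(qualifies_as_msa(30_000, 55_000, 90_000))    # neither route
```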
Metropolitan Areas (MA's)
The general concept of a metropolitan
area is one of a core area containing a large population nucleus, together
with adjacent communities that have a high degree of social and economic
integration with that core.
Metropolitan statistical areas (MSA's),
consolidated metropolitan statistical areas (CMSA's),
and primary metropolitan statistical areas (PMSA's)
are defined by the Office of Management and Budget (OMB) as a standard for Federal agencies in the preparation and publication of statistics relating to metropolitan areas.
The entire territory of the United
States is classified as metropolitan (inside MSA's or CMSA's; PMSA's
are components of CMSA's) or nonmetropolitan (outside MSA's or CMSA's).
MSA's, CMSA's, and PMSA's are defined in terms of entire counties except in New England, where the definitions are in terms of cities and towns. The OMB also defines New England County Metropolitan Areas (NECMA's) which are countybased alternatives to the MSA's and CMSA's in the six New England States. From time to time, new MA's are created and the boundaries of others change. As a result, data for MA's over time may not be comparable and the analysis of historical trends must be made cautiously. For descriptive details and a listing of titles and components of MA's, see Appendix II.
Also, New England has NECMAs (New England County Metropolitan Areas): county-based alternatives to the standard MAs, which in New England are defined in terms of cities and towns.
The 2000 Census -- Early Results
2000 Census: FAQ (frequently asked questions)
new in 2000: ability to select multiple racial categories.
http://www.census.gov/population/www/censusdata/c2kproducts.html
timetable of data product releases from the 2000 Census
format of data made available: "Census 2000 data will be disseminated
mainly using a new data retrieval system called the American
FactFinder (AFF)"
http://www.census.gov/datamap/fipslist/mafips96.txt
American FactFinder (the US Census Bureau's new interactive database engine)
http://factfinder.census.gov/servlet/BasicFactsServlet
explain .pdf files and Adobe Acrobat Reader.
see State and Metropolitan Area Data Book
http://www.un.org/Pubs/CyberSchoolBus/special/habitat/profiles/
http://www.lib.umich.edu/libhome/Documents.center/stats.html
http://www.lib.umich.edu/libhome/Documents.center/michstat.html
one example:
http://www.cdc.gov/nchswww/products/pubs/pubd/other/atlas/atlas.htm
http://www.esri.com/data/online/mapstudio.html
What to do with missing data.
What to do with categories not adding to 100%. (rounding error? missing data? double counting? e.g., with Hispanic wrongly added to race.)
How to deal with suppressed data.
Interpolation and extrapolation.
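For the last item, a minimal sketch of straight-line interpolation and extrapolation, using the Ann Arbor, MI PMSA figures from the population table earlier in these notes (455 thousand in 1980, 490 in 1990, 515 in 1994). The function name is hypothetical, and a straight line is only one possible assumption about how population changes between data points.

```python
def linear_estimate(x0: float, y0: float, x1: float, y1: float, x: float) -> float:
    """Value at x on the straight line through (x0, y0) and (x1, y1)."""
    if x0 == x1:
        raise ValueError("the two known points must have different x values")
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (x - x0)


# Ann Arbor, MI PMSA population in thousands.
# Interpolation: estimate a year between two known data points.
interp_1985 = linear_estimate(1980, 455, 1990, 490, 1985)

# Extrapolation: project beyond the last known data point.
extrap_1995 = linear_estimate(1990, 490, 1994, 515, 1995)

print(f"1985 (interpolated): {interp_1985:.1f} thousand")
print(f"1995 (extrapolated): {extrap_1995:.1f} thousand")
```

Interpolation stays within the range of observed data; extrapolation assumes the recent trend continues and becomes less reliable the further out it is pushed.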