Classification is the systematic
grouping of objects or events into classes on the basis of properties or
relationships they have in common (Abler, Adams, and Gould, 1971). Groups are commonly understood as clusters
of events or objects defined in terms of similarity. A primary question, however, is what we mean when we say that **A**
is similar to

For instance, points **P** and **Q** on the X-Y plane each have a position value, **P** (*x*₁, *y*₁)
and

For my project, I need to designate a control area to compare against the Detroit Empowerment Zone (DEZ), which serves as the experimental area. The control area must have characteristics similar to those of DEZ, which makes this a typical classification problem. Combining the basic concepts and methods of classification with GIS helps me search for a control area. The search should focus on areas with socioeconomic conditions similar to those of DEZ, because the criteria for designating an EZ are based largely on the socioeconomic condition of the area. The OTUs (operational taxonomic units) in my case are census tracts, because the EZ consists of census tracts.

A simple approach is to consider only the one variable most likely to be critical in representing the socioeconomic characteristics of each census tract. After computing the z-score of each census tract for the selected variable, we draw a thematic map of that variable, which shows clusters of values classified by the z-score distances between every pair of OTUs.[i] We then designate, arbitrarily, a sub-cluster within the larger area that falls into the same class as DEZ, and use it as the control area.
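The single-variable approach can be sketched as follows. This is only a minimal illustration: the poverty rates, the DEZ membership flags, and the fixed break points at z = -1, 0, and 1 are all hypothetical, and the thematic map described above would normally use the GIS software's own classification method rather than fixed breaks.

```python
import numpy as np

def z_scores(values):
    """Standardize a 1-D array to mean 0 and standard deviation 1."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

def classify(z, breaks=(-1.0, 0.0, 1.0)):
    """Assign each z-score to a class index using fixed break points (an assumption)."""
    return np.digitize(z, breaks)

# Hypothetical poverty rates (%) for eight tracts; the first three belong to DEZ.
poverty = np.array([42.0, 38.0, 45.0, 12.0, 40.0, 8.0, 36.0, 15.0])
in_dez = np.array([True, True, True, False, False, False, False, False])

z = z_scores(poverty)
classes = classify(z)

# Tracts outside DEZ that share a class with at least one DEZ tract are candidates.
dez_classes = np.unique(classes[in_dez])
candidates = np.flatnonzero(~in_dez & np.isin(classes, dez_classes))
print(candidates)
```

With these made-up values, the tracts at indices 4 and 6 fall into the same classes as the DEZ tracts and would be the pool from which a control sub-cluster is picked.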

However, no single variable can fully represent the characteristics of an area, even though the unemployment rate or the poverty rate can be a good indicator of a census tract’s socioeconomic condition. In addition, when the variables are correlated with one another, it is difficult to estimate how much each variable on its own is related to the socioeconomic condition. Factor analysis (FA) is a useful tool for classifying areas when we want to consider all of the highly inter-correlated variables simultaneously in a multi-dimensional taxonomic space.

Let me describe how I designated a control area for DEZ using the FA method.

- First, from the 1990 census tract data for the Detroit Metropolitan Area, which has 1080 census tracts (observations/OTUs), I selected 25 variables that might determine socioeconomic condition, covering demographic, income, poverty, education, occupation, and housing features.
- Second, I computed z-scores for all tracts to standardize variables measured in different units.
- Third, I ran the factor analysis in SPSS. After extracting the eigenvalues, I found that four factors had eigenvalues greater than one. However, the promax rotation showed that two factors prominently represent distinct socioeconomic characteristics: Factor 1 largely represents characteristics related to income, and Factor 2 those related to poverty.
- Fourth, based on this information, I ran two analyses to obtain factor scores for each tract. The first used only one factor, which represents the general socioeconomic condition of each tract. The second used two factors, which represent the income condition and the poverty condition respectively.
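The extraction part of the steps above can be sketched in Python. This is not a reproduction of the SPSS run: it assumes principal-component extraction from the correlation matrix, applies the eigenvalue-greater-than-one rule, and omits the promax rotation entirely. The six synthetic variables driven by two latent conditions are made up for illustration and stand in for the 25 census variables.

```python
import numpy as np

def standardize(X):
    """Convert each column to z-scores."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def extract_factors(X):
    """Unrotated principal-component extraction with the eigenvalue > 1 rule."""
    Z = standardize(X)
    R = np.corrcoef(Z, rowvar=False)           # correlation matrix of the variables
    eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    k = int(np.sum(eigvals > 1.0))             # number of factors retained
    loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])
    scores = Z @ eigvecs[:, :k] / np.sqrt(eigvals[:k])  # unit-variance scores
    return eigvals, loadings, scores

# Synthetic data: 200 hypothetical tracts, six variables, two latent conditions.
rng = np.random.default_rng(0)
n = 200
income = rng.normal(size=n)
poverty = rng.normal(size=n)
X = np.column_stack(
    [income + 0.3 * rng.normal(size=n) for _ in range(3)]
    + [poverty + 0.3 * rng.normal(size=n) for _ in range(3)]
)

eigvals, loadings, scores = extract_factors(X)
n_factors = loadings.shape[1]
print(n_factors, np.round(eigvals, 2))
```

The columns of `scores` play the role of per-tract factor scores. Because no rotation is applied here, the retained components need not line up with an "income" and a "poverty" factor the way the rotated SPSS solution did.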

**(Notation in the factor analysis and the thematic map)**

- **fac1_1**: the general socioeconomic condition of each tract, from the one-factor analysis.
- **fac1_2**: the income condition of each tract, from the two-factor analysis.
- **fac2_2**: the poverty condition of each tract, from the two-factor analysis.

**Step I:** A thematic map of the **fac1_1** factor scores was drawn, classifying the census tracts of the Detroit Metropolitan Area by general socioeconomic condition, including income and poverty.

**Step II:** I overlaid the boundary map of DEZ on the thematic map.

**Step III:** I looked for census tracts that could become members of a control area. Potential census tracts (candidates) for the control area are the tracts that satisfy all of the following conditions simultaneously:

- The fac1_1 score of the tract falls within the range "mean of DEZ’s fac1_1 ± 1.96 × standard deviation of DEZ’s fac1_1".
- The fac1_2 score of the tract falls within the range "mean of DEZ’s fac1_2 ± 1.96 × standard deviation of DEZ’s fac1_2".
- The fac2_2 score of the tract falls within the range "mean of DEZ’s fac2_2 ± 1.96 × standard deviation of DEZ’s fac2_2".

**Step IV:** I designated as the control area a cluster made up of candidate tracts that fall into the same class as DEZ in terms of the fac1_1, fac1_2, and fac2_2 factor scores together.
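The candidate rule in Step III can be written as a small filter. The factor scores and DEZ flags below are invented for illustration, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not say which form was used.

```python
import numpy as np

def dez_range(scores, in_dez, z=1.96):
    """Return (low, high) = DEZ mean -/+ z * DEZ standard deviation for one factor."""
    dez = scores[in_dez]
    mean, sd = dez.mean(), dez.std(ddof=1)  # sample SD is an assumption
    return mean - z * sd, mean + z * sd

def candidate_mask(factor_scores, in_dez):
    """Flag non-DEZ tracts whose scores fall inside the DEZ range on every factor."""
    mask = ~in_dez
    for scores in factor_scores.values():
        low, high = dez_range(scores, in_dez)
        mask &= (scores >= low) & (scores <= high)
    return mask

# Hypothetical scores for six tracts; the first three belong to DEZ.
in_dez = np.array([True, True, True, False, False, False])
factor_scores = {
    "fac1_1": np.array([-1.0, -1.2, -0.8, -1.1, 0.9, -0.9]),
    "fac1_2": np.array([-0.9, -1.1, -1.0, -1.0, 1.2, 0.8]),
    "fac2_2": np.array([1.0, 1.2, 1.1, 1.1, -0.7, 1.0]),
}

candidate_ids = np.flatnonzero(candidate_mask(factor_scores, in_dez))
print(candidate_ids)
```

In this toy data only the tract at index 3 passes all three conditions. The tract at index 5 is inside the fac1_1 and fac2_2 ranges but is rejected on fac1_2, which shows why the conditions have to hold simultaneously.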

- Finally, I tested whether the socioeconomic characteristics of the control area, as represented by the factor scores, were also similar to those of DEZ when measured with the raw values of the original 25 variables. After several iterations of the test, I arrived at an area that is not significantly different from DEZ in terms of the general socioeconomic condition, the income condition, and the poverty condition at the same time.
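The text does not name the test that was used, so the following is only one possible sketch of such a check. It assumes a Welch two-sample t statistic per raw variable and a large-sample critical value of 1.96. The variable names and values are hypothetical.

```python
import numpy as np

def welch_t(a, b):
    """Welch's t statistic for the difference between two group means."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return (a.mean() - b.mean()) / se

def differing_variables(dez, control, names, critical=1.96):
    """Names of variables whose |t| exceeds the (assumed) critical value."""
    return [
        name
        for j, name in enumerate(names)
        if abs(welch_t(dez[:, j], control[:, j])) > critical
    ]

# Hypothetical raw values: rows are tracts, columns are variables.
names = ["poverty_rate", "unemployment_rate"]
dez = np.array([[30.0, 10.0], [32.0, 12.0], [34.0, 11.0], [36.0, 13.0]])
control = np.array([[31.0, 20.0], [33.0, 22.0], [35.0, 21.0], [33.0, 23.0]])

flagged = differing_variables(dez, control, names)
print(flagged)
```

Here the hypothetical control area matches DEZ on `poverty_rate` but differs on `unemployment_rate`, so the candidate set would be revised and tested again, which corresponds to the iteration described above. With groups this small, the 1.96 cutoff is only a rough approximation.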

[i] ArcView offers a default classification method termed Natural Breaks. This method identifies breakpoints between classes using a statistical formula (Jenks optimization). The method is rather complex, but in essence the Jenks method minimizes the sum of the variance within each of the classes. Natural Breaks finds groupings and patterns inherent in the data.
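The objective described in this note can be illustrated with a brute-force search over all ways of cutting the sorted values into contiguous classes. This is not ArcView's implementation: it is a sketch that minimizes the total within-class sum of squared deviations, is practical only for small inputs, and uses made-up values.

```python
from itertools import combinations

def within_class_ss(values):
    """Sum of squared deviations from the class mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def natural_breaks(values, n_classes):
    """Split sorted values into contiguous classes with minimal within-class SS."""
    data = sorted(values)
    n = len(data)
    best_cost, best_classes = float("inf"), None
    # Each choice of cut positions defines one candidate classification.
    for cuts in combinations(range(1, n), n_classes - 1):
        bounds = (0, *cuts, n)
        classes = [data[i:j] for i, j in zip(bounds, bounds[1:])]
        cost = sum(within_class_ss(c) for c in classes)
        if cost < best_cost:
            best_cost, best_classes = cost, classes
    return best_classes

# Hypothetical values with three visible groupings.
groups = natural_breaks([25, 1, 11, 2, 26, 10, 3, 12], 3)
print(groups)
```

For these values the search returns the groups `[1, 2, 3]`, `[10, 11, 12]`, and `[25, 26]`, which are the groupings a reader would pick out by eye.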

Abler, Ronald, J. S. Adams, and P. Gould. 1971. *Spatial Organization: The Geographer’s View of the World*: Ch. 6. Englewood Cliffs: Prentice-Hall.

Agresti, Alan and Barbara
Finlay. 1986. *Statistical Methods for the
Social Sciences*: 514-517. San Francisco: Dellen Publishing Company.

Arlinghaus, Sandra. 1999. Course Homepage, NRE 530,
Geography: Spatial Analysis, Theory and Practice. http://www.csfnet.org/530

Chung, Chae Gun and Yalin Chao. Employment for Detroit
Empowerment Zone, 1994~1996. *Course
Project for UP 507 Geographic Information Systems*. April 2000.

Clarke, Keith C. 1999. *Getting
Started With Geographic Information Systems*. Upper Saddle River:
Prentice-Hall.

Kim, Jae-on and Charles W. Mueller. 1978. *Factor Analysis: Statistical Methods and Practical Issues*. Beverly Hills: Sage Publications, Inc.