Areal Interpolation: Acquiring Employment and Business Establishment Data Using GIS
We need annual employment and business establishment data by census tract
for the Detroit Empowerment Zone (DEZ), but these data are not issued by
census tract. The U.S. Department of Commerce does, however, issue employment
and establishment data annually by zip code, in a series called Zip Code
Business Patterns. The key problem is therefore how to obtain the necessary
data when the statistical boundary for which data exist does not coincide in
space with the boundary for which the data are needed. Areal interpolation is
a family of spatial data analysis methods designed to address this problem.
What is Areal Interpolation?
First, let us define a zone for which data are available as a source
zone, and a zone for which data are needed but unavailable as a target zone.
In this project, zip code areas are the source zones and the census tracts of
the DEZ are the target zones. When target zones do not coincide with source
zones, we need an areal interpolation method to obtain data for the target
zones from the source zones. Areal interpolation is thus a set of algorithmic
methods for estimating values for target zones from known values of source
zones. Although the initial applications of areal interpolation procedures
were mainly in isopleth mapping, the methods have increasingly been used to
transform data from source zone boundaries to target zone boundaries (Lam,
1983: p.138).
According to Lam (1983), areal interpolation methods fall into two
groups: non-volume-preserving methods and volume-preserving methods.
Non-volume-preserving methods take an approach based on point interpolation,
termed the "point-based areal interpolation approach". This approach is not
recommended, because it has performed poorly in practice and, in particular,
because it does not conserve the total value within each zone. That is, we
cannot exactly reconstruct the original value of each source zone from the
transformed values of the target zones (Lam, 1983: pp. 138-139).
Volume-preserving methods, called the "area-based areal interpolation
approach", overcome this shortcoming of non-volume-preserving methods because
they require no point interpolation process. Of the methods that fall within
this approach, the simplest is what Lam (1983: 139) calls the overlay method
and Flowerdew and Green (1994) call the areal weighting method. The overlay
method of areal interpolation superimposes the target zones on the source
zones. Since this project used the areal weighting method, its algorithm is
introduced here, following Lam (1983: 139-140).
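The overlay step can be illustrated with a small sketch. It assumes, purely
for illustration, that every zone is an axis-aligned rectangle given as
(xmin, ymin, xmax, ymax); real zip code areas and census tracts are irregular
polygons and would need a GIS overlay instead. The function names and the
sample coordinates are hypothetical and do not come from the project.

```python
# Sketch of the overlay step under a simplifying assumption:
# every zone is an axis-aligned rectangle (xmin, ymin, xmax, ymax).

def rect_area(rect):
    """Area of a rectangle; degenerate or inverted rectangles count as 0."""
    xmin, ymin, xmax, ymax = rect
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)

def intersection_area(a, b):
    """Area shared by rectangles a and b (0 if they do not overlap)."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]),
               min(a[2], b[2]), min(a[3], b[3]))
    return rect_area(overlap)

def overlay_matrix(targets, sources):
    """Rows are target zones, columns are source zones."""
    return [[intersection_area(t, s) for s in sources] for t in targets]

# Two hypothetical source zones side by side, three target zones across them.
sources = [(0, 0, 4, 4), (4, 0, 8, 4)]
targets = [(0, 0, 2, 4), (2, 0, 6, 4), (6, 0, 8, 4)]

A = overlay_matrix(targets, sources)
for row in A:
    print(row)
```

The middle target zone straddles both source zones, so its row has two
non-zero intersection areas; the other two rows have one each.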
Algorithms of Areal Interpolation
Let A be an m-by-n matrix of areas (m target zones by n source zones),
whose element $a_{ts}$ is the area of intersection of target zone t and
source zone s. Also, let V be a column vector of target zone values of length
m, and U a column vector of source zone values of length n. With this
notation, there are three formulas for the value of a target zone t, denoted
by $V_t$.
(1) $V_t = \sum_s U_s \, a_{ts} / \sigma_s$, when the data are absolute
figures or counts. This is the same as the calculation for an extensive
variable in Flowerdew and Green (1994: 124). In this equation, $\sigma_s$ and
$a_{ts}$ denote respectively the area of source zone s and the area of
intersection of target zone t and source zone s. In matrix notation the
equation is V = WU, where W is a weight matrix with elements
$a_{ts} / \sigma_s$.
(2) $V_t = \sum_s U_s \, a_{ts} / \tau_t$, when the data are densities such
as population density. In this equation, $\tau_t$ denotes the area of target
zone t. This is the same as the calculation for an intensive variable in
Flowerdew and Green (1994: 124).
(3) $V_t = \dfrac{\sum_s N_s \, a_{ts} / \sigma_s}{\sum_s D_s \, a_{ts} / \sigma_s}$,
when the data are ratios or proportions such as the percentage of males in
the population. Here $N_s$ and $D_s$ are the numerator and denominator counts
of source zone s, so the equation simply takes the ratio of two absolute
figures, each estimated with equation (1).
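The three cases above can be sketched in a few lines of plain Python. This
is a minimal illustration under stated assumptions, not the procedure run in
the project: the intersection areas, zone areas, and zone values below are
made-up numbers, and the function and variable names (`estimate_counts`,
`source_areas`, `target_areas`, and so on) are hypothetical.

```python
# Areal weighting for the three data types. A[t][s] is the intersection
# area of target zone t and source zone s. All numbers are invented.

def estimate_counts(A, U, source_areas):
    """Case (1), counts: weight each source value by a_ts / source area."""
    return [sum(U[s] * row[s] / source_areas[s] for s in range(len(U)))
            for row in A]

def estimate_density(A, D, target_areas):
    """Case (2), densities: weight each source density by a_ts / target area."""
    return [sum(D[s] * row[s] for s in range(len(D))) / target_areas[t]
            for t, row in enumerate(A)]

def estimate_ratio(A, numer, denom, source_areas):
    """Case (3), ratios: divide two count estimates obtained as in case (1)."""
    top = estimate_counts(A, numer, source_areas)
    bottom = estimate_counts(A, denom, source_areas)
    return [n / d for n, d in zip(top, bottom)]

A = [[8.0, 0.0],   # target 0 lies entirely in source 0
     [8.0, 8.0],   # target 1 straddles both sources
     [0.0, 8.0]]   # target 2 lies entirely in source 1
source_areas = [16.0, 16.0]
target_areas = [8.0, 16.0, 8.0]

jobs = [100.0, 60.0]                  # counts per source zone
V_counts = estimate_counts(A, jobs, source_areas)

density = [100.0 / 16.0, 60.0 / 16.0]  # counts per unit area
V_density = estimate_density(A, density, target_areas)

males, population = [40.0, 30.0], [100.0, 60.0]
V_ratio = estimate_ratio(A, males, population, source_areas)

print(V_counts, V_density, V_ratio)
```

Because the three target zones in this toy example exactly cover the two
source zones, the estimated counts add up to the source total of 160.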
Of these algorithms, this project used equation (1) to obtain the DEZ’s
annual employment and establishment data. The target zones are the 48 census
tracts of the DEZ and the source zones are 17 zip code areas, as can be seen
in Map 3.
Since this computation is based on the share of each source zone’s area
that falls within a target zone, Flowerdew and Green (1994) refer to it as
the areal weighting method.
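Why areal weighting of counts is volume-preserving can be seen in a short
derivation. It writes the area of source zone s as $\sigma_s$ and the
intersection area of target zone t and source zone s as $a_{ts}$, and it
assumes that the target zones jointly cover every source zone completely.
Where the target zones cover only part of a source zone, only the
corresponding share of that zone's value is carried over.

```latex
\sum_t V_t
  = \sum_t \sum_s \frac{U_s \, a_{ts}}{\sigma_s}
  = \sum_s \frac{U_s}{\sigma_s} \sum_t a_{ts}
  = \sum_s U_s
\qquad \text{if } \sum_t a_{ts} = \sigma_s \text{ for every } s .
```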
Most GIS software provides this overlay/areal weighting method for
areal interpolation. For example, in ArcView, perform the following process:
Dowall (1996) also introduces areal interpolation by the overlay/areal
weighting method in estimating employment data for California’s enterprise
zones.
Limitations of Areal Interpolation
Although areal interpolation by areal weighting is simple and
convenient, its major problem is that it assumes the data are evenly
distributed throughout each source zone. In reality this assumption rarely
holds; the population of a city, for example, is usually not evenly
distributed across the city’s area.
One alternative that addresses this shortcoming is areal interpolation
using ancillary data, introduced by Flowerdew and Green (1994). If we have
relevant ancillary information about the uneven distribution within source
zones, we can use it to make more realistic estimates for target zones by
dividing the area of the source zones into two categories. We assign the
value of the source zone (or an appropriate proportion of it) to one category
and a value of zero to the other. As a simple example, if we know where the
lakes in a city are, we can assign zero population to the lake areas even
though their share of the area is not zero. This project used this method,
since some census tracts include rivers or lakes.
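The lake example can be sketched as a binary mask on the areas used for
weighting. This is only an illustration of that simple two-category case,
not Flowerdew and Green's full procedure and not the exact computation used
in the project. It assumes the excluded (water) area inside each intersection
is known and that the target zones cover each source zone completely; the
function name `masked_area_weighting` and all numbers are hypothetical.

```python
# Binary-mask variant of areal weighting: excluded area (e.g. water)
# receives zero, and counts are spread over the remaining area only.

def masked_area_weighting(A, excluded, U):
    """A[t][s]: intersection areas; excluded[t][s]: excluded area inside each
    intersection; U[s]: source zone counts. Returns target zone estimates."""
    usable = [[a - x for a, x in zip(row_a, row_x)]
              for row_a, row_x in zip(A, excluded)]
    # Usable area of each source zone, assuming targets cover it completely.
    usable_per_source = [sum(col) for col in zip(*usable)]
    estimates = []
    for row in usable:
        total = 0.0
        for s, area in enumerate(row):
            if usable_per_source[s] > 0:
                total += U[s] * area / usable_per_source[s]
        estimates.append(total)
    return estimates

A = [[8.0], [8.0]]        # one source zone split evenly between two targets
water = [[4.0], [0.0]]    # half of the first intersection is a lake
U = [120.0]

no_mask = masked_area_weighting(A, [[0.0], [0.0]], U)
with_mask = masked_area_weighting(A, water, U)
print(no_mask, with_mask)
```

Without the mask both target zones receive 60; with it, the zone containing
the lake receives 40 and the other 80, while the total of 120 is unchanged.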
Another alternative is the pycnophylactic interpolation method,
originally introduced by Tobler for isopleth mapping (Lam, 1983: 140). This
method takes the effect of neighboring source zones into account when
assigning values within a source zone. Lam outlines the basic procedure as
follows (Lam: 148-149). First, divide the source zones into small grid cells
and assign the mean value of each source zone to each of its cells; then
compute a new value for each cell by considering the values of its
neighboring cells. Iterate this process until the grid values no longer
change significantly from the previous iteration. Second, aggregate the grid
cells within each target zone boundary and sum their values. This method
assumes the existence of a smooth density function that takes into account
the effect of adjacent source zones. Kennedy and Tobler (1983) show a way to
estimate missing values for any zone from the density values of adjacent
zones, based on a Dirichlet condition.
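The grid procedure can be sketched as follows. This is a simplified
illustration, not Tobler's or Lam's exact algorithm: it assumes a
rectangular grid, smooths each cell with the plain mean of itself and its
four edge neighbors, and restores each source zone's total by multiplicative
rescaling after every pass. The function names, the tolerance, and the tiny
one-row example are all hypothetical.

```python
# Simplified pycnophylactic-style smoothing on a grid, followed by
# aggregation of the cells into target zones.

def pycnophylactic(zone_of, zone_totals, tol=1e-6, max_iter=500):
    """zone_of[r][c]: source zone id of each cell; zone_totals: id -> count."""
    rows, cols = len(zone_of), len(zone_of[0])
    members = {}
    for r in range(rows):
        for c in range(cols):
            members.setdefault(zone_of[r][c], []).append((r, c))
    # Start by spreading each zone's total evenly over its cells.
    grid = [[zone_totals[zone_of[r][c]] / len(members[zone_of[r][c]])
             for c in range(cols)] for r in range(rows)]
    for _ in range(max_iter):
        new = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                near = [(r, c), (r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
                vals = [grid[i][j] for i, j in near
                        if 0 <= i < rows and 0 <= j < cols]
                new[r][c] = sum(vals) / len(vals)
        # Rescale so every source zone keeps its original total.
        for z, cells in members.items():
            s = sum(new[i][j] for i, j in cells)
            factor = zone_totals[z] / s if s > 0 else 0.0
            for i, j in cells:
                new[i][j] *= factor
        change = max(abs(new[r][c] - grid[r][c])
                     for r in range(rows) for c in range(cols))
        grid = new
        if change < tol:
            break
    return grid

def aggregate(grid, target_of):
    """Sum cell values within each target zone."""
    totals = {}
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            totals[target_of[r][c]] = totals.get(target_of[r][c], 0.0) + value
    return totals

zone_of = [[0, 0, 1, 1]]               # two source zones, two cells each
zone_totals = {0: 40.0, 1: 8.0}
target_of = [["a", "b", "b", "c"]]     # three target zones

grid = pycnophylactic(zone_of, zone_totals)
V = aggregate(grid, target_of)
print(grid, V)
```

In this example the cells next to the shared boundary move toward each
other's values, while each source zone still sums to its original total.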
As the discussion so far shows, the assumption behind the overlay/areal
weighting method is rarely fully satisfied. However, the alternative methods
are not perfect ways to estimate data for target zones either, and, unlike
the overlay method, they are not always backed by relevant ancillary
information or by strong support in existing GIS tools.
In particular, the annual employment data obtained through this project
will be used to examine annual changes in values within the same area (such
as a census tract). Dowall (1996) argues that his approach is valid and
reasonable because his emphasis is on changes in the employment of enterprise
zones over time; by the same reasoning, there is no strong reason to reject
the assumption of even distribution in my overlay method.
References
Arlinghaus, Sandra. 1999. Course Homepage, NRE 530, Geography: Spatial
Analysis, Theory and Practice. http://www.csfnet.org/530
Chung, Chae Gun and Yalin Chao. 2000. Employment for Detroit Empowerment
Zone, 1994 ~ 1996. Course Project for UP 507 Geographic Information Systems,
April.
Clarke, Keith C. 1999. Getting Started With Geographic Information
Systems. Upper Saddle River: Prentice-Hall.
Dowall, David E. 1996. An Evaluation of California’s Enterprise Zone
Program. Economic Development Quarterly, Vol. 10, No. 4, November, 352-368.
Flowerdew, Robin and Mick Green. 1994. Areal Interpolation and Types of
Data. In Spatial Analysis and GIS edited by Stewart Fotheringham and Peter
Rogerson. London, Bristol: Taylor & Francis Ltd.
Kennedy, Susan and Waldo R. Tobler. 1983. Geographic Interpolation.
Geographical Analysis, Vol. 15, No. 2, April.
Lam, Nina Siu-Ngan. 1983. Spatial Interpolation Methods: A Review. The
American Cartographer, Vol. 10, No. 2, 129-149.
Environmental Systems Research Institute, Inc. 1997. Understanding GIS:
The ARC/INFO Method. Redlands, California: Environmental Systems Research
Institute, Inc.