
Areal Interpolation: Acquiring Employment and Business Establishment Data Using GIS

We need annual employment and business establishment data by census tract for the Detroit Empowerment Zone (DEZ), but these data are not published by census tract. The U.S. Department of Commerce does, however, issue employment and establishment data annually by zip code, in a series called Zip Code Business Patterns. The problem is therefore how to obtain data when the boundaries of the zones for which data exist do not coincide in space with the boundaries of the zones for which data are needed. Areal interpolation is a family of spatial data analysis methods that addresses this problem.

What is Areal Interpolation?

First, let us define a zone with known data as a source zone and a zone lacking the needed data as a target zone. In this project, zip code areas are the source zones and the census tracts of the DEZ are the target zones. When target zones do not coincide with source zones, areal interpolation is needed to derive data for the target zones from the source zones. Areal interpolation is thus a set of algorithmic methods for estimating target zone values from known source zone values. Although areal interpolation procedures were first applied mainly to isopleth mapping, they have increasingly been used to transform data from source zone boundaries to target zone boundaries (Lam, 1983: 138).

According to Lam (1983), areal interpolation methods fall into two groups: non-volume-preserving methods and volume-preserving methods. Non-volume-preserving methods rely on point interpolation and are termed the "point-based areal interpolation approach". This approach is considered weak, both because it has a poor record in practice and, in particular, because it does not conserve the total value within each zone. That is, the original value of each source zone cannot be reconstructed exactly from the transformed values of the target zones (Lam, 1983: 138-139).

Volume-preserving methods, termed the "area-based areal interpolation approach", overcome this shortcoming because they require no point interpolation step. Of the methods that fall within this approach, the simplest is what Lam (1983: 139) calls the overlay method and Flowerdew and Green (1994) call the areal weighting method. The overlay method superimposes the target zones on the source zones. Since this project used the areal weighting method, its algorithm is introduced below, following Lam (1983: 139-140).

Algorithms of Areal Interpolation

Let A be an m by n matrix (m target zones by n source zones) whose element $a_{ts}$ is the area of intersection of target zone t and source zone s. Also, let V be a column vector of target zone values of length m, and U a column vector of source zone values of length n. With this notation, there are three equations for the value of a target zone t, denoted by $V_t$.

(1) $V_t = \sum_s U_s a_{ts} / \sigma_s$, when the data are absolute figures or counts. This is the same as the calculation for an extensive variable in Flowerdew and Green (1994: 124). In this equation, $\sigma_s$ and $a_{ts}$ denote respectively the area of source zone s and the area of intersection of target zone t and source zone s. In matrix notation the equation is V = WU, where W is a weight matrix with elements $a_{ts} / \sigma_s$.

(2) $V_t = \sum_s U_s a_{ts} / \tau_t$, when the data are densities such as population density. In this equation, $\tau_t$ denotes the area of target zone t. This is the same as the calculation for an intensive variable in Flowerdew and Green (1994: 124).

(3) $V_t = \left( \sum_s U_s a_{ts} / \sigma_s \right) / \left( \sum_s U'_s a_{ts} / \sigma_s \right)$, when the data are ratios or proportions such as the percentage of males in the population. Here U and U' are the source zone counts that form the numerator and denominator of the ratio, so the equation simply compares two absolute figures obtained from equation (1).
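The three cases can be illustrated with a minimal Python sketch. It assumes that counts are allocated in proportion to the intersection area divided by the source zone area, that densities are averaged using the intersection area divided by the target zone area, and that a ratio is the quotient of two interpolated counts. The intersection areas and values are hypothetical, and the function names are illustrative only. The sketch also assumes that the target zones cover the source zones completely, so zone areas can be taken from the row and column sums of the matrix.

```python
# Hypothetical intersection areas a[t][s]: rows are target zones, columns are source zones.
a = [
    [2.0, 0.0],  # target 0 overlaps only source 0
    [2.0, 1.0],  # target 1 overlaps both source zones
    [0.0, 3.0],  # target 2 overlaps only source 1
]


def source_areas(a):
    """Area of each source zone, assuming the targets cover it completely."""
    return [sum(row[s] for row in a) for s in range(len(a[0]))]


def target_areas(a):
    """Area of each target zone, assuming the sources cover it completely."""
    return [sum(row) for row in a]


def interpolate_counts(a, u):
    """Case (1): share each source count out by intersection area / source area."""
    sigma = source_areas(a)
    return [sum(u[s] * row[s] / sigma[s] for s in range(len(u))) for row in a]


def interpolate_density(a, u):
    """Case (2): area-weighted mean of source densities within each target zone."""
    tau = target_areas(a)
    return [
        sum(u[s] * row[s] for s in range(len(u))) / tau[t]
        for t, row in enumerate(a)
    ]


def interpolate_ratio(a, numerator, denominator):
    """Case (3): quotient of two counts, each interpolated as in case (1)."""
    top = interpolate_counts(a, numerator)
    bottom = interpolate_counts(a, denominator)
    return [x / y for x, y in zip(top, bottom)]


jobs = [100.0, 80.0]  # hypothetical counts per source zone
print(interpolate_counts(a, jobs))  # [50.0, 70.0, 60.0]
```

The interpolated counts sum to 180, the same as the source total, which is the volume-preserving property discussed earlier.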

Based on these algorithms, this project used equation (1) to obtain the DEZ’s annual employment and establishment data. The target zones are the 48 census tracts of the DEZ and the source zones are 17 zip code areas, as can be seen in Map 3.

Because this computation is based on the share of each source zone's area that falls within a target zone, Flowerdew and Green (1994) refer to it as the areal weighting method.

GIS and Areal Interpolation

Most GIS software supports this overlay/areal weighting method of areal interpolation. For example, in ArcView, proceed as follows:

1. Split the zip code boundary layer shown in Map 3. Go to Theme | Start Editing, and set Split Rule to "Portion" under Theme | Properties | Editing. Click the split polygon button in the toolbar, then split the zip code boundaries along the census tract lines, as can be seen in Map 4. The result is 110 newly created polygons carrying employment and establishment data, shown in Map 5.
2. Perform a union to sum the data by census tract. For this, set Union Rule to "Add" under Theme | Properties | Editing, as can be seen in Map 6.
3. The result is Map 7, which shows annual employment and establishment data by census tract of the DEZ.
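The logic of the split and union steps can be mimicked outside a GIS. The following Python sketch is not ArcView code: it uses axis-aligned rectangles as stand-ins for polygons, and every zone name, coordinate, and value is hypothetical. The split step shares each zip code value among its pieces in proportion to area (the "Portion" rule), and the union step sums the pieces that fall in the same tract (the "Add" rule).

```python
def rect_area(rect):
    """Area of a rectangle given as (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def intersect(r1, r2):
    """Overlapping rectangle of r1 and r2, or None if they do not overlap."""
    xmin, ymin = max(r1[0], r2[0]), max(r1[1], r2[1])
    xmax, ymax = min(r1[2], r2[2]), min(r1[3], r2[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)


# Hypothetical source zones: name -> (rectangle, employment count).
zip_zones = {
    "zip_A": ((0.0, 0.0, 4.0, 2.0), 200.0),
    "zip_B": ((4.0, 0.0, 8.0, 2.0), 120.0),
}
# Hypothetical target zones: name -> rectangle.
tracts = {
    "tract_1": (0.0, 0.0, 3.0, 2.0),
    "tract_2": (3.0, 0.0, 6.0, 2.0),
    "tract_3": (6.0, 0.0, 8.0, 2.0),
}


def split(zip_zones, tracts):
    """Cut each zip zone along tract boundaries; share its value by area."""
    pieces = []
    for zip_name, (zip_rect, value) in zip_zones.items():
        for tract_name, tract_rect in tracts.items():
            overlap = intersect(zip_rect, tract_rect)
            if overlap is not None:
                share = rect_area(overlap) / rect_area(zip_rect)
                pieces.append((tract_name, zip_name, value * share))
    return pieces


def union_by_tract(pieces):
    """Merge the pieces of each tract and add up their values."""
    totals = {}
    for tract_name, _zip_name, value in pieces:
        totals[tract_name] = totals.get(tract_name, 0.0) + value
    return totals


pieces = split(zip_zones, tracts)
totals = union_by_tract(pieces)
print(totals)  # {'tract_1': 150.0, 'tract_2': 110.0, 'tract_3': 60.0}
```

Here two zip zones are cut into four pieces, and the tract totals add up to 320, the sum of the two zip code values.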

Dowall (1996) also introduces areal interpolation by the overlay/areal weighting method in estimating employment data for California’s enterprise zones.

Limitations of Areal Interpolation

Although areal weighting is simple and convenient, its major problem is that it assumes the data are evenly distributed throughout each source zone. In reality this assumption rarely holds, since the population of a city, for example, is usually not evenly distributed across the city's area.

One alternative that addresses this shortcoming is areal interpolation using ancillary data, introduced by Flowerdew and Green (1994). If we have relevant ancillary information about the uneven distribution within source zones, we can use it to make more realistic estimates for the target zones by classifying the area within source zones into two categories. We assign the value of the source zone (or an appropriate proportion of it) to one category and zero to the other. As a simple example, if we know where the lakes in a city are, we can assign a population of zero to the lake areas even though their share of the area is not zero. This project used this method, since some census tracts include rivers or lakes.
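The lake example can be sketched in Python as a small change to areal weighting: water area is subtracted from every intersection before the weights are computed, so water receives none of the source value. This is only an illustration of the two-category idea under that assumption, not the procedure of Flowerdew and Green (1994). All areas, values, and function names are hypothetical.

```python
# Hypothetical areas for one source zone (column) split across two target zones (rows).
intersect_area = [[3.0], [3.0]]  # total intersection area of each target with the source
water_area = [[2.0], [0.0]]      # part of each intersection covered by a lake


def land_only(intersect_area, water_area):
    """Subtract water from every intersection so that only land carries data."""
    return [
        [total - water for total, water in zip(row_total, row_water)]
        for row_total, row_water in zip(intersect_area, water_area)
    ]


def interpolate_counts_masked(land_area, u):
    """Areal weighting over land area only; water areas are assigned zero."""
    n_sources = len(u)
    usable = [sum(row[s] for row in land_area) for s in range(n_sources)]
    result = []
    for row in land_area:
        total = 0.0
        for s in range(n_sources):
            if usable[s] > 0:  # skip a source zone that is entirely water
                total += u[s] * row[s] / usable[s]
        result.append(total)
    return result


jobs = [300.0]  # hypothetical count for the single source zone
land = land_only(intersect_area, water_area)
print(interpolate_counts_masked(land, jobs))  # [75.0, 225.0]
```

Plain areal weighting would give each target zone 150 because their intersection areas are equal. With the lake excluded, the first target zone keeps only one unit of usable land out of four and receives 75.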

Another alternative is the pycnophylactic interpolation method, originally introduced by Tobler for isopleth mapping (Lam, 1983: 140). This method takes the effect of neighboring source zones into account when assigning values within a source zone. Lam (1983: 148-149) outlines the basic procedure as follows. First, overlay a grid of small cells on the source zones and assign the mean value of each source zone to each of its cells; then compute a new value for each cell by considering the values of its neighboring cells. Iterate this process until the grid values no longer change significantly from the previous iteration. Second, aggregate the grid cells into the target zone boundaries and sum the grid values. This method assumes the existence of a smooth density function that takes into account the effect of adjacent source zones. Kennedy and Tobler (1983) show a way to estimate missing values for any zone from the density values of adjacent zones, based on the Dirichlet condition.
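The procedure can be sketched in Python on a one-dimensional strip of cells rather than a two-dimensional grid. Several choices here are assumptions made for illustration and are not taken from Lam or Tobler: the smoothing rule is a three-cell mean, the source zone totals are restored after each pass by multiplicative rescaling, and the zone labels, totals, tolerance, and iteration cap are hypothetical.

```python
def pycnophylactic_1d(zone_of_cell, zone_totals, tol=1e-6, max_iter=1000):
    """Smooth cell values iteratively while keeping every source zone total fixed."""
    n = len(zone_of_cell)
    cells_per_zone = {}
    for zone in zone_of_cell:
        cells_per_zone[zone] = cells_per_zone.get(zone, 0) + 1
    # Start with each zone total spread evenly over the cells of that zone.
    cells = [zone_totals[z] / cells_per_zone[z] for z in zone_of_cell]
    for _ in range(max_iter):
        # Smoothing pass: mean of each cell and its two neighbours.
        # Cells at the ends of the strip reuse their own value for the missing side.
        smoothed = []
        for i in range(n):
            left = cells[i - 1] if i > 0 else cells[i]
            right = cells[i + 1] if i < n - 1 else cells[i]
            smoothed.append((left + cells[i] + right) / 3.0)
        # Constraint pass: rescale so each source zone sums to its original total.
        sums = {}
        for zone, value in zip(zone_of_cell, smoothed):
            sums[zone] = sums.get(zone, 0.0) + value
        adjusted = [
            value * zone_totals[zone] / sums[zone] if sums[zone] > 0 else 0.0
            for zone, value in zip(zone_of_cell, smoothed)
        ]
        change = max(abs(new - old) for new, old in zip(adjusted, cells))
        cells = adjusted
        if change < tol:
            break
    return cells


def aggregate(cells, target_of_cell):
    """Sum the cell values that fall inside each target zone."""
    totals = {}
    for target, value in zip(target_of_cell, cells):
        totals[target] = totals.get(target, 0.0) + value
    return totals


# Hypothetical layout: eight cells, two source zones, three target zones.
zone_of_cell = ["A"] * 4 + ["B"] * 4
target_of_cell = ["T1"] * 2 + ["T2"] * 4 + ["T3"] * 2
cells = pycnophylactic_1d(zone_of_cell, {"A": 80.0, "B": 16.0})
target_totals = aggregate(cells, target_of_cell)
print(target_totals)
```

Unlike areal weighting, the cell values inside zone A fall off toward the boundary with the lower-valued zone B, while the totals for A and B are still reproduced when the cells are summed.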

As discussed above, the assumption underlying the overlay/areal weighting method is not always satisfied. However, the alternative methods are not perfect ways to estimate data for target zones either, and, unlike the overlay method, they do not always come with relevant ancillary information or strong support from existing GIS tools.

In particular, the annual employment data obtained through this project will be used to examine annual changes in values within the same area (such as a census tract). Dowall (1996) argues that his approach is valid and reasonable because his emphasis is on changes in the employment of enterprise zones over time; by the same reasoning, there is no strong reason to reject the assumption of even distribution in the overlay method used here.

References

Arlinghaus, Sandra. 1999. Course Homepage, NRE 530, Geography: Spatial Analysis, Theory and Practice. http://www.csfnet.org/530

Chung, Chae Gun and Yalin Chao. 2000. Employment for Detroit Empowerment Zone, 1994 ~ 1996. Course Project for UP 507 Geographic Information Systems, April 2000.

Clarke, Keith C. 1999. Getting Started With Geographic Information Systems. Upper Saddle River: Prentice-Hall.

Dowall, David E. 1996. An Evaluation of California’s Enterprise Zone Program. Economic Development Quarterly, Vol. 10, No. 4, November 1996, pp. 352-368.

Flowerdew, Robin and Mick Green. 1994. Areal Interpolation and Types of Data. In Spatial Analysis and GIS edited by Stewart Fotheringham and Peter Rogerson. London, Bristol: Taylor & Francis Ltd.

Kennedy, Susan and Waldo R. Tobler. 1983. Geographic Interpolation. Geographical Analysis, Vol. 15, No. 2, April 1983.

Lam, Nina Siu-Ngan. 1983. Spatial Interpolation Methods: A Review. The American Cartographer, Vol. 10, No. 2, pp. 129-149.

Environmental Systems Research Institute. 1997. Understanding GIS: The ARC/INFO Method. Redlands, California: Environmental Systems Research Institute, Inc.