Martin and Riener Technical Services
Oregon State University
When Claude Elwood Shannon effectively laid the foundations of information theory in his mathematical theory of communication (Shannon, 1948), he showed how something as seemingly intangible as information and its transmission could be studied quantitatively. As is often the case when new insight is gained, the theory was expressed formally in the language of mathematics, but its understanding was motivated by analogy to other fields of study. For example, the concept of "entropy", usually associated with disorder or the direction in which energy tends to flow, in the contexts of statistical mechanics or thermodynamics, was adopted to refer to information, when information is treated in terms of probability.
Geographic Information Science and cartography are concerned with the communication of geographic information, and Tobler (1997) has commented on the relevance of information theory to cartography. Interesting and fruitful analogies have been developed between problems of Geographic Information Science (including cartography) and those of other fields of science and engineering.
For example, Johnson (2004) recently wrote on the analogy between physical diffusion models and the generation of cartograms, referring to the work of University of Michigan physicist Mark Newman on efficient computer algorithms to generate maps in which the areas of entities (such as states) are transformed to represent relative attributes of those entities (such as number of electoral votes). In his review of computer cartograms, Tobler (2004) notes that Rushton (1971) made the analogy between cartograms and the physical model of a rubber sheet map with inked dots whose number and placement represented, again, some attribute such as number of electoral votes. In this analogy, stretching the rubber map to uniform dot density then results in a cartogram.
Newman, viewing diffusion (such as that of dopant atoms within a silicon chip during semiconductor manufacture) likewise as a process of density equalization, perceived that the same theory could be applied to the computer generation of cartograms. So cross-discipline analogy often leads to improved techniques and better understanding.
Following Shannon's work, the terms "entropy" and "energy" are used commonly in the contexts of information science and signal processing. This leads to the suggestion of another term of analogy, namely that of "information impedance matching".
In electrical (power) and electronics (signal) engineering, "impedance" is defined as the ratio of voltage to current, and the term might be understood intuitively as a relation between how much is offered and how much flows (and to what degree the flow is in sync with the force). Already this hints at communication.
Impedance has meaning in other areas of wave propagation, too: for example, in optics the "index of refraction" of a non-magnetic medium is a ratio of wave impedances; the impedance of glass differs from that of air (light travels more slowly in glass), so there is refraction, making lenses possible. In geophysics, different rock impedances result in different speeds of propagation of seismic waves, and there is reflection of these waves at impedance discontinuities, making possible subsurface profiling using seismic waves.
As suggested above, impedance discontinuities result in a sort of "bounce", and prevent an unimpeded flow of energy. “Impedance matching” is a design process to achieve the most effective transfer of energy from one part of a circuit to another part by matching the parts, with certain characteristics, in a complementary way: the impedances should be complex conjugates at the operating frequency, that is, the resistances should be equal and the capacitive reactance of one should equal the inductive reactance of the other. For example, to produce as much light as possible with a certain battery and an incandescent bulb, use a bulb not of extremely high resistance nor of extremely low resistance, but of resistance equal to the battery’s (internal) resistance. Match the filament impedance to the battery impedance, and you will maximize the flow of energy to the filament.
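The battery-and-bulb example can be checked numerically. The following is a minimal sketch, assuming an ideal voltage source in series with a purely resistive internal resistance; the function name `load_power` and the values of 1.5 V and 2 ohms are illustrative assumptions, not figures from the text.

```python
def load_power(v_source: float, r_source: float, r_load: float) -> float:
    """Power (watts) dissipated in a resistive load fed by a source
    with internal resistance r_source (both in ohms)."""
    current = v_source / (r_source + r_load)
    return current ** 2 * r_load


# Hypothetical battery: 1.5 V with 2 ohms of internal resistance.
V_SOURCE, R_SOURCE = 1.5, 2.0

# Try filament resistances from 0.1 to 10.0 ohms in 0.1 ohm steps.
candidates = [step / 10 for step in range(1, 101)]
best_load = max(candidates, key=lambda r: load_power(V_SOURCE, R_SOURCE, r))

print(f"best filament resistance: {best_load} ohms")
```

Under these assumptions the sweep selects the candidate equal to the source resistance; both much smaller and much larger filament resistances deliver less power.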
Similarly, in the communication of information the best transfer of information occurs when the data are displayed at the resolution indicated by the source: one pixel to one pixel. This is not merely a matter of finding the most powerful way to represent information, but rather of optimizing the compatibility of the information representation used by the source, and the information representation used by the receiver. Thus, in loose analogy to the principle of electrical impedance matching, the principle of information impedance matching suggests itself, to describe the optimal matching of an information source and an information receiver. This analogy seems appropriate because the fields of information theory and signal processing implicitly recognize the equivalence of energy and information when speaking of “energy compaction” in transform coding for purposes of compression and similar ideas (Goyal, 2001). Geographic data, as spatial or temporal distributions of values, can be regarded as cases of “signals”, carriers of information to be detected by those who use maps or other displays of geographic information. Thus, the application of ideas from signal processing and information and communication theory is justified, and may provide practical insight for spatial data communication in geographic information science.
The material that follows offers some reflections on geographic information impedance matching derived from the electrical/electronics analogy noted above. To apply impedance matching ideas to the communication of geographic information, one might first inquire as to the likely location and quantity of information that can be carried by a geographic data set, in a given representation, such as an image or an electronic file. Representations are not absolute. They should not be confused with that which they represent, notwithstanding Wittgenstein’s (1922) claim that a symbol must have something in common with that which it represents. Representations might closely resemble reality; they are, however, mere models or symbols—not reality, and can be transformed without loss of information.
The squared norm of a signal, as the sum of the squares of all its component values, is a measure of its deviation from the origin of function space, and is called its energy. “Energy compaction” refers to use of a transform to arrive at a coordinate system in which the location of the signal in function space can be approximated by a few orthogonal components in the new coordinate system. Then the projection of the signal onto those components’ axes contains most of the signal’s energy. It can be argued that this coordinate system allows a more “natural” representation of the signal, at least for the purposes of information transmission and storage. The sum of mutually-exclusive projection energies should equal the full-dimension energy.
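Both properties described here, conservation of energy under an orthonormal transform and concentration of that energy in a few components, can be illustrated with a small example. This is a sketch only: the discrete cosine transform is chosen as one common compacting transform, and the 16-sample ramp and the names `dct_orthonormal` and `energy` are assumptions made for illustration.

```python
import math


def dct_orthonormal(signal):
    """Orthonormal DCT-II of a list of numbers (direct O(N^2) form)."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, x in enumerate(signal)
        )
        coeffs.append(scale * total)
    return coeffs


def energy(values):
    """Squared norm: the sum of the squares of all components."""
    return sum(v * v for v in values)


# Hypothetical smooth signal: a 16-sample ramp.
signal = [float(i) for i in range(16)]
coeffs = dct_orthonormal(signal)

# An orthonormal transform only rotates the coordinate system,
# so the total energy is the same in both representations.
total_energy = energy(signal)
transform_energy = energy(coeffs)

# Share of the energy captured by the two largest-magnitude coefficients.
largest_two = sorted(coeffs, key=abs, reverse=True)[:2]
compaction = energy(largest_two) / total_energy

print(f"signal energy:    {total_energy:.3f}")
print(f"transform energy: {transform_energy:.3f}")
print(f"share held by 2 of 16 coefficients: {compaction:.4f}")
```

For this smooth signal, two of the sixteen transform coefficients carry nearly all of the energy, whereas in the original sample coordinates the energy is spread over most of the components.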
Energy is always defined in relation to a frame of reference, such as a coordinate system, and the same is true of information. This is not the distinction between information and meaning or truth noted by many (e.g. Tobler, 1997), but rather the distinction between that part of a signal that is certain, given the context, and that part which "comes as a surprise", as a variance, and truly informs. If all signals from a source have a bias, we may ignore the bias without losing information. We may tell the source not to send the bias; we will simply adjust the origin of our coordinate system to reproduce it.
Information is the figure perceived against the ground, even if the location of meaning can be argued (Hofstadter, 1979). The ground reference is the context of the signal, and if the source and receiver agree on this common ground, then just the figure can be transmitted, for that is the carrier of information. In this case, source and receiver are well matched. If this is not the case, the ratio of information flow to data flow becomes small. Awareness of these ideas might prevent errors in geographic information science.
Cartographers are concerned with the effective transfer of spatial information, which depends on attention to information impedance matching in data collection, data conversion, and data representation. The risks of neglecting information impedance matching are information loss, pseudo-information generation, and loss of efficiency.
In data collection, one rule is not to record more apparent significant digits of a numerical measurement than are justified by the precision of the instrument (or by other practical or theoretical considerations of maximum possible precision). Ignorance of this rule results in an information impedance mismatch insofar as much of the flow of numbers conveyed is overburden, not representing information.
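As an illustration of this rule, a recorded value can be rounded to the instrument's resolution before it is stored. The sketch below is one possible approach, not a prescribed procedure; the function name `record_measurement`, the sample reading, and the restriction of the resolution to 1 or a negative power of ten are assumptions.

```python
from decimal import Decimal


def record_measurement(reading: float, resolution: str) -> str:
    """Round a reading to the instrument's resolution.

    `resolution` is the smallest increment the instrument resolves,
    given as a decimal string such as "0.01" (assumed to be 1 or a
    negative power of ten).
    """
    step = Decimal(resolution)
    return str(Decimal(repr(reading)).quantize(step))


# Hypothetical reading as reported by logging software.
raw = 12.3456789
print(record_measurement(raw, "0.01"))
```

The digits beyond the second decimal place carry no information about the measured quantity, so dropping them shortens the record without loss.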
In data conversion, any sort of re-sampling or re-projection of data likely constitutes an information impedance mismatch. Re-projection generally entails interpolation of a regular grid, where original data are discarded and new data are created. It is inevitable that information will be lost and pseudo information created in this process; the severity of the mismatch is of the most interest. Different spatial patterns will be affected in different ways by such mismatches. For example, even if a grid of data is simply “reprojected” to a grid of coarser resolution, it may in some cases be preferable simply to subsample; in other cases it may be preferable to use a convolution filter, to minimize discontinuities or to maintain subband definition for purposes of scale analysis. In the familiar case of image size reduction, subsampling might preserve “sharpness” of certain features, while resampling with a convolution filter may better display continuity of areas and of edges not aligned with the grid axes.
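The trade-off between subsampling and filtering can be seen in one dimension. The following sketch assumes a reduction by a factor of two and uses a simple box (block-average) filter as the convolution filter; the two sample rows and the function names are hypothetical.

```python
def subsample(values, factor):
    """Keep every `factor`-th sample and discard the rest."""
    return values[::factor]


def box_downsample(values, factor):
    """Average each block of `factor` samples (a box convolution
    filter followed by subsampling)."""
    return [
        sum(values[i:i + factor]) / factor
        for i in range(0, len(values) - factor + 1, factor)
    ]


# Hypothetical rows of a raster: a fine texture and a sharp edge.
texture = [0, 1, 0, 1, 0, 1, 0, 1]
edge = [0, 0, 0, 1, 1, 1, 1, 1]

texture_sub = subsample(texture, 2)       # texture lost, mean biased to 0
texture_box = box_downsample(texture, 2)  # texture lost, mean preserved
edge_sub = subsample(edge, 2)             # edge stays sharp
edge_box = box_downsample(edge, 2)        # edge smeared over one cell

print(texture_sub, texture_box)
print(edge_sub, edge_box)
```

Neither result is the original data: subsampling keeps the edge crisp but misreports the textured row as uniformly zero, while the box filter reports the correct average of the textured row at the cost of an intermediate value at the edge.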
A form of pseudo-information that might arise in resampling is aliasing. Kimerling (2000) reported on the Moire-like patterns apparent in data quality maps of resampled equal-angle grids. Information impedance mismatching can produce similar patterns (or similarly-caused patterns) in the presentation of the data itself, which should be of considerable concern to those who prepare and analyze spatial information.
Figure 1 is a pair of images of the 256 x 256 discrete cosine transform (DCT) matrix. The image on the right was resized twice, which is expected to produce subsampling discontinuities. Displayed properly, the images would reveal hyperbolic bands bending toward the upper left corner, and a fainter hyperbolic cross at 2/3 across and 2/3 down from upper left. Any other variations seen are aliasing artifacts that result from information impedance mismatching. When viewing this document on a computer screen, try changing the magnification; the patterns should change.
Figure 1. 256² DCT matrix. Reduced and enlarged image on right.
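The aliasing described for the figure can be reproduced on a single row of the matrix. This is a sketch under stated assumptions: it uses the unnormalized DCT-II definition, picks row 250 arbitrarily as a rapidly oscillating row, and reduces it by simply dropping three of every four samples, which is not necessarily how the published image was resized.

```python
import math

N = 256  # size of the DCT matrix discussed in the text


def dct_row(k, n_samples=N):
    """Row k of the (unnormalized) DCT-II matrix."""
    return [
        math.cos(math.pi * (2 * n + 1) * k / (2 * n_samples))
        for n in range(n_samples)
    ]


def sign_changes(values):
    """Count sign changes between neighbouring samples."""
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)


row = dct_row(250)   # a rapidly oscillating row near the bottom
reduced = row[::4]   # naive 4x reduction by dropping samples

full_count = sign_changes(row)
reduced_count = sign_changes(reduced)
print(full_count, reduced_count)
```

The full row changes sign 250 times, while the reduced row shows only a few slow oscillations: a coarse pattern that exists in the representation but not in the data.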
These DCT matrix representations have undergone several conversions, including the conversion of 32-bit floating point numbers to 8-bit integers as well as conversions involving resizing, transformation of raster to vector data and of vector to raster data. There may be the illusion that we have a picture of the original matrix, whereas what we really have may be a picture of a representation of a conversion of... an original representation of the matrix itself.
Because of the profusion of electronically-manipulated spatial data and the demands to reformat data sets for compatibility, it is incumbent upon those who work in geographic information science to be consciously aware of the distinction between the signal and its representation, and to minimize conversion and representation mismatches.
Loss of efficiency in information transmission matters because when efficiency drops, so does communication —consider for example a cluttered map, or a web page that is slow to load. Any representation of information is a sort of symbol. The key to avoiding information impedance mismatch is to have a sense of what the essential information components of a set of data are, and to employ representations that encode information in similar terms (and in this respect we are in accord with Wittgenstein).
All representations are models, or symbols. The customary way to represent certain geographic data may not be the most “natural” choice. For example, graphs comprising lines and plots are not efficiently represented by the JPEG (Joint Photographic Experts Group) image format, whose components are smooth waves; such information would be represented more efficiently in vector format, such as EPS (Encapsulated PostScript), or if they must be in raster format for compatibility, then GIF (Graphics Interchange Format), PNG (Portable Network Graphics) or compressed TIFF (Tagged Image File Format) might be a better choice, since the ratio of display quality to file size would be much higher.
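A rough sense of why lossless raster formats suit line graphics can be had with the DEFLATE algorithm, which PNG uses for its image data. The sketch below is not a measurement of PNG or JPEG themselves; it only compresses a hypothetical 256 x 256 raster containing a single diagonal line using Python's `zlib` module.

```python
import zlib

SIZE = 256  # hypothetical raster: SIZE x SIZE pixels, 8 bits each

# White background (255) with a single black diagonal line (0),
# standing in for a simple line graph.
pixels = bytearray([255]) * (SIZE * SIZE)
for i in range(SIZE):
    pixels[i * SIZE + i] = 0

raw_size = len(pixels)
packed = zlib.compress(bytes(pixels), 9)
packed_size = len(packed)

# Lossless: the line is recovered exactly after decompression.
restored = zlib.decompress(packed)

print(f"raw: {raw_size} bytes, DEFLATE: {packed_size} bytes")
```

Because the image consists of long uniform runs interrupted by isolated line pixels, a run-oriented lossless coder represents it in a small fraction of the raw size and reproduces the line exactly.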
Information impedance matching applies to non-spatial data as well. Common examples of information impedance mismatches that result in loss of efficiency are the conversion of documents from one format to another, and the conversion of text to image. In the parlance of signal analysis, the latter is a projection from one signal space to a much higher-dimensional signal space. Conversely, when numerical data (such as images) are encoded as ASCII (American Standard Code for Information Interchange) characters, as they are for email, a double conversion has taken place, with resulting loss of efficiency. A case in point is the passage of information electronically over the Internet, as in the case of a map server or other geographic data server. The price paid for a poor choice of data format is slow transfer of data. Data are not necessarily information, and it is only information that really needs to be transmitted.
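The cost of carrying binary data as ASCII text can be quantified. The text does not name an encoding; the sketch below assumes Base64, the encoding commonly used for binary email attachments, and a hypothetical 3000-byte payload.

```python
import base64

# Hypothetical binary payload, e.g. 3000 bytes of image data.
payload = bytes(i % 256 for i in range(3000))

# Base64 maps every 3 bytes of input to 4 ASCII characters.
encoded = base64.b64encode(payload)
overhead = len(encoded) / len(payload)

# The receiver must convert back before the data can be used.
decoded = base64.b64decode(encoded)

print(f"{len(payload)} bytes -> {len(encoded)} ASCII characters")
print(f"size ratio: {overhead:.3f}")
```

The same information occupies one third more data, before any line breaks or headers added by the mail system are counted.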
Information impedance matching can be summarized as facilitating information flow by making appropriate joints and transmission lines between information source and information receiver, employing transformation where appropriate, but avoiding it otherwise. Modular thinking cannot be discarded, but geographic information science practitioners must take the responsibility to understand the so-called “transparent” processes, such as data conversion or reformatting, that affect their geographic information and its effective communication.
Hofstadter, D., 1979. The location of meaning. In Gödel, Escher, Bach: an Eternal Golden Braid, chapter 6, pages 158-180. Basic Books, New York.
Johnson, R.C., 2004. Chip diffusion modeling yields better maps. EE Times, May 27, 2004 (1:00 PM EDT).
Rushton, G., 1971. Map transformations of point patterns: Central place patterns in areas of variable population density. Papers and Proceedings, Regional Science Association, 28: 111-129.
Shannon, C., 1948. A mathematical theory of communication. Bell System Technical Journal, 27: 379-423 and 623-656.
Tobler, W., 1997. Introductory comments on information theory and cartography. Cartographic Perspectives, 27: 4-7.
Tobler, W., 2004. Thirty five years of computer cartograms. Annals, Association of American Geographers, 94(1): 58-73.
Wittgenstein, L., 1922. Tractatus Logico-philosophicus. Harcourt Brace & Co., New York.