Cart: Computer software for making cartograms

This page contains technical descriptions of the cart software for creating density-equalizing maps or "cartograms."

Outline of the code

The cart software comes in two parts. The first part, which is found in the files cart.c and cart.h, contains all the code necessary to calculate a cartogram using the method of Gastner and Newman (2004). This code is in the form of a library of C functions, which can be called to perform the complete cartogram calculation. The second part, found in the file main.c, is a simple example application that makes use of the library to construct a complete working program for calculating cartograms. This second part does no real calculations itself; it is merely an outer wrapper that calls functions in cart.c. Below I describe the functions in cart.c in detail, followed by a brief description of main.c.

There are also three other files included in the distribution – cart2.c, cartv.c, and cart2v.c – which are drop-in replacements for cart.c that implement the cartogram calculation in slightly different ways. The file cart2.c contains code to calculate cartograms using a 2nd-order Runge-Kutta integrator, instead of the default 4th-order one; cartv.c uses an alternative function evaluator for the velocity field; and cart2v.c does both. All of these tactics reduce the amount of memory the program uses, which is why they are included here. They may be useful for calculations of very large cartograms on machines with smaller memory capacities. However, all of them have drawbacks too, being either slower or less accurate than the code contained in cart.c. See here for details. The bottom line is, if you have the memory available to use cart.c, you should do so.

The interface for calling functions in all versions of the code is identical. It is described in detail below. Programs written to work with one version can be compiled without change against any other version and will behave appropriately. The overall cartograms calculated in each case should be identical, or very nearly the identical, allowing for the small inaccuracies generated by the 2nd-order integrator.

Description of the library functions for calculating cartograms

The file cart.c contains six functions that can be called to complete the calculation of a cartogram. The file cart.h contains prototypes for these six, and should be included in any program that intends to make use of cart.c. (The same include file is also used for the drop-in alternatives cart2.c, cartv.c, and cart2v.c.) There follows a description of the use of the six functions.

cart_makews(): The first function that must be called, before you do anything else, is cart_makews(). This function allocates the internal working space that will be used by the cartogram code. The form of the function call is

  int xsize,ysize;

where the integers xsize and ysize are the horizontal and vertical sizes of the grid of densities that you will be using to make your cartogram, as described in our paper. For instance, if you are making a population cartogram and your population data are on a 1024x512 grid, then you would set xsize=1024 and ysize=512 before calling cart_makews(xsize,ysize), or you could simply call cart_makews(1024,512).

The space allocated by cart_makews() can be unallocated again using the function cart_freews(), which should be called at the end of your program after the calculations are complete, and which is described below.

Note that a call to cart_makews() takes some time, because the function performs some fairly complex calculations. For a large cartogram it could take a few seconds. If you are calculating more than one cartogram of the same size then you need call cart_makews() only once for all of them. This can save you quite a lot of time. If you want to make a series of cartograms of different sizes, however, you need to free up the workspace and reallocate it again for each one, using calls to cart_freews() and cart_makews().

A note about grid sizes: since this code is based on fast Fourier transforms, it works most efficiently when xsize and ysize are powers of two, like 512 and 1024. Other values will work, but the code will run slower. If you wish to use other values, then values that have only small prime factors work better than values with larger prime factors. For instance, if you want one dimension of your grid to fall between 512 and 1024, then 768 would be a good choice, since it has prime factors only of 2 and 3.

cart_dmalloc(): The next task in making your cartogram is to define the density distribution, such as population density if you are making a population cartogram. The Fourier transform library that I use in this code requires that the density function be stored in a slightly non-standard way, and to avoid bothering you with the details, I've written a function to deal with this for you. This is the function cart_dmalloc(). It creates a two-dimensional array of the right format, into which you can place your density data. The form of the call is

  int xsize,ysize;
  double **density;

  density = cart_dmalloc(xsize,ysize);
Again xsize and ysize are the horizontal and vertical dimensions of the grid of densities. When cart_dmalloc() returns, density will point to an empty two-dimensional array of doubles of size xsize by ysize. That is, density[ix][iy] is the (ix,iy) element of the grid.

You should now assign values to all the elements of density[][], equal to the densities that you wish to use for your cartogram. Note that the cartogram transformation doesn't depend on the overall magnitude of the density variables: you can multiply them all by a single constant, and the cartogram produced will stay the same. So you don't need to worry about the units you use. Any units are fine.

You should notice also that the indices of density[][] are in the order x-coordinate then y-coordinate. If you are used to thinking in latitude-longitude pairs, you may be accustomed to putting the y-coordinate first. So beware! You have been warned: x first, then y.

cart_transform(): The next step in creating your cartogram is to Fourier transform the density data. If you don't know what this means, don't worry: think of it simply as feeding your newly assigned density values into the program so it knows the correct data for the cartogram. The call to do this is

  int xsize,ysize;
  double **density;

where the parameters are as before.

cart_dfree(): Once you have fed the density data into the code using the cart_transform() call, the density array is never needed again in this calculation, and you can, if you wish, delete it to save memory. The call to do this is

  double **density;

You are not required to do this; the program will work fine if you don't, although it will use more memory. In some cases you may not want to delete the array. For instance, if you are making more than one cartogram of the same size, you can reuse the density array without reallocating it. However, calls to cart_dmalloc() and cart_dfree() take very little time, so there is no serious performance penalty if you do delete and recreate the array every time.

Note that you should not delete the density array using a normal call to the standard function free(). If you do this, the memory used by the array will not be properly released, because of the slightly weird array format.

cart_makecart(): Everything is now set up to make our cartogram, which we do using the function cart_makecart(). The call is

  double *pointx;
  double *pointy;
  int npoints;
  int xsize,ysize;
  double blur;

  void cart_makecart(pointx,pointy,npoints,xsize,ysize,blur);
Here xsize and ysize are the size of the cartogram grid, as before, and pointx[] and pointy[] are arrays of doubles of length npoints that contain a set of points that you wish to transform according to the cartogram. That is, they are points in the coordinate space of the original density grid, which the program will transform onto the equivalent points on the cartogram. Thus, for instance, if xsize=1024 and ysize=512, then (pointx[0],pointy[0]) is a coordinate pair in the range (0-1024,0-512) that describes the position of the first point that you want to transform.

The final parameter blur measures the width in pixels of the Gaussian blur applied to the density data, as described in our paper. It is normally safe to set this parameter to zero, in which case you will get no blur and a cartogram that exactly reflects the density data you provided. However, you may also want to play with small positive values of this parameter (say between 0.1 and 20, depending on the size of your cartogram), which will result in a cartogram with less shape distortion, at the expense of lower accuracy in the sizes of areas.

When cart_makecart() returns – which will typically take anywhere from a few seconds to a minute or two, depending on the speed of your computer – pointx[] and pointy[] will have been overwritten with the new positions on the cartogram of the points.

Two typical uses of the code would be for pointx and pointy to contain (1) the vertices of a set of polygons describing boundaries or coastlines on a map of interest or (2) a complete grid of points in the map space, that will be transformed to provide an equivalent grid on the cartogram that can be used later for interpolating the transformation of any other point.

Note that the cartogram code is agnostic about what cartographic projection you use. It works entirely in the coordinates of the grid that you used to specify the density distribution. You can use whatever projection you like to create this density distribution, in which case the cartogram generated will use the same projection. You don't need to tell the code what projection you are using.

Also, people differ over what direction they like their y-axis to run in: does it increase down the page or up the page? In this code, either is fine. The program is again entirely agnostic. If you supply a density with y increasing down the page, the final cartogram will also have y increasing down the page, and similarly for y increasing up.

By default, the function cart_makecart() provides a primitive tally of its progress as it goes along, in the form of a simple ASCII-graphic "progress bar", similar to the conventional horizontal bars that indicate progress of programs in graphical environments like Windows. This is provided primarily for use in command-line programs such as the main.c supplied with this software. For other uses of the code you may wish to suppress the progress bar, which can be done by setting the preprocessor variable NOPROGRESS.

On the other hand, it may be of use for your program to know how far cart_makecart() has progressed, and for this purpose another simple interface is provided. If cart.c is compiled with the preprocessor variable PERCENT set, then the function will print out percentage completion figures to stdout as it goes along. When these figures reach 100, the calculation is finished. This option is primarily for programs that wish to capture this output and use it to provide their users with feedback on the progress of the calculation.

cart_freews(): Finally, when the cartogram has been calculated, you can delete its working space. You do this with the call

  int xsize,ysize;


Of course, if you don't do this, the space will be deleted for you anyway when the program terminates.

And that's it! You're done.

Description of the cart program

The file main.c contains a short wrapper program that calls the above functions to create a cartogram from supplied data. It is intended both as a demonstration of the use of the library functions and a fully functioning and usable – if simple – cartogram program in its own right. Here is a very brief description of its working.

The file main.c contains four functions. The first, readpop(), reads in a grid of population data from a specified stream. The second, creategrid(), creates a grid of regularly spaced points that the cartogram code will transform. The third, writepoints(), writes the positions of the points to a stream after they have been transformed.

The fourth and final function is the main() function that combines everything into a working program. It starts by reading the command-line arguments and opening the specified input and output files. Then it allocates workspace for the cartogram code using a call to cart_makews(). Then in allocates space for the density array using cart_dmalloc(), populates the array by reading from the input file using readpop(), and Fourier transforms the resulting array using cart_transform(). Then, to save space, it deletes the density array again using cart_dfree(), since it will never be needed again.

Once all this is accomplished, it is time to make the cartogram. This consists first of creating the list of points to be mapped onto the cartogram, which is done with a call to creategrid(), and then the cartogram transformation itself is accomplished with a single call to cart_makecart(). The resulting transformed set of points are written out again with writepoints(), and we free up memory and close input and output files before quitting.

Last modified: June 28, 2006

Mark Newman