Methods

Brian Min and Zachary O'Keeffe
University of Michigan

This project leverages high-resolution satellite data to generate estimates of electricity access at a finer spatial resolution than ever before. The estimates are derived from the first large-scale analysis of the complete archive of nighttime light imagery from the VIIRS sensor, from 2012 to the present, tracked against new population estimates based on computer vision techniques that identify all human-built settlement structures.

Over the last two decades, a groundswell of research has demonstrated how satellite imagery of nighttime lights can be used to detect the consumption of electricity at national and sub-national levels. Prior studies have demonstrated that nighttime satellite sensors can detect light output originating from cities, fires, gas flares, and heavily lit fishing boats.[1] Several studies have also shown that nighttime light output correlates strongly with electricity generating capacity and economic activity at the regional and national levels.[2] Some recent research has identified the ways in which the delivery of electrical power can be politically motivated.[3] While most of these studies have been conducted in the industrialized world or in urban environments, our recent research has demonstrated that nighttime light data can also be used to detect lower levels of electricity use and outdoor lighting in the developing world.[4]

As a result of a collaboration between the University of Michigan, the World Bank, and the National Oceanic and Atmospheric Administration (NOAA), we have produced the Light Every Night dataset, a complete archive of all nighttime imagery captured across the globe by VIIRS from April 2012 to the present and by DMSP from 1992 to 2013. This massive archive, comprising over 250 TB of data across millions of files, describes visible band brightness, cloud cover, and associated metadata from suborbits captured across the globe every night.

The HREA project leverages the complete record of VIIRS data. Launched aboard the Suomi National Polar-orbiting Partnership (SNPP) satellite in 2011, VIIRS measures nighttime light with dramatically greater precision and accuracy than DMSP. The VIIRS nighttime sensor (the day-night band, or DNB) captures data at a high spatial resolution (750 m) and is fully calibrated to accurately record luminosity levels.

To use light output as an indicator of electricity use, we need data on the location of human settlements. However, in most countries, there are significant limitations on the availability, accuracy, and reliability of population data. We therefore rely on new computer-generated data on built-up areas. These datasets are produced with machine learning and computer vision techniques that identify the outline of every building or cluster of buildings within a country. The building outlines are then georeferenced and, in some cases, linked with census tract population estimates to yield high-resolution settlement maps that can be directly compared against other georeferenced data.[1]

We link VIIRS data to high resolution settlement maps along a constant spatial grid. This provides a spatially constant reference on which to map the nightly values based on cell centroid coordinates, and with which to intersect the settlement polygons. In other words, we identify the nightly radiance values that correspond to the point coordinates in the middle of each cell and identify the settlement shapes that fit within the quadrilaterals of each cell.
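The grid linkage can be illustrated with a minimal sketch, assuming the 15 arc-second cell size used later in the text. The grid origin at the north-west corner (-180°, 90°), the row and column convention, and the names `cell_index` and `cell_centroid` are assumptions made for illustration and are not details of the actual grid.

```python
import math

# One 15 arc-second cell expressed in decimal degrees (240 cells per degree).
CELL_DEG = 15 / 3600


def cell_index(lon, lat):
    """Return the (row, col) of the grid cell containing a lon/lat point.

    Assumes a global grid whose origin is the north-west corner
    (-180, 90), with rows increasing southward (hypothetical convention).
    """
    col = math.floor((lon + 180.0) / CELL_DEG)
    row = math.floor((90.0 - lat) / CELL_DEG)
    return row, col


def cell_centroid(row, col):
    """Return the (lon, lat) of the centre of a grid cell."""
    lon = -180.0 + (col + 0.5) * CELL_DEG
    lat = 90.0 - (row + 0.5) * CELL_DEG
    return lon, lat
```

Under these assumptions, a nightly radiance value would be sampled at `cell_centroid(row, col)`, while a settlement feature would be assigned to the cell returned by `cell_index` for its coordinates.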

Each night, we compare the light output of each settlement against background brightness measured from comparable isolated, uninhabited pixels. We then estimate the statistical confidence that the settlement is brighter than the background, and repeat the process for every settlement within a country. We then aggregate the nightly estimates across the calendar year to generate annual measures for each settlement of likelihood lit, proportion of nights lit, and statistically recalibrated light output intensity.

Computational Image Processing

Every night, the VIIRS DNB sensor collects data on the observed brightness over all locations within a country, including electrified and unelectrified areas and populated and unpopulated areas. Our objective is to classify populated areas as electrified or not using all the brightness data over a country. The challenge is that light output can come from multiple sources unrelated to electricity use. Notably, the VIIRS sensor is so sensitive that it picks up light from overglow, atmospheric interactions, and moonlight, as well as variations in surface reflectivity across types of land cover. We refer collectively to these exogenous sources as background noise, which must be accounted for to classify whether an area is brighter than expected on any given night.

Radiance levels are recorded on all nights since April 2012. Values are subsequently dropped if NOAA considers them low quality, that is, if: a) they are obstructed by clouds; b) they are sunlit, outside the nighttime cutoff zone (i.e., solar zenith angle below 101°); c) they are moonlit, with lunar illumination above 0.0005 lux (or 0.001 lux for regression-based estimates); d) high-energy particles were detected; e) they are affected by stray light (solar zenith angle at nadir between 90° and 118.5°); f) surface lightning was detected; or g) gas flares were detected (temperature > 1200 K and frequency > 1%). If more than one high-quality value was observed during a single night, the value recorded earliest in the night was kept.
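The quality screen above can be expressed as a single predicate. The sketch below is illustrative only: the `Obs` record and its field names are hypothetical and do not correspond to the actual NOAA flag layout; only the numeric thresholds are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Obs:
    """One nightly observation for one pixel (hypothetical field names)."""
    radiance: float
    cloudy: bool
    solar_zenith: float        # degrees, at the pixel
    nadir_solar_zenith: float  # degrees, at satellite nadir
    lunar_lux: float
    high_energy_particle: bool
    lightning: bool
    gas_flare: bool


def is_good(o, lunar_max=0.0005):
    """Return True if the observation passes every quality criterion.

    Use lunar_max=0.001 for the relaxed, regression-based threshold.
    """
    return not (
        o.cloudy                                   # a) cloud obstruction
        or o.solar_zenith < 101.0                  # b) sunlit
        or o.lunar_lux > lunar_max                 # c) moonlit
        or o.high_energy_particle                  # d) particle hit
        or 90.0 <= o.nadir_solar_zenith <= 118.5   # e) stray light
        or o.lightning                             # f) surface lightning
        or o.gas_flare                             # g) gas flare
    )
```

Keeping the lunar threshold as a parameter lets the same function serve both the default screen and the relaxed screen used for the regression sample.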

Because the radiance values are heavily right-skewed (i.e., there are some extremely large positive values, relative to the average), and some are slightly negative (the technical minimum is -1.5), we add 2.5 to the remaining values and apply the natural logarithm when generating averages, as NOAA does. To generate annual estimates, the average of all good quality nightly values for each pixel in each calendar year is calculated.
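A minimal sketch of this transformation follows. It assumes that the logarithm is applied to each nightly value before averaging (the text could also be read as taking the logarithm of the average), and the record layout `(pixel_id, year, radiance)` is hypothetical.

```python
import math
from collections import defaultdict

OFFSET = 2.5  # shifts the technical minimum of -1.5 up to 1.0, so ln(.) >= 0


def log_radiance(radiance):
    """Natural log of the offset radiance."""
    return math.log(radiance + OFFSET)


def annual_means(records):
    """Average logged radiance per (pixel_id, year).

    `records` is an iterable of (pixel_id, year, radiance) tuples that
    have already passed the quality screen.
    """
    sums = defaultdict(lambda: [0.0, 0])  # running [total, count]
    for pixel_id, year, radiance in records:
        acc = sums[(pixel_id, year)]
        acc[0] += log_radiance(radiance)
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}
```

With an offset of 2.5, the technical minimum of -1.5 maps to ln(1) = 0, so every transformed value is non-negative.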

We use data on light output detected over areas with no settlements or buildings to train a statistical model of background noise. The model generates an expected brightness value for any given location on any given night. We then compare the observed brightness on each night against this expected baseline. Settled areas with brighter light output than expected are assumed to have access to electricity on that night. We classify all settlements on all nights and then aggregate the estimates to generate a “likelihood electrified” estimate for each calendar year for all settlement areas. Areas that are much brighter than would be expected on most nights have the highest probability of being electrified. Areas that are as dim as areas with no settlements have the lowest probability of being electrified. Areas that are a little brighter on some nights have intermediate probabilities.

The advantage of this process is that it uses all available nightly data from the VIIRS data stream while taking into account known and unknown sources of noise and variability. The process also generates probability estimates that allow for the identification of areas where the likelihood of electricity access and use is most uncertain. This is significant given that traditional binary measures of access do not account for variations in levels of use or reliability of power supply, even across areas that are all nominally electrified.

1) Select a random sample of locations with no settlements to measure background noise.

We select a stratified random sample of isolated non-settlement 15 arc-second pixels to use in the regression. We define a 15 arc-second non-settlement pixel as one which contains no 1 arc-second settlement pixels. We define an isolated non-settlement pixel as one for which none of its 8 neighboring 15 arc-second cells contains settlement pixels either. Thus, these pixels should be relatively far from artificial sources of light. We select a random sample of such pixels stratified by land cover type.
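The selection rule can be sketched as follows. The input layout is an assumption: `counts` is a 2-D list giving the number of 1 arc-second settlement pixels inside each 15 arc-second cell, and `land_cover` maps each cell to a land cover label. Neighbors that fall outside the grid are simply ignored here, and the per-stratum sample size is a hypothetical parameter; the text does not specify either choice.

```python
import random
from collections import defaultdict


def isolated_cells(counts):
    """Return (row, col) of cells with no settlement pixels in the cell
    itself or in any of its (up to) 8 neighbors."""
    n_rows, n_cols = len(counts), len(counts[0])
    result = []
    for r in range(n_rows):
        for c in range(n_cols):
            # 3x3 window around (r, c), clipped at the grid edges.
            window = [
                counts[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, n_rows))
                for cc in range(max(c - 1, 0), min(c + 2, n_cols))
            ]
            if all(v == 0 for v in window):
                result.append((r, c))
    return result


def stratified_sample(cells, land_cover, per_stratum, seed=0):
    """Draw up to `per_stratum` cells at random from each land cover type."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for cell in cells:
        strata[land_cover[cell]].append(cell)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample
```

Sampling a fixed number of cells per stratum is only one way to stratify; sampling proportionally to the area of each land cover type would be an equally plausible reading.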

2) Select observations

Following NOAA guidelines and their data quality flags, we drop poor-quality observations, including those with heavy cloud cover or excessive sensor noise. NOAA also drops many nights with high lunar illumination. We relax this threshold slightly (from 0.0005 to 0.001 lux), thus keeping additional observations with modest lunar illumination to preserve more data. Furthermore, on nights with multiple overpasses, we use the observation with the earliest local timestamp for settlement points, but allow multiple observations for non-settlement points.
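The earliest-overpass rule for settlement points amounts to keeping one record per pixel and night. In the sketch below the tuple layout `(pixel_id, night, local_time, radiance)` is hypothetical, and `night` is assumed to be a label that already groups overpasses falling on either side of midnight.

```python
from datetime import datetime


def earliest_per_night(observations):
    """Keep the earliest good-quality observation per (pixel_id, night).

    `observations` is an iterable of (pixel_id, night, local_time, radiance)
    tuples, where `local_time` is any comparable timestamp.
    Returns {(pixel_id, night): (local_time, radiance)}.
    """
    best = {}
    for pixel_id, night, local_time, radiance in observations:
        key = (pixel_id, night)
        if key not in best or local_time < best[key][0]:
            best[key] = (local_time, radiance)
    return best


# Example: two overpasses on the same night; the 01:10 value is kept.
example = earliest_per_night([
    ("p1", "2015-03-01", datetime(2015, 3, 2, 2, 50), 0.9),
    ("p1", "2015-03-01", datetime(2015, 3, 2, 1, 10), 0.4),
])
```

Non-settlement points would skip this step, since the text allows multiple observations per night for them.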

3) Remove outliers

To generate a reliable estimate of background noise, we need to exclude outliers. Presumably, an unusually high brightness value in an unsettled area is not due to background noise but rather to external, non-systematic phenomena. First, by country and year, we calculate the mean and standard deviation of radiance for each isolated 15 arc-second non-settlement pixel. Then, by land cover type, we exclude points whose means or standard deviations fall below the 1st percentile or above an upper bound equal to the median plus the difference between the 50th and 1st percentiles (this is more robust than using the 99th percentile). After selecting the random sample of points to keep, we remove individual outlier observations. First, we drop observations with brightness values more than four standard deviations above the median logged radiance value. Next, by date and land cover type, we drop observations that are more than four standard deviations above the mean radiance value for that night and land cover pair.
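The two outlier rules can be sketched with the standard library alone. The percentile interpolation method and the use of the population standard deviation are assumptions, since the text does not specify them, and the function names are illustrative. Each function is meant to be applied within one group (one land cover type, or one date and land cover pair).

```python
import statistics


def symmetric_bounds(values):
    """Lower bound = 1st percentile; upper bound = median + (median - p1)."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    p1, median = cuts[0], cuts[49]
    return p1, median + (median - p1)


def keep_pixels(stats):
    """Pixel-level screen for one land cover type.

    `stats` maps pixel_id -> (mean_radiance, sd_radiance). A pixel is kept
    only if both its mean and its standard deviation fall within bounds.
    """
    lo_m, hi_m = symmetric_bounds([m for m, _ in stats.values()])
    lo_s, hi_s = symmetric_bounds([s for _, s in stats.values()])
    return {
        pixel
        for pixel, (m, s) in stats.items()
        if lo_m <= m <= hi_m and lo_s <= s <= hi_s
    }


def drop_bright(values, center, n_sd=4.0):
    """Observation-level screen: drop values more than `n_sd` standard
    deviations above `center` (the median or the mean of the group)."""
    sd = statistics.pstdev(values)
    return [v for v in values if v <= center + n_sd * sd]
```

The upper bound mirrors the distance from the median down to the 1st percentile, which is why it is less sensitive to a heavy right tail than the 99th percentile would be.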

4) Create statistical model of background noise

For each calendar year, we run a linear mixed effects model on light output for selected pixels in areas with no settlements. The aim is to understand the exogenous factors that explain variation in light levels for areas where there are no human settlements, and presumably no electricity. The model includes observations from a selection of isolated non-settlement pixels from all good quality nights, and includes fixed controls for month, land type, lunar illumination, exact local time, and the interaction between land cover type and lunar illumination, as well as a date random effect. Notably, the regression diagnostics are excellent with strong linearity, few outliers, and limited heteroskedasticity. Using these statistical parameters learned from data on non-settlement areas, we then calculate the expected level of light output for all areas with settlements. These predicted values represent a counterfactual estimate of how much light would be expected on that specific day on that type of land, if the only sources of light were from background noise and other exogenous factors. Areas with higher observed light output than expected light output will be assumed to have electricity access.
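One way to write such a model down is given below. The notation is ours and is offered only as a sketch consistent with the description above: the symbols, the assumption of normally distributed errors, and the omission of reference categories are illustrative and are not the exact specification that was estimated.

```latex
\begin{align*}
y_{it} &= \beta_0 + \gamma_{m(t)} + \delta_{c(i)} + \beta_1 L_{it} + \beta_2 T_{it}
          + \theta_{c(i)} L_{it} + u_t + \varepsilon_{it}, \\
u_t &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{it} \sim \mathcal{N}(0, \sigma_\varepsilon^2).
\end{align*}
```

Here $y_{it}$ is the light output of isolated non-settlement pixel $i$ on night $t$, $\gamma_{m(t)}$ is a fixed effect for the month of night $t$, $\delta_{c(i)}$ is a fixed effect for the land cover type of pixel $i$, $L_{it}$ is lunar illumination, $T_{it}$ is the local time of the overpass, $\theta_{c(i)}$ lets the lunar slope vary by land cover type, and $u_t$ is the date random effect.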

5) Identify electrified settlement areas on each night

We compare the observed level of light output against the expected light output from the model above for every settlement pixel on every night. This difference between observed and expected light output is our measure of anthropogenic light generation on each night. We standardize these values by dividing by the standard deviation of the residuals to generate z-scores for each pixel on each night. Higher z-scores imply higher light output than would be expected from exogenous factors alone (i.e., non-human factors such as land type and lunar illumination). The key assumption is that higher scores indicate a higher likelihood that a settlement is using electricity on that specific night.
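The standardization step reduces to a few lines. The sketch assumes that the residual standard deviation is taken from the background model's residuals on non-settlement pixels and that it is computed as a population standard deviation; the text does not spell out either detail, and the function names are illustrative.

```python
import statistics


def residual_sd(observed, expected):
    """Standard deviation of (observed - expected) residuals."""
    residuals = [o - e for o, e in zip(observed, expected)]
    return statistics.pstdev(residuals)


def z_scores(observed, expected, sd):
    """Standardized excess brightness for each pixel-night."""
    return [(o - e) / sd for o, e in zip(observed, expected)]
```

A z-score of 0 means the settlement pixel was exactly as bright as the background model predicts; positive values indicate excess light.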

6) Aggregate nightly estimates to generate “Likelihood Electrified” and “Proportion of Nights Lit” values for all settlement areas for each year

We use the mean of all nightly z-scores for each settlement to generate probability of electrification values for all 15 arc-second pixels with settlements across the country. We repeat the process for all years of VIIRS data. An alternative product instead calculates the proportion of nights on which the standardized residual for a particular pixel is above a given threshold, thus generating an estimate of the proportion of nights on which the settlement had visible artificial light.
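Both annual products can be sketched as follows. Two choices here are assumptions and not part of the description above: the mean z-score is mapped to a probability with the standard normal CDF, and the default threshold of 1.645 for a "lit" night is a placeholder value.

```python
import math
import statistics


def likelihood_electrified(nightly_z):
    """Map the mean nightly z-score to a value in (0, 1).

    ASSUMPTION: uses the standard normal CDF; the actual mapping from
    mean z-score to probability is not specified in the text.
    """
    mean_z = statistics.fmean(nightly_z)
    return 0.5 * (1.0 + math.erf(mean_z / math.sqrt(2.0)))


def proportion_nights_lit(nightly_z, threshold=1.645):
    """Share of nights whose z-score exceeds `threshold`.

    ASSUMPTION: the default threshold is a hypothetical placeholder.
    """
    return sum(z > threshold for z in nightly_z) / len(nightly_z)
```

Under these assumptions, a settlement whose nightly z-scores average zero receives a value of 0.5, and the two products can disagree: a pixel that is very bright on a few nights may have a high mean z-score but a low proportion of nights lit.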


Limitations

Overpass times are in the early morning.

Urban classifications are highly impacted by diffusion and overglow.

Settlement classifications are clustered at the 15 arc-second level. We are currently working on sub-pixel classification methods.

Settlement maps are produced for specific time periods and may not reflect recent changes.

References

[1] Croft, T.A., 1978, "Night-time Images of the Earth From Space", Scientific American, 239, 68–79; Elvidge, C.D., K.E. Baugh, E.A. Kihn, H.W. Kroehl, E.R. Davis, and C. Davis, 1997, "Relation between satellite observed visible-near infrared emissions, population, economic activity, and power consumption", International Journal of Remote Sensing, 18(6), 1373–1379.

[2] De Souza Filho, C.R., Zullo, J. Jr., Elvidge, C., 2004, "Brazil's 2001 energy crisis monitored from space", International Journal of Remote Sensing, 25(12), 2475–2482; Doll C.N.H., Muller J.-P., Morley J.G., 2006, "Mapping regional economic activity from night-time light satellite imagery", Ecological Economics, 57, 75–92; Sutton, P.C., C.D. Elvidge and T. Ghosh, 2007, "Estimation of gross domestic product at sub-national scales using nighttime satellite imagery", International Journal of Ecological Economics and Statistics, 8 (SO7), 5–21; Ghosh, T., Powell, R., Elvidge, C.D., Baugh, K.E., Sutton, P.C., & Anderson, S., 2010, "Shedding light on the global distribution of economic activity", The Open Geography Journal, 3, 148–161; Henderson, J. Vernon, Adam Storeygard, and David N. Weil, 2012, "Measuring Economic Growth from Outer Space", American Economic Review, 102(2), 994–1028.

[3] Min, B., 2015, Power and the Vote: Elections and Electricity in the Developing World, New York: Cambridge University Press; Min, B. and M. Golden, 2014, "Electoral cycles in electricity losses in India", Energy Policy, 65, 619–625; Baskaran, T., Min, B., & Uppal, Y., 2015, "Election cycles and electricity provision: Evidence from a quasi-experiment with Indian special elections", Journal of Public Economics, 126, 64–73.

[4] Min and Gaba, 2014, "Tracking Electrification in Vietnam Using Nighttime Lights", Remote Sensing, 6(10), 9511–9529; Min, Gaba, Sarr, and Agalassou, 2013, "Detection of Rural Electrification in Africa using DMSP-OLS Night Lights Imagery", International Journal of Remote Sensing, 34(22), 8118–8141; Gaba, Kwawu Mensan, Brian Min, Olaf Veerman, and Kimberly Baugh, 2019, Mainstreaming Disruptive Technologies in Energy, Washington, D.C.: World Bank Group.