EVAN'S THESIS HOME PAGE



Trend validation of three high-resolution observational climate datasets

This study evaluated the accuracy of extreme temperature trends in three high-resolution climate datasets. The first dataset was the gridded observational Maurer et al. (2002) dataset (daily; 12 km resolution), which has made its way into the climate community discussion. The second dataset was the Di Luzio et al. (2008) dataset (daily; 4 km), which is essentially a daily version of the monthly PRISM dataset. The third dataset was the DAYMET (Thornton et al. 1997) dataset (daily; 1 km), which has a global spatial domain but a short temporal domain (1980-1997). The reference climate dataset was the United States Historical Climatology Network (USHCN) dataset (Menne et al. 2009), chosen for the quality of its trends. The serially complete USHCN version 2 monthly dataset is homogenized and bias corrected using an automated, objective pairwise comparison method and is widely accepted within the climate community as representative of surface climate trends. The daily USHCN dataset (version 1) is quality controlled but is neither serially complete nor appropriate for trend analysis. These two datasets were combined using the methodology of Hamlet and Lettenmaier.

The gridded datasets are then interpolated to the locations of the USHCN stations. This should minimize or eliminate the bias of the interpolation methods, since these stations should be included in the network used to create the gridded high resolution datasets.
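The interpolation method is not specified here. As a minimal sketch only, assuming a regular latitude/longitude grid and simple bilinear interpolation (both are assumptions, as are the function and argument names), a gridded value could be estimated at a station location as follows:

```python
import bisect


def bilinear_interpolate(grid_lats, grid_lons, field, lat, lon):
    """Bilinearly interpolate a gridded field to one station location.

    Assumptions (hypothetical layout, not taken from the datasets):
    grid_lats and grid_lons are ascending 1-D lists of cell-center
    coordinates, and field[i][j] is the value at
    (grid_lats[i], grid_lons[j]).
    """
    if not (grid_lats[0] <= lat <= grid_lats[-1]
            and grid_lons[0] <= lon <= grid_lons[-1]):
        raise ValueError("station lies outside the grid")

    # Index of the grid row/column just south/west of the station.
    i = min(bisect.bisect_right(grid_lats, lat) - 1, len(grid_lats) - 2)
    j = min(bisect.bisect_right(grid_lons, lon) - 1, len(grid_lons) - 2)

    # Fractional position of the station inside the enclosing cell.
    ty = (lat - grid_lats[i]) / (grid_lats[i + 1] - grid_lats[i])
    tx = (lon - grid_lons[j]) / (grid_lons[j + 1] - grid_lons[j])

    # Interpolate along longitude on both rows, then along latitude.
    south = field[i][j] * (1 - tx) + field[i][j + 1] * tx
    north = field[i + 1][j] * (1 - tx) + field[i + 1][j + 1] * tx
    return south * (1 - ty) + north * ty


# Made-up 2 x 2 grid of daily maximum temperatures (degrees C).
lats = [37.5, 37.75]
lons = [-106.5, -106.25]
tmax = [[20.0, 22.0],
        [24.0, 26.0]]
station_value = bilinear_interpolate(lats, lons, tmax, 37.625, -106.375)
```

In practice the grid values would come from one day of a gridded dataset, and the call would be repeated for every day and every USHCN station.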



Time series of daily maximum temperature during the summer months (M, J, J, A, S) over the temporal span of the Maurer et al. (2002) dataset at USHCN station #052184 in Del Norte, Colorado (37.6742N, 106.3247W). Here the USHCN v2 monthly data are used.

Preliminary analysis: discontinuities

A preliminary comparison of the Maurer and Di Luzio time series (daily maximum and minimum temperatures) with the USHCN v2 time series has been undertaken. It quickly became apparent that temporal discontinuities exist (see the figure above) in both the Maurer and Di Luzio datasets, for both Tmax and Tmin. Additional figures are available in the table below.

Dataset     Daily extreme   Lat/lon               USHCN ID   Figure
Maurer      Tmin            39.915N,79.7192W      369050     link
Maurer      Tmin            34.1997N,82.1711W     383754     link
Maurer      Tmax            43.6447N,94.4656W     212698     link
Maurer      Tmax            39.6419N,78.7561W     182282     link
Maurer      Tmax            33.4886N,80.8733W     386527     link
Di Luzio    Tmin            35.7783N,106.6872W    294369     link
Di Luzio    Tmin            46.6694N,94.1089W     216547     link
Di Luzio    Tmax            46.6694N,94.1089W     216547     link
Di Luzio    Tmax            41.0267N,86.5867W     129670     link
Some time series looked identical to the reference, some had a constant bias, and some had discontinuities. We are concerned only with the time series with discontinuities, and felt that there were enough stations with discontinuities to explore further.


Magnitude of the decadal trends in the differences in summer mean daily minimum (maximum) temperature between the Di Luzio and USHCN datasets at the USHCN station locations. The significance of those trends at the 0.05 level is also indicated. The time period spans the duration of the Di Luzio dataset. The same maps for the Maurer daily minimum and maximum are available.


Preliminary analysis: spatial structure and sample means

The differences in summer mean temperature trends between the Maurer/Di Luzio datasets and the USHCN v2 dataset do not exhibit substantial spatial structure (see figure above). Only USHCN stations with less than 10% of their data missing or flagged during the comparison period were used. These results suggest that the Maurer and Di Luzio datasets generally reproduce the USHCN spatial patterns. Menne et al. (2009), in describing the differences between raw and "adjusted" USHCN (v2 monthly) data, indicated that the adjustment process effectively removes some spatial noise.

On a larger scale, preliminary results suggest that the datasets generally underestimate the trend magnitudes of the reference dataset. The table below displays the results of pooling all the aforementioned summer mean temperature trend magnitudes across the CONUS and then averaging them (i.e. not a true spatial average but a simple average over all locations in the above map(s)). Both the Maurer and Di Luzio datasets underestimate the trends in both daily extremes. Notably, in all cases, more than half of the stations have significant trend biases.





Project hypotheses

Do the Maurer et al. (2002), Di Luzio et al. (2008) and DAYMET datasets exhibit the same CONUS average trends as the reference climate dataset? Do they reproduce the spatial structure of the reference dataset? Lastly, do the biases or uncertainties act as random spatial variables or are they a function of physical characteristics of the stations themselves (e.g. elevation)?




Thesis project methods

First, a few aspects of the study design need elaboration. The three gridded datasets cover different time periods and have different lengths. The DAYMET dataset does not even span the typical length of a climate base period, so it was decided to use the 1981-2010 data from the USHCN dataset to determine the percentiles at each location for all four datasets. Additionally, this study focuses on summertime weather, that is, the period from May 15 to Sept. 15. Note also that the term "trend bias" refers to the linear trend of the yearly difference from the reference dataset, and "trend accuracy" refers to the linear trend of the yearly absolute difference between the two datasets. The significance of all trends will be determined with a statistical test such as the Mann-Kendall test.
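To make the two definitions concrete, here is a minimal sketch of how "trend bias" and "trend accuracy" could be computed for one station. It assumes one value per year for each dataset, an ordinary least-squares slope as the linear trend, and a basic two-sided Mann-Kendall test without a tie correction. The function names are hypothetical and the numbers are made up for illustration.

```python
import math


def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (units per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx


def mann_kendall(values):
    """Two-sided Mann-Kendall test without tie correction.

    Returns (S, p_value) using the normal approximation.
    """
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return s, p_value


def trend_bias_and_accuracy(years, gridded, reference):
    """Trend of the yearly difference (bias) and absolute difference (accuracy)."""
    diffs = [g - r for g, r in zip(gridded, reference)]
    abs_diffs = [abs(d) for d in diffs]
    return {
        "trend_bias": linear_trend(years, diffs),
        "trend_accuracy": linear_trend(years, abs_diffs),
        "bias_p_value": mann_kendall(diffs)[1],
        "accuracy_p_value": mann_kendall(abs_diffs)[1],
    }


# Made-up example: the gridded series starts 1 degree too cold and
# warms 0.1 degree per year faster than a flat reference series.
years = list(range(2000, 2010))
reference = [25.0] * 10
gridded = [24.0 + 0.1 * k for k in range(10)]
result = trend_bias_and_accuracy(years, gridded, reference)
```

In this example the trend bias is positive (about 0.1 per year) while the trend of the absolute difference is negative, because the gridded series is converging toward the reference. This is why the two quantities are tracked separately.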

The analysis focuses on two kinds of trends: percentile exceedance and extreme heat event (EHE). Percentile exceedance trends are the trends in the number of instances in which the daily temperature exceeds the 90th percentile. These trends will be calculated over the duration of the dataset being evaluated, at each location, for both daily maximums and minimums. EHE trends include trends in both the number of EHEs and the mean duration of the EHEs. An EHE is required to have both daily maximum and minimum exceedances of the 90th percentile for at least two days, and continues as long as the running EHE-average percentile remains above 90.
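The exceedance count and the EHE rule can be sketched as below. This is one possible reading, not the definitive algorithm: it assumes each day already has a percentile rank (0-100) for Tmax and Tmin relative to the base period, and it interprets the "running EHE-average percentile" as the mean, over all days in the event so far, of the daily average of the Tmax and Tmin percentiles. The function names and the sample values are hypothetical.

```python
def count_exceedances(percentiles, threshold=90.0):
    """Number of days whose percentile rank exceeds the threshold."""
    return sum(1 for p in percentiles if p > threshold)


def find_ehes(tmax_pct, tmin_pct, threshold=90.0, min_days=2):
    """Return a list of (start_index, end_index) pairs, end inclusive.

    An event starts with min_days consecutive days on which both the
    Tmax and Tmin percentiles exceed the threshold. It is then extended
    one day at a time while the running mean of the daily
    (Tmax + Tmin) / 2 percentile over the event stays above the threshold.
    """
    n = len(tmax_pct)
    daily_mean = [(a + b) / 2.0 for a, b in zip(tmax_pct, tmin_pct)]
    both = [a > threshold and b > threshold
            for a, b in zip(tmax_pct, tmin_pct)]
    events = []
    i = 0
    while i <= n - min_days:
        if all(both[i:i + min_days]):
            end = i + min_days - 1
            total = sum(daily_mean[i:end + 1])
            while end + 1 < n:
                new_total = total + daily_mean[end + 1]
                if new_total / (end + 2 - i) > threshold:
                    end += 1
                    total = new_total
                else:
                    break
            events.append((i, end))
            i = end + 1  # resume the search after this event
        else:
            i += 1
    return events


# Made-up percentile ranks for nine consecutive summer days.
tmax_pct = [50, 95, 96, 92, 80, 40, 97, 98, 30]
tmin_pct = [40, 93, 94, 91, 85, 30, 99, 95, 20]

n_tmax_exceedances = count_exceedances(tmax_pct)
events = find_ehes(tmax_pct, tmin_pct)
mean_duration = sum(end - start + 1 for start, end in events) / len(events)
```

Counting exceedances and events per summer and then fitting a trend to the yearly counts would give the exceedance and EHE trends described above.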

Then the main analysis progresses on four fronts. First, the continental averages of the trend biases and accuracies will be calculated and assessed for all variables (Tmin percentile exceedances, mean EHE duration, etc.). Then the spatial structure of the trend bias and accuracy will be assessed by creating maps showing the trend magnitudes and significance at all the locations. Next, the biases and accuracies will be evaluated for both spatial autocorrelation and correlation between datasets. Lastly, the relationships between the trend biases and accuracies (at each station) and the physical characteristics of the stations (elevation, distance to water, etc.) will be examined.
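The statistic to be used for spatial autocorrelation is not named here. One common choice is Moran's I, sketched below under stated assumptions: inverse great-circle distance as the spatial weight, no two stations at exactly the same location, and made-up station coordinates and trend-bias values. The function names are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def morans_i(coords, values):
    """Moran's I with inverse-distance weights (an assumed weighting).

    coords is a list of (lat, lon) pairs in degrees, values the trend
    bias (or accuracy) at each station. Stations must not coincide.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weighted_cross = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            w = 1.0 / haversine_km(*coords[i], *coords[j])
            weighted_cross += w * dev[i] * dev[j]
            weight_sum += w
    return (n / weight_sum) * (weighted_cross / sum(d * d for d in dev))


# Two tight, widely separated groups of hypothetical stations.
coords = [(40.0, -105.0), (40.1, -105.0), (40.0, -105.1),
          (35.0, -80.0), (35.1, -80.0), (35.0, -80.1)]
clustered_bias = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]  # similar values nearby
mixed_bias = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]      # dissimilar values nearby

i_clustered = morans_i(coords, clustered_bias)
i_mixed = morans_i(coords, mixed_bias)
```

Values of I near 1 indicate that nearby stations have similar biases, while values near or below zero indicate little or no positive spatial autocorrelation, which would be more consistent with the biases behaving as random spatial variables.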





Project analysis so far

A foremost question was how many stations could be used to compare the four datasets. The locations available for evaluating each dataset are shown above. Availability at each station was determined by the availability of data from the USHCN v1 daily dataset, the surfacestations.org project rating assigned to the station, and the originality of the USHCN v2 monthly data. The gridded datasets are evaluated at those NWS Cooperative Observer Network stations within the USHCN that have sufficiently robust data. Since the gridded datasets have different temporal domains (1949-2010, Maurer; 1960-2001, Di Luzio; 1980-2008, DAYMET), the sets of stations that passed the requirements differ, and the project will currently move forward with three separate sets of stations. The DAYMET evaluation sample is 243 strong, the Maurer sample is 293 and the Di Luzio sample is 324.