What is Benchmarking?

Homogenisation algorithms are an essential part of climate data processing. They adjust for changepoints (abrupt and gradual) artificially introduced into the data by events such as moving a station, changing the type of screen used to shelter the instruments, replacing or recalibrating an instrument, or changing observing and reporting methods.

On the global scale it is impossible to know when all of these changes occurred, as they are rarely documented in a way that can be associated digitally with the data. A bias can sometimes be introduced into the data through multiple changepoints occurring in the same direction - this could be misinterpreted as real climate change. Homogenisation algorithms try to detect these signals of change against the background noise and make appropriate adjustments. Most algorithms use comparisons with neighbouring stations for detection and adjustment. Large changes are relatively easy to detect; small ones are very difficult. There is likely to be some seasonal pattern to these changepoints, which further complicates their detection and adjustment. Some changes will be isolated to a single station; others will be network-wide, which also makes their detection more difficult.
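To make the neighbour-comparison idea concrete, here is a minimal sketch in Python. The function names and the single-step detector are illustrative only, not any particular operational algorithm: subtracting a neighbour removes the shared regional climate signal, and a changepoint then shows up as a step in the difference series.

```python
def difference_series(candidate, neighbour):
    """Subtract a neighbouring station's series from the candidate's.

    The shared regional climate signal cancels, leaving station-specific
    noise plus any artificial changepoint in the candidate."""
    return [c - n for c, n in zip(candidate, neighbour)]


def best_step(diff):
    """Locate the single step change that best explains a difference
    series: the split point maximising the jump between the means of
    the two segments. Returns (index, estimated shift)."""
    best_i, best_shift = None, 0.0
    for i in range(1, len(diff)):
        left = sum(diff[:i]) / i
        right = sum(diff[i:]) / (len(diff) - i)
        if abs(right - left) > abs(best_shift):
            best_i, best_shift = i, right - left
    return best_i, best_shift
```

Real algorithms go much further - significance testing, multiple neighbours, multiple changepoints, seasonal structure - but this toy version shows why large steps are easy (a big jump in the difference series) while small steps hide in the station-specific noise.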

Given that there is only one world, and we do not know precisely what changepoints have occurred, when and where, we cannot fully understand how well our homogenisation algorithms are performing. Therefore, we cannot fully understand the uncertainty in climate data products due to changes made to the observing system. (NOTE: we are confident in large-scale warming due to the wealth of independent evidence, but uncertainty remains in precisely how much and in the smaller-scale details.) By creating synthetic climate data that look and feel like real climate station data, with the same distribution over space and time, we can add realistic inhomogeneities and then test how good these homogenisation algorithms really are. This is benchmarking for climate data. It serves three purposes:

  • quantifying remaining uncertainty in any climate data-product due to missed inhomogeneities and incorrect adjustments (uncertainty may differ region to region depending on factors such as data sparsity, natural variability and regional climate change)
  • intercomparing different climate data-products on a level-ish playing field and improving assessments of fitness for specific purposes
  • homogenisation algorithm improvement

The figure below shows an example of how the multiple error-worlds of the benchmarks may be used alongside the ISTI databank by data-product creators for these purposes.

There are three stages to developing benchmarks:
  • creation of clean synthetic station data that are free from inhomogeneity (analog-known-worlds);
  • design and implementation of error-models exploring the various challenges of detecting and adjusting inhomogeneities through the varying nature of the background data and inhomogeneity type (analog-error-worlds);
  • and assessment of the ability of homogenisation algorithms to correctly detect inhomogeneities and return the data to their 'clean' form.
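The three stages can be sketched as a toy pipeline in Python. Everything here is hypothetical and far simpler than the real analog worlds - a seasonal cycle with noise stands in for a clean station, a single injected step stands in for an error-model, and a root-mean-square score stands in for validation:

```python
import math
import random


def make_clean_world(n=120, seed=0):
    """Stage 1 (creation): a clean synthetic monthly series - seasonal
    cycle plus weather noise, free of inhomogeneities."""
    rng = random.Random(seed)
    return [10.0 + 8.0 * math.sin(2 * math.pi * t / 12) + rng.gauss(0, 0.5)
            for t in range(n)]


def apply_error_model(series, when, size):
    """Stage 2 (corruption): inject an artificial step change of `size`
    degrees from index `when` onwards, mimicking e.g. a station move."""
    return [x + (size if t >= when else 0.0) for t, x in enumerate(series)]


def assess(clean, homogenised):
    """Stage 3 (validation): root-mean-square error of a homogenised
    series against the clean truth - lower is better."""
    n = len(clean)
    return math.sqrt(sum((h - c) ** 2 for c, h in zip(clean, homogenised)) / n)
```

An algorithm that perfectly removed the injected step would score an RMSE of zero against the clean truth; leaving the corrupted series unadjusted scores roughly the size of the step, weighted by how long it persists.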

These will be dealt with by three task teams: Team Creation, Team Corruption and Team Validation respectively.