Homogenisation algorithms are an essential part of climate data processing. They adjust for changepoints (abrupt and gradual) artificially introduced into the data by, for example, a station move, a change in the type of screen used to shelter the instruments, a change of instrument type or recalibration, or changes to observing or reporting methods.
On the global scale it is impossible to know when all of these changes occurred, as they are rarely documented in a way that can be associated digitally with the data. Multiple changepoints occurring in the same direction can sometimes introduce a bias into the data, which could be misinterpreted as real climate change. Homogenisation algorithms try to separate these signals of change from the background noise and make appropriate adjustments. Most algorithms use comparisons with neighbouring stations for detection and adjustment. Large changes are relatively easy to detect; small ones are very difficult. The changepoints are likely to have some seasonal pattern, which further complicates their detection and adjustment. Some changes will be isolated to a single station and some will be network-wide. Network-wide changes are harder to detect because the neighbouring stations used for comparison are affected too.
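The neighbour-comparison idea can be illustrated with a minimal sketch. It is not any particular published algorithm: the function name `detect_step`, the noise levels, the break position and the 1.0 degree shift are all assumptions chosen for illustration. Subtracting a neighbour from the candidate station removes the climate signal the two share, so an artificial step stands out in the difference series.

```python
import numpy as np

def detect_step(candidate, neighbour, min_seg=12):
    """Find the single most likely step change in candidate - neighbour.

    Hypothetical helper for illustration: tries every split point and keeps
    the one with the largest t-like statistic for a difference in means.
    """
    diff = np.asarray(candidate) - np.asarray(neighbour)
    n = diff.size
    best_k, best_score = None, -np.inf
    for k in range(min_seg, n - min_seg + 1):
        left, right = diff[:k], diff[k:]
        # Standard error of the difference between the two segment means
        se = np.sqrt(left.var(ddof=1) / left.size + right.var(ddof=1) / right.size)
        score = abs(right.mean() - left.mean()) / se
        if score > best_score:
            best_k, best_score = k, score
    shift = diff[best_k:].mean() - diff[:best_k].mean()
    return best_k, shift, best_score

# Synthetic example (all values are assumptions, not real station data)
rng = np.random.default_rng(0)
n_months = 240
regional = np.cumsum(rng.normal(0, 0.1, n_months))   # climate signal shared by both stations
neighbour = regional + rng.normal(0, 0.2, n_months)  # homogeneous neighbour
candidate = regional + rng.normal(0, 0.2, n_months)
candidate[150:] += 1.0                               # artificial step, e.g. a station move

k, shift, score = detect_step(candidate, neighbour)
print(f"estimated break at month {k}, estimated shift {shift:.2f}")
```

A sketch like this only handles one abrupt break and one clean neighbour. Small shifts, gradual changes, seasonal patterns and breaks shared across the network are exactly the cases where such a simple test struggles.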
Given that there is only one world, and we do not know precisely what changepoints have occurred, when and where, we cannot fully understand how well our homogenisation algorithms are performing. Therefore, we cannot fully understand the uncertainty in climate data products due to changes made to the observing system. (NOTE: we are confident in large-scale warming due to the wealth of independent evidence, but uncertainty remains in precisely how much warming there has been and in the smaller-scale details.) By creating synthetic climate data that look and feel like real climate station data, with the same distribution over space and time, we can add realistic inhomogeneities and then test how good these homogenisation algorithms really are. This is benchmarking for climate data. This serves three purposes:
The figure below shows an example of how the multiple error-worlds of the benchmarks may be used alongside the ISTI databank by data-product creators for these purposes.
These will be dealt with by three task teams: Team Creation, Team Corruption and Team Validation respectively.
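As a rough illustration of the benchmarking cycle described above, the sketch below creates a clean synthetic station, corrupts it with a known step change, runs a stand-in homogenisation routine, and scores the result against the known truth. Everything here is an assumption made for illustration: the function names, the single 1.0 degree break, the noise levels, and the very simple `naive_homogenise` routine, which is not a real homogenisation algorithm. The create, corrupt, validate split is only loosely modelled on the team names.

```python
import numpy as np

def create_clean(rng, regional, noise_sd=0.3):
    """Creation: a clean station is the shared regional signal plus local noise."""
    return regional + rng.normal(0.0, noise_sd, regional.size)

def corrupt(clean, changepoints):
    """Corruption: add known step changes, given as (index, shift) pairs."""
    corrupted = clean.copy()
    for index, shift in changepoints:
        corrupted[index:] += shift
    return corrupted

def naive_homogenise(series, reference, min_seg=12):
    """Stand-in algorithm under test: find one step in series - reference and remove it."""
    diff = series - reference
    n = diff.size

    def score(k):
        # Weighted difference in segment means; the weight penalises splits near the ends
        return np.sqrt(k * (n - k) / n) * abs(diff[k:].mean() - diff[:k].mean())

    k = max(range(min_seg, n - min_seg + 1), key=score)
    shift = diff[k:].mean() - diff[:k].mean()
    adjusted = series.copy()
    adjusted[k:] -= shift  # adjust the later segment so the result is comparable to the clean series
    return adjusted

def rmse(a, b):
    """Validation metric: root-mean-square error against the known truth."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

rng = np.random.default_rng(1)
t = np.arange(360)                                   # 30 years of monthly values
regional = 10.0 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.5, t.size)

clean = create_clean(rng, regional)
reference = create_clean(rng, regional)              # a homogeneous neighbour
corrupted = corrupt(clean, [(200, 1.0)])             # one known break at month 200
homogenised = naive_homogenise(corrupted, reference)

error_before = rmse(corrupted, clean)
error_after = rmse(homogenised, clean)
print(f"RMSE before homogenisation: {error_before:.3f}")
print(f"RMSE after homogenisation:  {error_after:.3f}")
```

Because the inserted break is known exactly, the error before and after homogenisation can be measured directly, which is not possible with real observations.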