SAMSI/IMAGe summer program resources page

This page contains links to a variety of papers and resources which may prove useful to participants in the SAMSI/IMAGe/ISTI summer program event held in July 2014. This meeting benefitted from substantial support from SAMSI and IMAGe.


Data resources

The basic data holdings have been collated as part of the ISTI databank activities and can be found at ftp://ftp.ncdc.noaa.gov/pub/data/globaldatabank/monthly/stage3/. Data are available in either ASCII or CF-compliant netCDF. A variety of software options exist for reading netCDF files, including Python, R and Java. The essential attributes of these data to be aware of are:
  • They are monthly resolution files
  • They are available as monthly means of max, min and average temperatures
  • Some have been quality controlled to remove gross outliers and a handful may have also had prior adjustments applied
  • Many are merged from disparate sets of holdings. Metadata on when a merge was undertaken may serve to raise the prior probability of a series break at that point.
Many further details on these data holdings are available on the databank pages. Note that currently we are in a beta 4 release (June 4th) but we will be going to a full first version release before the workshop. This will alter some records compared to beta 4 but formats and essential data and metadata characteristics will be unchanged.

There will exist some benchmark analogs to at least some subset of the holdings by the workshop commencement. There is a discussion paper on this at http://www.geosci-instrum-method-data-syst-discuss.net/4/235/2014/gid-4-235-2014.html. These benchmarks will exactly mimic the data holdings in terms of data formats and data completeness. We will provide links when available to these resources.

There also exist a set of US benchmarks that were used in a previous study. If available they will be linked here.
 
Documentation resources

For some existing datasets there exists a range of documentation which may prove useful. Homepages for the current global datasets of land surface air temperature are available at:
  • CRUTEM4  UEA and UK Met Office (note that this is a 100 member ensemble which may be useful in considering UQ aspects)

The COST HOME project applied benchmarking to a broad suite of algorithms. Its homepage is at http://www.homogenisation.org/v_02_15/

The NIST algorithm, as it stood in 2013, is described in two papers by Pintar and colleagues in the ITS9 conference proceedings. Paper 1 , Paper 2

Pairwise algorithm DEVELOPMENT breaks returned

Note that these are very much development-stage break locations and magnitudes, shared here for the convenience of participants. They should not be taken to imply anything about future products at this stage. Much work remains to be done.

PHAv42i.FAST.MLY.GHCN4.tmax.rc6.conshf.zip
PHAv42i.FAST.MLY.GHCN4.tmin.rc6.conshf.zip
PHAv42i.FAST.MLY.GHCN4.tavg.rc6.conshf.zip
PHAv42i.FAST.MLY.GHCN4.tdtr.rc6.conshf.zip
And the combined elements, sorted by station and change date:
PHAv42i.FAST.MLY.GHCN4.all.rc6.conshf.zip

Relevant Fields:
1 - the size of the adjustment
2 - element
4 - station ID
6 & 7 - earlier segment begin/end date (end date == changepoint date)
9 & 10 - later segment begin/end date
16 - (next to the last) number of stations used to determine adjustment
NOTE: a value > 100 indicates an adjustment based upon a station history change date plus pairwise inhomogeneity detection. A value == 100 indicates an adjustment found only by checking the station history date (no PHA detection).
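As a rough illustration of how the fields listed above might be pulled out of one record, here is a minimal Python sketch. It assumes that records are whitespace-delimited, that the field numbers above are 1-indexed, and that the NOTE refers to field 16; none of this is confirmed on this page, so check against the actual files. The names BreakRecord, parse_break_line and detection_source are hypothetical, dates are kept as raw strings because their format is not described here, and the sample line is made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BreakRecord:
    adjustment: float   # field 1: size of the adjustment
    element: str        # field 2: element
    station_id: str     # field 4: station ID
    earlier_begin: str  # field 6: earlier segment begin date
    earlier_end: str    # field 7: earlier segment end date (changepoint date)
    later_begin: str    # field 9: later segment begin date
    later_end: str      # field 10: later segment end date
    count_code: int     # field 16: number of stations used (see NOTE)


def parse_break_line(line: str) -> Optional[BreakRecord]:
    """Parse one record, ASSUMING whitespace-delimited, 1-indexed fields."""
    fields = line.split()
    if len(fields) < 16:
        return None  # too short to hold field 16
    try:
        return BreakRecord(
            adjustment=float(fields[0]),
            element=fields[1],
            station_id=fields[3],
            earlier_begin=fields[5],
            earlier_end=fields[6],
            later_begin=fields[8],
            later_end=fields[9],
            count_code=int(fields[15]),
        )
    except ValueError:
        return None  # non-numeric where a number was expected


def detection_source(rec: BreakRecord) -> str:
    """Classify a record following the NOTE (assumed to apply to field 16)."""
    if rec.count_code > 100:
        return "history+pairwise"
    if rec.count_code == 100:
        return "history-only"
    return "unspecified"  # values below 100 are not described on this page


# Made-up sample line for illustration only; "x" marks fields not listed above.
sample = "-0.45 tmax x USW00024157 x 195001 197512 x 197601 201312 x x x x x 112 x"
record = parse_break_line(sample)
if record is not None:
    print(record.station_id, record.earlier_end, detection_source(record))
```

If the real files turn out to be fixed-width or comma-separated, only the splitting step in parse_break_line would need to change.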

Case Studies

The following case studies each use a target station with relatively good data over its period of record, together with a list of its neighbors within 500 km.

link: http://tinyurl.com/samsi-workshop-cases  (The zip file is also provided at the bottom of the page)

Each file is one station (ID is the name of the text file). At the top is the metadata of the target station, and below is the metadata for all the neighbors. The format of the metadata follows the ISTI Metadata format that can be found here. For the neighboring stations, the distance from the target is also shown near the end of the line.

The current stations that have been chosen are as follows:

Canada
CA002400600 CAMBRIDGE_BAY_ARPT (24 neighbors)
CA003031093 CALGARY_INTL (1,352 neighbors)
CA005063075 WALKER_LAKE (164 neighbors)

United States
USW00024157 SPOKANE_INTL_AP (2,034 neighbors)
USW00023185 RENO_TAHOE_INTL_AP (1,312 neighbors)

Europe
SZ000001940 BASEL_BINNINGEN (365 neighbors)

Madagascar
MA000067083 ANTANANARIVO/IVATO (7 neighbors)

Seth has reformatted the data in a way that may be easier for some users. The files are located at:
http://www.narccap.ucar.edu/temp/samsi/

Each zipfile contains:

network.[region]: the original file listing all stations
within a given distance (500 km) of the target station.

station.data: id, lat, lon, elevation, and first and last timesteps for
each station in the region

timeseries: data files for each station in the region.  The data files
are named with the station ID, and each line in the file has year,
month, tmin, tmax, and tmean.  -99.99 indicates missing.

synoptic: data files for each timestep over the entire region.  The data
files are named by date (YYYYMM), and each line in the file has station
ID, tmin, tmax, and tmean.  -99.99 indicates missing.
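
The timeseries and synoptic layouts described above could be read along the following lines. This is a minimal Python sketch that assumes the columns are whitespace-delimited (the delimiter is not stated here) and converts the -99.99 missing flag to NaN. The function names are hypothetical and the sample values are made up.

```python
import math
from typing import Iterable, List, Tuple

MISSING = -99.99  # missing-value flag given in the file description


def _clean(value: str) -> float:
    """Convert a text value to float, mapping the missing flag to NaN."""
    x = float(value)
    return math.nan if math.isclose(x, MISSING, abs_tol=1e-6) else x


def parse_timeseries(lines: Iterable[str]) -> List[Tuple[int, int, float, float, float]]:
    """Per-station file: each line has year, month, tmin, tmax, tmean."""
    rows = []
    for line in lines:
        parts = line.split()  # ASSUMPTION: whitespace-delimited columns
        if len(parts) < 5:
            continue  # skip blank or short lines
        year, month = int(parts[0]), int(parts[1])
        tmin, tmax, tmean = (_clean(p) for p in parts[2:5])
        rows.append((year, month, tmin, tmax, tmean))
    return rows


def parse_synoptic(lines: Iterable[str]) -> List[Tuple[str, float, float, float]]:
    """Per-timestep file: each line has station ID, tmin, tmax, tmean."""
    rows = []
    for line in lines:
        parts = line.split()  # ASSUMPTION: whitespace-delimited columns
        if len(parts) < 4:
            continue
        tmin, tmax, tmean = (_clean(p) for p in parts[1:4])
        rows.append((parts[0], tmin, tmax, tmean))
    return rows


# Made-up sample lines for illustration only.
series = parse_timeseries(["1950 1 -12.30 -2.10 -7.20", "1950 2 -99.99 1.00 -99.99"])
snapshot = parse_synoptic(["USW00024157 -99.99 3.50 1.00"])
print(len(series), len(snapshot))
```

Mapping the flag to NaN keeps missing months from leaking into means or differences as if -99.99 were a real temperature.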

Code resources

Climate dataset algorithms
  • CLIMAHOM / AnClim (ProClimDB) homepage (code appears to be DOS based)
See also http://www.meteobal.com/climatol/DARE/#Homogenization_packages

Statistical algorithms library

Relevant papers repository (with links to OA versions where available)

Review papers

Exposure, instrumentation, and observing practice effects on land temperature measurements, Blair Trewin, WIREs Climate Change, 2010, 1, DOI: 10.1002/wcc.46. pdf

Monthly correction


http://www.met.hu/en/omsz/kiadvanyok/idojaras/index.php?id=82 contains links to several methods papers produced as part of COST HOME, including:
HOMER : a homogenization software – methods and applications
Olivier Mestre, Peter Domonkos, Franck Picard, Ingeborg Auer, Stéphane Robin, Emilie Lebarbier, Reinhard Böhm, Enric Aguilar, Jose Guijarro, Gregor Vertachnik, Matija Klancar, Brigitte Dubuisson, and Petr Stepanek

Domonkos, P. 2011: Adapted Caussinus-Mestre Algorithm for Networks of Temperature series (ACMANT). Int. J. Geosci, 2, 293-309, doi: 10.4236/ijg.2011.23032.

Caussinus, H. and Mestre, O.: Detection and correction of artificial shifts in climate series. Appl. Statist., 53, part 3, 405-425, DOI: 10.1111/j.1467-9876.2004.05155.x, 2004.

www.wmo.int/pages/prog/wcp/wcdmp/documents/WCDMP71.pdf
Szentimrey, T.: Development of MASH homogenization procedure for daily data. Proceedings of the fifth seminar for homogenization and quality control in climatological databases. Budapest, Hungary, 2006; WCDMP-No. 71, 123-130, 2008.

Menne, M. J., Williams, C. N. jr., and Vose, R. S.: The U.S. historical climatology network monthly temperature data, version 2. Bull. Am. Meteorol. Soc., 90, no.7, 993-1007, doi: 10.1175/2008BAMS2613.1, 2009.

Daily correction

http://onlinelibrary.wiley.com/doi/10.1002/joc.3530/abstract
Blair Trewin. A daily homogenized temperature data set for Australia. pdf
See also related technical document freely available at http://cawcr.gov.au/publications/technicalreports/CTR_049.pdf

Mestre, Olivier, Christine Gruber, Clémentine Prieur, Henri Caussinus, Sylvie Jourdain, 2011: SPLIDHOM: A Method for Homogenization of Daily Temperature Observations. J. Appl. Meteor. Climatol., 50, 2343–2358.
doi: http://dx.doi.org/10.1175/2011JAMC2641.1

Benchmarking

Venema et al. 2012 discusses benchmarking results for a range of algorithms: OA at http://www.clim-past.net/8/89/2012/cp-8-89-2012.html

Williams et al., 2012 discusses results of applying the US benchmarks to USHCN: Available at ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v2/monthly/algorithm-uncertainty/williams-menne-thorne-2012.pdf

Uncertainty estimation

Matthews et al., 2012 discusses outcomes of a SAMSI sponsored workshop on uncertainty quantification in climate data record construction. OA at http://journals.ametsoc.org/doi/pdf/10.1175/BAMS-D-12-00042.1

Morice et al., 2012 outlines the uncertainty quantification for HadCRUT4. Available at http://hadobs.metoffice.com/hadcrut4/HadCRUT4_accepted.pdf

There are also examples for other datasets that are not directly land surface air temperatures, but the principles may be transferable:

Mears et al., 2011 discuss uncertainty estimates for upper air temperatures from the satellite Microwave Sounding Unit. Available at http://images.remss.com/papers/rsspubs/Mears_JGR_2011_MSU_AMSU_Uncertainty.pdf . See also http://www.remss.com/measurements/upper-air-temperature#Uncertainty

Thorne et al., 2011 discuss uncertainty estimation, benchmarking and conditional probability recombination of estimation for radiosonde temperatures.

Kennedy et al., 2011 discuss in depth uncertainties in SST measurements. See http://hadobs.metoffice.com/hadsst3/part_1_figinline.pdf and http://hadobs.metoffice.com/hadsst3/part_2_figinline.pdf

Attachments

  • PHAv42i.FAST.MLY.GHCN4.all.rc6.conshf.zip (4850k), Peter Thorne, Jul 10, 2014
  • PHAv52i.FAST.MLY.GHCN4.tavg.rc6.conshf.zip (1943k), Peter Thorne, Jul 10, 2014
  • PHAv52i.FAST.MLY.GHCN4.tdtr.rc6.conshf.zip (2150k), Peter Thorne, Jul 10, 2014
  • PHAv52i.FAST.MLY.GHCN4.tmax.rc6.conshf.zip (1705k), Peter Thorne, Jul 10, 2014
  • PHAv52i.FAST.MLY.GHCN4.tmin.rc6.conshf.zip (2010k), Peter Thorne, Jul 10, 2014
  • RHtestsV4_20140528.r (324k), Peter Thorne, Jun 26, 2014
  • RHtests_dlyPrcp_20140213.r (163k), Peter Thorne, Jun 26, 2014
  • samsi-workshop-cases.zip (199k), Peter Thorne, Jul 11, 2014