This page contains links to a variety of papers and resources which may prove useful to participants in the SAMSI/IMAGe/ISTI summer program event held in July 2014. This meeting benefitted from substantial support from SAMSI and IMAGe. The basic data holdings have been collated as part of the ISTI databank activities and can be found at ftp://ftp.ncdc.noaa.gov/pub/data/globaldatabank/monthly/stage3/. Data are available in either ASCII or CF-compliant netCDF format. A variety of software options exist for reading netCDF files, including Python, R and Java. The essential attributes of these data to be aware of are:
Many further details on these data holdings are available on the databank pages. Note that we are currently in a beta 4 release (June 4th), but we will move to a full first version release before the workshop. This will alter some records compared to beta 4, but formats and essential data and metadata characteristics will be unchanged. Benchmark analogs to at least a subset of the holdings will exist by the start of the workshop. There is a discussion paper on this at http://www.geosci-instrum-method-data-syst-discuss.net/4/235/2014/gid-4-235-2014.html. These benchmarks will exactly mimic the data holdings in terms of data formats and data completeness. We will provide links to these resources when available. There also exists a set of US benchmarks that were used in a previous study. If available, they will be linked here.
Pairwise algorithm DEVELOPMENT breaks returned

Note that these are very much development-stage break locations and magnitudes, shared here for the convenience of participants. They should not be taken to imply anything about future products at this stage. Much work remains to be done.

PHAv42i.FAST.MLY.GHCN4.tmax.rc6.conshf.zip
PHAv42i.FAST.MLY.GHCN4.tmin.rc6.conshf.zip
PHAv42i.FAST.MLY.GHCN4.tavg.rc6.conshf.zip
PHAv42i.FAST.MLY.GHCN4.tdtr.rc6.conshf.zip

and the combined elements, sorted by station and change date:

PHAv42i.FAST.MLY.GHCN4.all.rc6.conshf.zip

Relevant fields:
1 - the size of the adjustment
2 - element
4 - station ID
6 & 7 - earlier segment begin/end date (end date == changepoint date)
9 & 10 - later segment begin/end date
16 - (next to the last) number of stations used to determine adjustment

NOTE: if > 100, this indicates an adjustment based upon station history change date plus pairwise inhomogeneity detection; if == 100, it indicates an adjustment found only by checking the station history date (no PHA detection).

Case Studies

The following are case studies, each using a target station with relatively good data in its period of record and a list of its neighbors within 500 km.

Link: http://tinyurl.com/samsi-workshop-cases (the zip file is also provided at the bottom of the page)

Each file is one station (the ID is the name of the text file). At the top is the metadata of the target station, and below it is the metadata for all the neighbors. The metadata follows the ISTI metadata format, which can be found here. For the neighboring stations, the distance from the target is also shown near the end of the line.
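As an aside on the pairwise break files listed earlier in this section, the following is a minimal Python sketch of how the relevant fields might be pulled out of one record. It rests on several assumptions that this page does not confirm: that each record is a single whitespace-delimited line, that the field numbers above are 1-based, and that the NOTE about values of 100 and above refers to field 16. The names `BreakRecord`, `parse_break_line` and `detection_source` are hypothetical, and the sample line is synthetic rather than taken from the real files. Dates are kept as strings because their format is not described here.

```python
from typing import NamedTuple, Optional


class BreakRecord(NamedTuple):
    adjustment: float    # field 1: size of the adjustment
    element: str         # field 2: element
    station_id: str      # field 4: station ID
    earlier_begin: str   # field 6: earlier segment begin date
    earlier_end: str     # field 7: earlier segment end date (the changepoint date)
    later_begin: str     # field 9: later segment begin date
    later_end: str       # field 10: later segment end date
    n_stations: int      # field 16: number of stations used for the adjustment


def parse_break_line(line: str) -> Optional[BreakRecord]:
    """Parse one record, ASSUMING whitespace-delimited, 1-based fields."""
    fields = line.split()
    if len(fields) < 16:
        return None  # too short to hold field 16
    return BreakRecord(
        adjustment=float(fields[0]),
        element=fields[1],
        station_id=fields[3],
        earlier_begin=fields[5],
        earlier_end=fields[6],
        later_begin=fields[8],
        later_end=fields[9],
        n_stations=int(fields[15]),
    )


def detection_source(n_stations: int) -> str:
    """Apply the NOTE above to field 16 (an assumed reading of the page)."""
    if n_stations > 100:
        return "station history + pairwise detection"
    if n_stations == 100:
        return "station history only"
    return "not specified on this page"


# SYNTHETIC example line (not real data); "x" marks fields not described here.
SAMPLE_LINE = (
    "-0.35 TMAX x USW00024157 x 195001 197306 x 197307 201312 "
    "x x x x x 104 x"
)

record = parse_break_line(SAMPLE_LINE)
if record is not None:
    print(record.station_id, record.adjustment, detection_source(record.n_stations))
```

If the actual files turn out to be fixed-width or to use another delimiter, only the splitting step in `parse_break_line` would need to change.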
The current stations that have been chosen are as follows:

Canada
CA002400600 CAMBRIDGE_BAY_ARPT (24 neighbors)
CA003031093 CALGARY_INTL (1,352 neighbors)
CA005063075 WALKER_LAKE (164 neighbors)

United States
USW00024157 SPOKANE_INTL_AP (2,034 neighbors)
USW00023185 RENO_TAHOE_INTL_AP (1,312 neighbors)

Europe
SZ000001940 BASEL_BINNINGEN (365 neighbors)

Madagascar
MA000067083 ANTANANARIVO/IVATO (7 neighbors)

Seth has reformatted the data into a layout that may be easier for some users. It is located at http://www.narccap.ucar.edu/temp/samsi/

Each zipfile contains:
network.[region]: the original file listing all stations within a given distance (500 km) of the target station.
station.data: id, lat, lon, elevation, and first and last timesteps for each station in the region.
timeseries: data files for each station in the region. The data files are named with the station ID, and each line in the file has year, month, tmin, tmax, and tmean. -99.99 indicates missing.
synoptic: data files for each timestep over the entire region. The data files are named by date (YYYYMM), and each line in the file has station ID, tmin, tmax, and tmean. -99.99 indicates missing.

Code resources
Climate dataset algorithms
Statistical algorithms library

Relevant papers repository (with links to OA versions where available)

Review papers

Exposure, instrumentation, and observing practice effects on land temperature measurements, Blair Trewin, WIREs Climate Change, 2010, 1, DOI: 10.1002/wcc.46.

Monthly correction

http://www.met.hu/en/omsz/kiadvanyok/idojaras/index.php?id=82 contains links to several methods papers as part of COST HOME, including:

HOMER: a homogenization software – methods and applications. Olivier Mestre, Peter Domonkos, Franck Picard, Ingeborg Auer, Stéphane Robin, Emilie Lebarbier, Reinhard Böhm, Enric Aguilar, Jose Guijarro, Gregor Vertachnik, Matija Klancar, Brigitte Dubuisson, and Petr Stepanek.

Domonkos, P. 2011: Adapted Caussinus-Mestre Algorithm for Networks of Temperature series (ACMANT). Int. J. Geosci, 2, 293-309, doi: 10.4236/ijg.2011.23032.

Caussinus, H. and Mestre, O.: Detection and correction of artificial shifts in climate series. Appl. Statist., 53, part 3, 405-425, DOI: 10.1111/j.1467-9876.2004.05155.x, 2004.

Szentimrey, T.: Development of MASH homogenization procedure for daily data. Proceedings of the fifth seminar for homogenization and quality control in climatological databases. Budapest, Hungary, 2006; WCDMP-No. 71, 123-130, 2008. Available at www.wmo.int/pages/prog/wcp/wcdmp/documents/WCDMP71.pdf

Menne, M. J., Williams, C. N. jr., and Vose, R. S.: The U.S. historical climatology network monthly temperature data, version 2. Bull. Am. Meteorol. Soc., 90, no.7, 993-1007, doi: 10.1175/2008BAMS2613.1, 2009.

Daily correction

Blair Trewin. A daily homogenized temperature data set for Australia. Available at http://onlinelibrary.wiley.com/doi/10.1002/joc.3530/abstract. See also the related technical document freely available at http://cawcr.gov.au/publications/technicalreports/CTR_049.pdf

Mestre, Olivier, Christine Gruber, Clémentine Prieur, Henri Caussinus, Sylvie Jourdain, 2011: SPLIDHOM: A Method for Homogenization of Daily Temperature Observations. J. Appl. Meteor. Climatol., 50, 2343–2358. doi: http://dx.doi.org/10.1175/2011JAMC2641.1

Benchmarking

Venema et al., 2012 discusses benchmarking results for a range of algorithms. OA at http://www.clim-past.net/8/89/2012/cp-8-89-2012.html

Williams et al., 2012 discusses results of applying the US benchmarks to USHCN. Available at ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v2/monthly/algorithm-uncertainty/williams-menne-thorne-2012.pdf

Uncertainty estimation

Matthews et al., 2012 discusses outcomes of a SAMSI-sponsored workshop on uncertainty quantification in climate data record construction. OA at http://journals.ametsoc.org/doi/pdf/10.1175/BAMS-D-12-00042.1

Morice et al., 2012 outlines the uncertainty quantification for HadCRUT4. Available at http://hadobs.metoffice.com/hadcrut4/HadCRUT4_accepted.pdf

There are also examples for other datasets that are not directly land surface air temperatures, but whose principles may be transferable:

Mears et al., 2011 discusses uncertainty estimates for upper air temperatures from the satellite Microwave Sounding Unit. Available at http://images.remss.com/papers/rsspubs/Mears_JGR_2011_MSU_AMSU_Uncertainty.pdf . See also http://www.remss.com/measurements/upper-air-temperature#Uncertainty

Thorne et al., 2011 discusses uncertainty estimation, benchmarking and conditional probability recombination of estimation for radiosonde temperatures.

Kennedy et al., 2011 discusses in depth the uncertainties in SST measurements. See http://hadobs.metoffice.com/hadsst3/part_1_figinline.pdf and http://hadobs.metoffice.com/hadsst3/part_2_figinline.pdf
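Reading the reformatted case-study files

For participants working with the reformatted case-study zipfiles described earlier on this page, the following is a minimal Python sketch of reading one timeseries file, where each line holds year, month, tmin, tmax and tmean, and -99.99 marks a missing value. It assumes, without confirmation from this page, that the columns are whitespace-separated and that there is no header line; lines that do not parse as five numeric columns are simply skipped. The names `MonthlyRecord` and `parse_timeseries` are hypothetical, and the sample text is synthetic.

```python
import math
from typing import List, NamedTuple

MISSING = -99.99  # missing-value flag stated on this page


class MonthlyRecord(NamedTuple):
    year: int
    month: int
    tmin: float
    tmax: float
    tmean: float


def _to_value(token: str) -> float:
    """Convert a token to float, mapping the missing flag to NaN."""
    value = float(token)
    return math.nan if math.isclose(value, MISSING) else value


def parse_timeseries(text: str) -> List[MonthlyRecord]:
    """Parse lines of 'year month tmin tmax tmean'.

    ASSUMES whitespace-separated columns; unparseable lines are skipped.
    """
    records = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # blank line or unexpected layout
        try:
            records.append(
                MonthlyRecord(
                    year=int(parts[0]),
                    month=int(parts[1]),
                    tmin=_to_value(parts[2]),
                    tmax=_to_value(parts[3]),
                    tmean=_to_value(parts[4]),
                )
            )
        except ValueError:
            continue  # non-numeric content, e.g. a header line
    return records


# SYNTHETIC sample text (not real station data).
SAMPLE_TEXT = """\
1990 1 -5.20 3.10 -1.05
1990 2 -99.99 4.00 -99.99
"""

for rec in parse_timeseries(SAMPLE_TEXT):
    print(rec)
```

Mapping -99.99 to NaN keeps missing months from silently entering averages; a caller who prefers to drop incomplete months could filter on `math.isnan` instead.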