The ensemble of daily predictor variables developed from the CanESM2 CMIP5 experiments
The main goal of this project was to develop an ensemble of daily predictors from output variables of the CanESM2 experiments prepared for the Coupled Model Intercomparison Project Phase 5 (CMIP5). The ensemble includes the 25 basic predictor variables as well as the daily total precipitation amount (see below). All predictors are defined on the global Gaussian reduced grid associated with spectral truncation T42, which consists of 128x64 grid cells (longitude x latitude). For model calibration and validation, 26 predictors with the same characteristics as those based on the CanESM2 output were developed from the NCEP/NCAR reanalysis.
A secondary objective was to design a fast and cost-effective approach to develop the predictors using global time series of data as stored in NetCDF format. A short description of the data sources is available in section 1. The methodology of the development and organization of predictor files is available in section 2.
1. Description of input data
1.1 CanESM2 global climate model
CanESM2, the second generation of the Earth System Model (von Salzen et al., 2005; Li and Barker, 2005; Barker et al., 2008), corresponds to the fourth generation of the coupled global climate model (i.e. CGCM4) developed by the Canadian Centre for Climate Modelling and Analysis (CCCma) of Environment and Climate Change Canada. CanESM2 is part of the Canadian modelling community's contribution to the IPCC Fifth Assessment Report (AR5). The main components of the Earth System Model are: (i) the Atmospheric General Circulation Model (AGCM4), with triangular truncation resolution T63 and a hybrid vertical domain of 35 layers; (ii) the Ocean GCM4, developed from the NCAR CSM Ocean Model, with 256x192 horizontal resolution and 40 vertical layers; (iii) the CanSim1 sea-ice model; and (iv) the Canadian Land Surface Scheme (CLASS2.7) and CTEM1 for land processes. For more details on CanESM2 and its components, please visit CCCma’s webpage.
To develop the suggested set of predictors, the atmospheric variables (see section 2.1) from the first member run (r1i1p1) were selected. Past climate conditions over the 1961-2005 period are represented by the historical simulation. Projections of future climate change over the 2006-2100 period use the new scenarios developed for the IPCC AR5. The emissions, concentrations, and land-cover change projections are described by the Representative Concentration Pathways scenarios RCP2.6, RCP4.5 and RCP8.5 (Moss et al., 2010; Meinshausen et al., 2011).
Since CanESM2 is a member of the CMIP5 project, atmospheric variables from the aforementioned runs are available on the T42 grid (see Figure 1) instead of the T63 grid. The spatial interpolation from the T63 to the T42 grid was performed by CCCma, which made the results available online. Because daily GCM data are required for the construction of predictors, the upper-level fields were taken as daily means over four time steps. The near-surface fields were already defined as daily variables. For access to the atmospheric fields simulated by CanESM2 AGCM4, please visit CCCma's page on climate model graphics.
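The daily averaging of upper-level fields can be illustrated with a short Python sketch. It is not the code used in the project (which relied on NCL and bash); the function name `daily_mean_from_substeps` and the assumption that the four time steps of each day are stored consecutively along the first axis are hypothetical.

```python
import numpy as np

def daily_mean_from_substeps(field, steps_per_day=4):
    """Average consecutive groups of sub-daily time steps into daily means.

    field: array of shape (ntime, ...) where ntime is a multiple of
    steps_per_day and the steps of one day are stored consecutively
    (an assumption of this sketch).
    """
    field = np.asarray(field, dtype=np.float64)
    ntime = field.shape[0]
    if ntime % steps_per_day != 0:
        raise ValueError("number of time steps is not a multiple of steps_per_day")
    # Reshape to (nday, steps_per_day, ...) and average over the sub-daily axis.
    grouped = field.reshape(ntime // steps_per_day, steps_per_day, *field.shape[1:])
    return grouped.mean(axis=1)

# Two days of synthetic data on a single grid cell.
sub_daily = np.arange(8, dtype=np.float64).reshape(8, 1, 1)
daily = daily_mean_from_substeps(sub_daily)
```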
1.2 NCEP/NCAR Reanalysis 1
The NCEP/NCAR Reanalysis Project 1 (Kalnay et al., 1996) is the result of a collaboration between the National Centers for Environmental Prediction (NCEP) and the National Center for Atmospheric Research (NCAR). These global atmospheric reanalysis datasets are based on a state-of-the-art assimilation/forecast system applied to high-density historical observations (from 1948 to present).
For consistency with the raw atmospheric variables available from the CanESM2 outputs, the same variables were selected from NCEP/NCAR. The historical reference time frame ranges from 1961 to 2005. The variables defined at pressure levels, as well as mean sea level pressure, were available on a global regular latitude-longitude grid of 2.5 degrees. However, 2 m air temperature and total precipitation were available at a finer spatial resolution, on the global Gaussian T62 grid (94 latitudes x 192 longitudes). The corresponding latitudes are ordered north to south, the inverse of the CanESM2 output grid.
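As an illustration of the latitude ordering issue, the following Python sketch reorders a field stored north to south so that it runs south to north. The function name `to_south_north` and the array layout (latitude as the second-to-last axis, longitude as the last) are assumptions made for this example, not details taken from the project code.

```python
import numpy as np

def to_south_north(lat, field):
    """Return (lat, field) with latitudes increasing from south to north.

    lat:   1-D array of latitudes in degrees.
    field: array whose last two axes are (latitude, longitude); this
           layout is an assumption of the sketch.
    """
    lat = np.asarray(lat)
    field = np.asarray(field)
    if lat[0] > lat[-1]:
        # Stored north to south: reverse both the coordinate and the data.
        return lat[::-1], field[..., ::-1, :]
    return lat, field

lat_ns = np.array([90.0, 0.0, -90.0])
field_ns = np.arange(6).reshape(3, 2)
lat_sn, field_sn = to_south_north(lat_ns, field_ns)
```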
Visit NCEP/NCAR to access the NCEP/NCAR Reanalysis project 1 results and available variables.
2.1 Raw and derived variables
Global daily time series of the variables described as raw data in Table 1 are provided from both data sources (CanESM2 and NCEP/NCAR) in NetCDF format. Each dataset first underwent only minimal preprocessing to ensure a consistent structure of the input files during standardization and extraction to grid boxes. The set of scripts created for data manipulation and selection of predictors was executed on a Linux system in the Bourne Again Shell (bash) environment. Data interpolation and predictor selection are based on the NCAR Command Language (NCL) version 6.1.2. NCL is an interpreted language developed by NCAR and designed specifically for scientific data analysis and visualization.
For the CanESM2 outputs, preprocessing consists of:
- Splitting multi-year files into yearly files;
- Selecting upper-air variables from desired pressure levels (1000, 850 and 500 hPa);
- Adjusting values in time arrays so that time monotonically increases following Julian days as needed for later selection of gridded data per grid box (section 2.2);
- Converting variables to double precision, and Kelvin to degrees Celsius (for temperature).
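Two of the steps listed above, level selection and unit conversion, can be sketched in Python. This is only an illustration under stated assumptions: the project itself used NCL and bash, the function names are hypothetical, the level coordinate is assumed to be expressed in hPa, and the field is assumed to have shape (time, level, latitude, longitude).

```python
import numpy as np

TARGET_LEVELS_HPA = (1000.0, 850.0, 500.0)

def select_levels(field, levels_hpa, targets=TARGET_LEVELS_HPA):
    """Keep only the target pressure levels of a (time, level, lat, lon) field."""
    levels_hpa = np.asarray(levels_hpa, dtype=np.float64)
    # Index of each target level; raises IndexError if a level is missing.
    idx = [int(np.flatnonzero(np.isclose(levels_hpa, t))[0]) for t in targets]
    return np.asarray(field)[:, idx, ...]

def kelvin_to_celsius(temp_k):
    """Convert temperature to double precision and from Kelvin to degrees Celsius."""
    return np.asarray(temp_k, dtype=np.float64) - 273.15

levels = [1000, 925, 850, 700, 500]
field = np.zeros((2, 5, 1, 1), dtype=np.float32)
for k in range(5):
    field[:, k] = k  # tag each level with its index
selected = select_levels(field, levels)
temp_c = kelvin_to_celsius(np.array([300.0], dtype=np.float32))
```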
The NCEP/NCAR raw variables were also available in NetCDF format. Preprocessing of reanalysis raw data includes spatial interpolation to the global Gaussian grid T42 of CanESM2 after level selection for the upper-air variables. For that purpose, the NCL’s built-in function f2gsh_Wrap was applied. The function is based on spherical harmonics and interpolates a scalar field from a fixed (i.e. regular latitude-longitude grid) to a Gaussian grid. Spatial coverage of NCEP/NCAR data is considered as fixed or regular grid because the distance between two consecutive grid points is the same in the latitude/longitude direction.
The second part of predictor development consists of computing the air flow variables from the zonal and meridional wind components defined on the T42 Gaussian grid. The calculation relies on the NCL built-in functions identified in Table 1. Divergence and relative vorticity are based on the true wind, since this variable is available from both data sources.
Table 1: Basic description of predictor variables derived from raw data of CanESM2 and NCEP/NCAR.
|Variable||Units||Level||Source or NCL function|
|Mean sea level pressure||Pa||Mean sea level||Raw|
|Specific humidity||kg/kg||Pressure levels||Raw|
|Geopotential height||m||Pressure levels||Raw|
|Zonal wind||m/s||Pressure levels||Raw|
|Meridional wind||m/s||Pressure levels||Raw|
|Wind speed (1)||m/s||Pressure levels||-|
|Wind direction (1, 2)||0-360°||Pressure levels||wind_direction|
|Relative vorticity (1)||s^-1||Pressure levels||uv2vrG_Wrap|
(1) Derived using an NCL function. (2) Wind direction (calculated from the U and V components) in degrees corresponds to: 0° pointing north, 90° pointing east, 180° pointing south and 270° pointing west.
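The derivation of wind speed and wind direction from the zonal (U) and meridional (V) components can be sketched in Python as follows. This is not the NCL code used in the project. The direction convention used here (the direction the wind blows from, measured clockwise from north) is an assumption; the exact behaviour of the NCL `wind_direction` function should be checked against the NCL documentation. The function names are hypothetical.

```python
import numpy as np

def wind_speed(u, v):
    """Wind speed (m/s) from zonal (u) and meridional (v) components."""
    return np.hypot(u, v)

def wind_direction_from(u, v):
    """Direction the wind blows from, in degrees within [0, 360).

    Assumed convention: 0 = from north, 90 = from east, 180 = from south,
    270 = from west. For calm wind (u = v = 0) the direction is undefined;
    this sketch returns 180 in that case.
    """
    u = np.asarray(u, dtype=np.float64)
    v = np.asarray(v, dtype=np.float64)
    return (np.degrees(np.arctan2(u, v)) + 180.0) % 360.0

u = np.array([3.0, -10.0, 10.0])
v = np.array([4.0, 0.0, 0.0])
speed = wind_speed(u, v)
direction = wind_direction_from(u, v)
```

If the direction is instead meant as the direction toward which the wind blows, the 180° offset would be dropped.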
2.2 Characteristics of grid-box directories
The 26 variables are first standardized against the corresponding historical reference period for each data source (CanESM2 predictors with respect to the CanESM2 baseline, NCEP/NCAR predictors with respect to the NCEP/NCAR baseline) and then organized per grid box into one-column text files, one per data source and per variable. Standardization of the global daily time series is based on the long-term climate mean and standard deviation over the 1971 to 2000 reference period. Except for wind direction, all predictor values (x), for both the historical and future periods, have been normalised (n) with respect to the mean (µ) and standard deviation (σ) of the 1971-2000 reference period using the following expression:

$$n = \frac{x - \mu}{\sigma}$$
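The normalisation described above can be sketched in Python. This is a minimal illustration, not the project's NCL code: the function name `standardize` is hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not say which one was used.

```python
import numpy as np

def standardize(values, years, ref_start=1971, ref_end=2000):
    """Normalise a daily series with the mean and standard deviation
    of the reference period.

    values: 1-D array of daily predictor values (x).
    years:  1-D array giving the calendar year of each value.
    """
    values = np.asarray(values, dtype=np.float64)
    years = np.asarray(years)
    in_ref = (years >= ref_start) & (years <= ref_end)
    mu = values[in_ref].mean()
    # Population standard deviation (ddof=0) is an assumption of this sketch.
    sigma = values[in_ref].std()
    return (values - mu) / sigma

# Tiny synthetic series: only the 1971 and 1972 values fall in the reference period.
years = np.array([1970, 1971, 1972, 2001])
x = np.array([10.0, 2.0, 4.0, 6.0])
n = standardize(x, years)
```

Values outside the reference period (historical or future) are normalised with the same reference mean and standard deviation, as described in the text.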
The names of the five directories containing the data of each grid box for the baseline and future periods from both NCEP/NCAR and CanESM2 are listed in Table 2. Note that the RCP scenario of each CanESM2 run is stated before the time window (i.e. rcp26 for the RCP2.6 scenario, rcp45 for the RCP4.5 scenario, and rcp85 for the RCP8.5 scenario).
2.2.1 Structure of predictor files
Long-term time series of standardized daily values (and wind direction) are then extracted into a one-column text file per grid cell (box). The 128x64 grid cells cover the global domain according to the T42 Gaussian grid. This grid is uniform in longitude, with a horizontal resolution of 2.8125°, and nearly uniform in latitude, with a spacing of roughly 2.8125° (see Table 3). The predictors associated with each grid cell are stored in a folder named BOX_iiiX_jjY, where iii = 1 to 128 is the longitudinal index and jj = 1 to 64 is the latitudinal index. The structure of these folders is described in Table 2. Predictors organized in this way are ready for use as input to statistical downscaling models such as ASD or SDSM.
Table 2: Structure of a BOX_iiiX_jjY folder: two auxiliary text files and five sub-folders, each containing the 26 predictor files from the selected data source.
|Two auxiliary files providing details on a given grid cell (see Figure 1)|
|gauss42_sftlf.txt||Land-area fraction [%]||0 % if oceanic and Great Lakes grid points; 100% if land and small inland|
|gauss42_orog.txt||Orography [m]||Surface altitude|
|Five sub-folders for data type||Time frame||4-character prefix|
|NCEP-NCAR_1961_2005||1961 to 2005||ncep|
|CanESM2_historical_1961_2005||1961 to 2005||cesh|
|CanESM2_rcp26_2006_2100||2006 to 2100||ces2|
|CanESM2_rcp45_2006_2100||2006 to 2100||ces4|
|CanESM2_rcp85_2006_2100||2006 to 2100||ces8|
Each sub-folder contains the 26 predictor files. A filename consists of 10 characters followed by the extension .dat. The list of filenames is given in Table 4. P* stands for the 4-character prefix given in Table 2, the predictor name is identified by the 5th to 8th characters, and gl stands for global grid.
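The folder and filename conventions described above can be assembled programmatically. The Python sketch below builds the path of one predictor file. It rests on assumptions that are not confirmed by the text: the indices in BOX_iiiX_jjY are taken to be zero-padded (three digits for iii, two for jj), and the function name `predictor_path` and the root directory are hypothetical.

```python
from pathlib import Path

# Sub-folder names and their 4-character prefixes, as listed in Table 2.
SUBFOLDERS = {
    "ncep": "NCEP-NCAR_1961_2005",
    "cesh": "CanESM2_historical_1961_2005",
    "ces2": "CanESM2_rcp26_2006_2100",
    "ces4": "CanESM2_rcp45_2006_2100",
    "ces8": "CanESM2_rcp85_2006_2100",
}

def predictor_path(root, i, j, prefix, code):
    """Build the path of a predictor file for grid box (i, j).

    i: longitudinal index (1 to 128); j: latitudinal index (1 to 64).
    prefix: 4-character prefix from Table 2; code: 4-character predictor
    code from Table 4 (for example "mslp" or "p1_f").
    """
    if not (1 <= i <= 128 and 1 <= j <= 64):
        raise ValueError("grid indices out of range")
    if len(code) != 4:
        raise ValueError("predictor code must have 4 characters")
    # Zero-padded indices are an assumption of this sketch.
    box = f"BOX_{i:03d}X_{j:02d}Y"
    return Path(root) / box / SUBFOLDERS[prefix] / f"{prefix}{code}gl.dat"

path = predictor_path("data", 5, 12, "ncep", "mslp")
```

Since each file holds a single column of values, it could then be read with a routine such as `numpy.loadtxt`.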
Table 3: Latitude / longitude cell numbering for the Gaussian 128x64 grid. The latitudes are numbered from south to north and correspond to the grid index Y (64 values). The longitudes are numbered from the Greenwich meridian eastward and correspond to the grid index X (128 values).
|N° of Y and its corresponding latitude|
|N° of X(iii)||Longitude(°East)|
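For reference, the coordinates of a 128x64 Gaussian grid can be computed with NumPy, as in the sketch below. It is an independent illustration and not a reproduction of Table 3: Gaussian latitudes are obtained from the Gauss-Legendre nodes, longitudes are assumed to start at 0° (Greenwich) with a 2.8125° step, and the function name `gaussian_grid` is hypothetical.

```python
import numpy as np

def gaussian_grid(nlat=64, nlon=128):
    """Return (lat, lon) in degrees for a Gaussian grid.

    Latitudes run from south to north; longitudes run eastward from 0.
    """
    # The Gauss-Legendre nodes are the sines of the Gaussian latitudes.
    nodes, _weights = np.polynomial.legendre.leggauss(nlat)
    lat = np.degrees(np.arcsin(np.sort(nodes)))
    lon = np.arange(nlon) * (360.0 / nlon)
    return lat, lon

lat, lon = gaussian_grid()
# 1-based indices as used in the BOX_iiiX_jjY folder names.
y_index = np.arange(1, lat.size + 1)
x_index = np.arange(1, lon.size + 1)
```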
Table 4: List of the 26 predictor filenames and their corresponding variable names. The prefix P* is defined in Table 2.
|No||File name||Predictor names or variables|
|1||P*mslpgl.dat||Mean sea level pressure|
|2||P*p1_fgl.dat||1000 hPa Wind speed|
|3||P*p1_ugl.dat||1000 hPa Zonal wind component|
|4||P*p1_vgl.dat||1000 hPa Meridional wind component|
|5||P*p1_zgl.dat||1000 hPa Relative vorticity of true wind|
|6||P*p1thgl.dat||1000 hPa Wind direction|
|7||P*p1zhgl.dat||1000 hPa Divergence of true wind|
|8||P*p500gl.dat||500 hPa Geopotential|
|9||P*p5_fgl.dat||500 hPa Wind speed|
|10||P*p5_ugl.dat||500 hPa Zonal wind component|
|11||P*p5_vgl.dat||500 hPa Meridional wind component|
|12||P*p5_zgl.dat||500 hPa Relative vorticity of true wind|
|13||P*p5thgl.dat||500 hPa Wind direction|
|14||P*p5zhgl.dat||500 hPa Divergence of true wind|
|15||P*p850gl.dat||850 hPa Geopotential|
|16||P*p8_fgl.dat||850 hPa Wind speed|
|17||P*p8_ugl.dat||850 hPa Zonal wind component|
|18||P*p8_vgl.dat||850 hPa Meridional wind component|
|19||P*p8_zgl.dat||850 hPa Relative vorticity of true wind|
|20||P*p8thgl.dat||850 hPa Wind direction|
|21||P*p8zhgl.dat||850 hPa Divergence of true wind|
|23||P*s500gl.dat||500 hPa Specific humidity|
|24||P*s850gl.dat||850 hPa Specific humidity|
|25||P*shumgl.dat||1000 hPa Specific humidity|
|26||P*tempgl.dat||Air temperature at 2 m|
- CanESM2 output data provided by the CCCma, ECCC: http://climate-modelling.canada.ca/climatemodeldata/data.shtml
- NCEP Reanalysis data provided by the NOAA/OAR/ESRL PSD, Boulder, Colorado, USA
Barker, H. W., J. N. S. Cole, J.-J. Morcrette, R. Pincus, P. Raisanen, K. von Salzen and P. A. Vaillancourt, 2008: The Monte Carlo Independent Column Approximation: An assessment using several global atmospheric models. Quart. J. Roy. Meteorol. Soc., 134, 1463-1478.
Hessami, M., Gachon, P., Ouarda, T.B.M.J. and St-Hilaire, A., 2008: Automated regression-based Statistical Downscaling tool, Environmental Modelling and Software, 23, 813-834.
Kalnay et al., 1996: The NCEP/NCAR 40-year reanalysis project, Bull. Amer. Meteor. Soc., 77, 437-470.
Li, J. and H. W. Barker, 2005: A radiation algorithm with correlated k-distribution. Part I: local thermal equilibrium. Journal of the Atmospheric Sciences, 62, 286-309.
Meinshausen, M., S. J. Smith, K. V. Calvin, J. S. Daniel, M. L. T. Kainuma, J.-F. Lamarque, K. Matsumoto, S. A. Montzka, S. C. B. Raper, K. Riahi, A. M. Thomson, G. J. M. Velders and D. van Vuuren, 2011: The RCP Greenhouse Gas Concentrations and their Extension from 1765 to 2300. Climatic Change (Special Issue), doi: 10.1007/s10584-011-0156-z.
Moss, R. H., J. A. Edmonds, K. A. Hibbard, M. R. Manning, S. K. Rose, D. P. van Vuuren, T. R. Carter, et al., 2010: The next generation of scenarios for climate change research and assessment. Nature, 463, 747-756.
von Salzen, K., N. A. McFarlane, and M. Lazare, 2005: The role of shallow convection in the water and energy cycles of the atmosphere, Clim. Dyn., 25, 671-688, doi: 10.1007/s00382-005-0051-2.