Generic RETRO software tools for emission data sets

The RETRO emissions files are provided in the netcdf data format. Several tools are available for converting and processing netcdf files, and many data analysis programs support the netcdf file format. Below we provide tools for three basic tasks: (1) conversion to ascii files, (2) checking the global totals of emissions, and (3) regridding the emissions data files onto standard model grid resolutions. These tools were developed for three different software packages, which are briefly described below. Not all tools are available for all packages. If you create similar software using different packages, find bugs, or improve these tools, please send your code to m.schultz_@_fz-juelich.de (remove underscores) so that it can be placed on these web pages.


  1. Climate Data Operators (CDO)
  2. Netcdf Operators (NCO)
  3. IDL Routines
  4. Gridbox_area templates



1. Climate Data Operators (CDO)

CDO is a collection of command line Operators to manipulate and analyse Climate Data (and other) files. Supported file formats are GRIB, netCDF, SERVICE and EXTRA. There are more than 250 operators available. The home page of CDO is http://www.mpimet.mpg.de/~cdo/, and the tools are available under the GNU public license. The CDO tool suite is optimized for fast operations on whole files and can easily be installed on any Unix-type platform. An introduction, a users' guide, and a reference card are available on the CDO web pages. The description below works with CDO versions 0.99 and higher.

1.1 Converting netcdf to ascii

      cdo output RETRO_anthro_co_2000.0.5x0.5.nc > out.txt

This creates an ascii file formatted as 6e11.5. Formatted output can be obtained with

      cdo outputf,format,nelem  infile > outfile

Format is a C-style format string (e.g. "%11.5e"), and nelem specifies the number of elements printed per line.
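
To illustrate how a C-style format string and an element count interact, the following Python sketch formats a list of numbers in the same spirit. It is not CDO code; the function name format_values, the sample numbers, and the single-space separator between elements are assumptions made for illustration.

```python
def format_values(values, fmt="%11.5e", nelem=6):
    """Format numbers with a C-style format string, nelem values per line."""
    lines = []
    for start in range(0, len(values), nelem):
        chunk = values[start:start + nelem]
        # Assumption: elements on one line are separated by a single space.
        lines.append(" ".join(fmt % v for v in chunk))
    return lines


if __name__ == "__main__":
    # Hypothetical values, not taken from any RETRO file.
    for line in format_values([0.5, 1234.5, 0.0, 2.5e-12, 1.0, 3.0, 7.25]):
        print(line)
```

With seven input values and nelem=6, the sketch produces one full line of six elements and a second line holding the remaining value.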

1.2 Checking global totals of emissions

This task requires a two-step process. First, we compute the global sum of the emissions by weighting each grid cell's flux with the corresponding gridbox area:

      cdo fldsum -mul infile -selvar,gridbox_area infile  tmpfile

This gives the global totals in units of kg(species)/s. Note that infile must be specified twice! In order to obtain the correct annual total, we now need to multiply each monthly value by the number of days in that month (and scale the values to obtain Tg/year):

      cdo yearsum -muldpm -mulc,86400.e-9  tmpfile  outfile

In order to view the result, you can use cdo info, cdo output, or ncdump outfile.
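
The arithmetic behind the two CDO steps can be sketched in Python on a tiny made-up grid. The sketch assumes that the flux is stored per unit area (kg m-2 s-1) and the gridbox area in m2; the function names global_sum and annual_total and all numbers are hypothetical.

```python
SECONDS_PER_DAY = 86400.0
KG_TO_TG = 1.0e-9


def global_sum(flux, area):
    """Area-weighted sum of a 2-D flux field (flux times area gives kg/s)."""
    return sum(f * a
               for flux_row, area_row in zip(flux, area)
               for f, a in zip(flux_row, area_row))


def annual_total(monthly_fields, area, days_per_month):
    """Combine monthly flux fields into one total in Tg per period covered."""
    total = 0.0
    for field, days in zip(monthly_fields, days_per_month):
        # kg/s * days * s/day gives kg per month; the last factor converts to Tg
        total += global_sum(field, area) * days * SECONDS_PER_DAY * KG_TO_TG
    return total


if __name__ == "__main__":
    area = [[2.0, 3.0], [4.0, 1.0]]   # hypothetical gridbox areas
    flux = [[1.0, 1.0], [0.5, 2.0]]   # hypothetical fluxes
    print(global_sum(flux, area))
    print(annual_total([flux, flux], area, [31, 28]))
```

Passing twelve monthly fields together with twelve day counts gives the annual total in Tg/year.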

1.3 Regridding

The CDO tools use the SCRIP library for grid transformations and can handle a large variety of different grids. Some standard grids are predefined and can be accessed by name; other grids can be defined via template files or via special grid definition files in text form. A typical cdo command for regridding a RETRO emissions file onto a regular grid with 128 longitude and 64 latitude boxes would be:
        cdo remapcon,r128x64 RETRO_anthro_co_2000.0.5x0.5.nc out.nc
Make sure to use remapcon in order to preserve the total emission flux. Note that the gridbox_area variable will be replaced with new contents after regridding (interpolating gridbox areas just doesn't make sense). In order to interpolate onto a gaussian grid, use:
        cdo remapcon,t42grid RETRO_anthro_co_2000.0.5x0.5.nc out.nc
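
Why a conservative method matters can be seen in a simplified one-dimensional Python sketch, which merges neighbouring cells so that flux times area stays constant. This is not the SCRIP algorithm used by remapcon, only an illustration of the conservation principle; the function names and values are hypothetical.

```python
def total(flux, area):
    """Total emission: sum of flux times area over all cells."""
    return sum(f * a for f, a in zip(flux, area))


def coarsen_conservative(flux, area, factor):
    """Merge every `factor` neighbouring cells, conserving flux * area."""
    if len(flux) != len(area) or len(flux) % factor:
        raise ValueError("cell counts must match and be divisible by factor")
    new_flux, new_area = [], []
    for i in range(0, len(flux), factor):
        cell_area = sum(area[i:i + factor])
        cell_mass = total(flux[i:i + factor], area[i:i + factor])
        new_area.append(cell_area)
        new_flux.append(cell_mass / cell_area)  # area-weighted mean flux
    return new_flux, new_area


if __name__ == "__main__":
    flux = [1.0, 3.0, 2.0, 4.0]   # hypothetical fluxes per unit area
    area = [1.0, 3.0, 2.0, 2.0]   # hypothetical cell areas
    coarse_flux, coarse_area = coarsen_conservative(flux, area, 2)
    print(total(flux, area), total(coarse_flux, coarse_area))
```

A plain (unweighted) average of the first two cells would give a flux of 2.0 instead of 2.5 and would therefore change the total.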


2. Netcdf Operators (NCO)

The netCDF Operators, or NCO, are a suite of programs known as operators. Each operator is a standalone, command line program which is executed at the UNIX shell-level like, e.g., ls or mkdir. The operators take netCDF or HDF4 files as input, then perform a set of operations (e.g., deriving new data, averaging, hyperslabbing, or metadata manipulation) and produce a netCDF file as output. The operators are primarily designed to aid manipulation and analysis of gridded scientific data. The NCOs are developed as an Open Source project. The homepage at http://nco.sourceforge.net includes the program code and binaries, documentation, a mailing list, and more. Compared to the CDO tool suite, the NCOs are somewhat slower but perhaps more flexible.

2.1 Converting netcdf to ascii

      ncks --print --format "%11.5e %11.5e %11.5e %11.5e %11.5e %11.5e \n" RETRO_anthro_co_2000.0.5x0.5.nc > out.txt

This creates an ascii file formatted in 6 data columns (FORTRAN gurus would write this as "6(e11.5,1x)"). Any C-style format specifications can be given in the --format option.

2.2 Checking global totals of emissions

2.2.1 Approximate estimate

       ncwa -N -w gridbox_area -a lon,lat,time RETRO_anthro_co_2000.0.5x0.5.nc total.nc
will generate a new netcdf file total.nc containing the global total emissions as kg/s. You can multiply the value obtained with
       ncdump -v emission_flux total.nc
by a factor of 0.002592 in order to obtain an approximate value of the annual global emission flux in Tg/year. This implies that all months are weighted equally with 30 days per month, and the result will generally be a few percent lower than the correct value (for the example file RETRO_anthro_co_2000.0.5x0.5.nc the difference is about 2%).
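
The factor 0.002592 and the size of the resulting underestimate can be reproduced with a few lines of Python. The sketch assumes emissions that are constant through the year, so the shortfall it reports (roughly 1.4 to 1.6 percent) will differ from the value for a file with a seasonal cycle; the function names are hypothetical.

```python
SECONDS_PER_DAY = 86400.0
KG_TO_TG = 1.0e-9

# 30 days per month, converted from kg to Tg (approximately 0.002592)
APPROX_FACTOR = 30 * SECONDS_PER_DAY * KG_TO_TG


def approx_annual_total(sum_of_monthly_kg_per_s):
    """Approximate Tg/year from the sum of twelve monthly kg/s values."""
    return sum_of_monthly_kg_per_s * APPROX_FACTOR


def relative_shortfall(days_in_year):
    """Fractional underestimate of a 360-day year for constant emissions."""
    return 1.0 - 360.0 / days_in_year


if __name__ == "__main__":
    print(APPROX_FACTOR)
    print(relative_shortfall(365), relative_shortfall(366))
```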

2.2.2 More precise solution

In order to obtain a proper weighting of the lengths of the individual months, more processing steps are necessary: First, compute the area weighted total across longitudes and latitudes as
      ncwa -N -w gridbox_area -a lon,lat RETRO_anthro_co_2000.0.5x0.5.nc t1.nc
Then, download the dayspermonth.nc (or dayspermonth_leapyear.nc) file and merge this with the global totals file via
      ncks -A -v dpm dayspermonth.nc t1.nc
Note that this will also change the information contained in the time variable of the t1.nc file. Next, compute the weighted sum across the time variable (the -N option suppresses the normalization, so the result is a sum rather than an average). Finally, you can use ncap to obtain the result in a sensible unit and ncatted to attach the unit to the emission_flux variable:
      ncwa -N -w dpm -a time t1.nc t2.nc
      ncap -s "emission_flux=emission_flux*86400./1.e9" t2.nc total.nc
      ncatted -a units,emission_flux,c,c,"Tg(species)/year" total.nc
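
The difference between a weighted sum (as requested with ncwa -N) and a normalized weighted average can be sketched in Python. The days-per-month lists below are assumed to correspond to the contents of dayspermonth.nc and dayspermonth_leapyear.nc, which this sketch does not read; the function names and flux values are hypothetical.

```python
DPM = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
DPM_LEAP = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def weighted_sum(values, weights):
    """Numerator only: sum of value times weight (no normalization)."""
    return sum(v * w for v, w in zip(values, weights))


def weighted_mean(values, weights):
    """Normalized weighted average: weighted sum divided by sum of weights."""
    return weighted_sum(values, weights) / sum(weights)


def to_tg_per_year(flux_times_days):
    """Convert (kg/s * days) into Tg/year, as in the ncap step."""
    return flux_times_days * 86400.0 / 1.0e9


if __name__ == "__main__":
    monthly = [10.0] * 12   # hypothetical global totals in kg/s
    print(weighted_mean(monthly, DPM))                  # still in kg/s
    print(to_tg_per_year(weighted_sum(monthly, DPM)))   # annual total
```

Only the unnormalized sum can be scaled directly to an annual total; the normalized average remains a mean flux in kg/s.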

2.3 Regridding

There are no interpolation routines available in the NCO tool suite. Please use either the CDO tools for this or the IDL routine ncregrid.pro described below.


3. IDL Routines

IDL, the Interactive Data Language, is a commercial software package for data analysis and visualisation. A netcdf interface is built-in as well as many functions and utility routines. Almost anything is possible with IDL, albeit some things are rather cumbersome. IDL routines are program files in ascii, which must be run in an IDL shell environment. The tool package for the RETRO emissions files is included in the idl_lib_mgs_042008.tgz library package as part of the retro folder. The relevant individual routines are described below.

3.1 Converting netcdf to ascii

This can be accomplished with the program ncdf_to_ascii.pro as follows:
IDL> ncdf_to_ascii, filename, level=level, variables=variables, outfile=outfile, $
         time_offset=time_offset, startdate=startdate, enddate=enddate, $
         time_format=time_format
;; time_format = 1 : add mstime for Excel
;; time_format = 2 : don't add mstime, but add hour

3.2 Checking global totals of emissions

The routine retro_regtotals.pro computes global and regional totals of various emission and model output files (see retro_emissions_stat.pro for an example of how to call this routine). In order to compute total quantities in regions or country boundaries, you need the region_mask or country_mask files in the correct grid resolution. These files can be found in the data directory of the mgs idl library.

3.3 Regridding

This can be accomplished with the ncregrid.pro routine.
IDL> ncregrid, filename, oldgridname, newgridname, /add_area, /conserve_mass
The available gridnames still need to be documented. You can see all grid definitions in the IDL routine mgs_rgrid__define.pro (mgs_newobjects folder) in the GetCatalogueEntry method or via
IDL> grid=obj_new('mgs_rgrid',/select)
IDL> obj_destroy, grid
Be sure to set the conserve_mass flag if you are regridding emission data files. The add_area keyword will automatically insert a gridbox_area variable in the resulting netcdf file, which is useful for obtaining the global totals (see section 3.2).

4. Gridbox_area templates

As described above, the gridbox_area variable will contain wrong values after regridding with cdo remapcon. Since this variable is needed for the computation of global totals with any of the tools, we provide some template files with a gridbox_area variable here for a range of standard model resolutions. These can be appended to a (regridded) emissions data file using the NCO command ncks:
        ncks -A gbXXX.nc RETRO_anthro_co_2000.XXX.nc
where XXX stands for one of the following resolutions:


Note: new version of create_gridfiles.pro and new grid definition files as of 23 April 2008. The previous files contained an erroneous description for spectral grids and a faulty earth radius parameter.
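
For a regular longitude-latitude grid, gridbox areas of the kind stored in such template files can be approximated from spherical geometry. The Python sketch below is not the method used in create_gridfiles.pro; the Earth radius value, the function name gridbox_areas, and the assumption of equally spaced latitudes (which does not hold for Gaussian grids) are illustrative choices.

```python
import math

EARTH_RADIUS_M = 6.371e6  # assumed mean Earth radius in metres


def gridbox_areas(nlon, nlat, radius=EARTH_RADIUS_M):
    """Return one gridbox area (m2) per latitude row, south to north."""
    dlon = 2.0 * math.pi / nlon
    dlat = math.pi / nlat
    areas = []
    for j in range(nlat):
        south = -0.5 * math.pi + j * dlat
        north = south + dlat
        # Area of a cell bounded by two meridians and two parallels
        areas.append(radius ** 2 * dlon * (math.sin(north) - math.sin(south)))
    return areas


if __name__ == "__main__":
    areas = gridbox_areas(720, 360)        # 0.5 x 0.5 degree grid
    print(sum(areas) * 720)                # sum over all gridboxes
    print(4.0 * math.pi * EARTH_RADIUS_M ** 2)
```

A useful consistency check is that the areas of all gridboxes add up to the surface area of the sphere.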