Generic RETRO software tools for emission data sets
The RETRO emissions files are provided in the netcdf
data format. Several tools are available for converting and processing netcdf
files, and many data analysis programs support the netcdf file format. Below we
provide some tools for the three basic tasks (1) conversion to ascii files, (2)
checking the global totals of emissions, and (3) regridding the emissions data
files onto standard model grid resolutions. These tools were developed for three
different software packages, which are briefly described below. Not all tools
are available for all packages. If you create similar software using different
packages, find bugs, or improve these tools, please send your code
to m.schultz_@_fz-juelich.de (remove underscores) so that it can be placed on these
web pages.
- Climate Data Operators (CDO)
- Netcdf Operators (NCO)
- IDL Routines
- Gridbox_area templates
1. Climate Data Operators (CDO)
CDO is a collection of command line Operators to manipulate and analyse
Climate Data (and other) files. Supported file formats are GRIB, netCDF,
SERVICE and EXTRA. There are more than 250 operators available. The home
page of CDO is
http://www.mpimet.mpg.de/~cdo/, and the tools are available under
the GNU public license. The CDO tool suite is optimized for fast
operations on whole files and can easily be installed on any Unix-type
platform. An introduction, a users' guide, and a reference card are
available on the CDO web pages. The description below works with CDO
versions 0.99 and higher.
1.1 Converting netcdf to ascii
cdo output RETRO_anthro_co_2000.0.5x0.5.nc > out.txt
This creates an ascii file formatted as 6e11.5. Formatted output can be
obtained with
cdo outputf,format,nelem infile > outfile
Format is a C-style format string (e.g. "%11.5e"), and nelem specifies
the number of elements printed per line.
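To make the layout concrete, the following Python sketch mimics the formatting described above: each value is written with a C-style format string and a fixed number of values is placed on each line. The function name format_rows and the single-space separator are assumptions made for illustration; the exact spacing of the cdo output is not specified here.

```python
def format_rows(values, fmt="%11.5e", nelem=6):
    """Return text lines holding `nelem` values each, formatted with `fmt`.

    Hypothetical helper: it only illustrates the "format, nelem" idea and
    does not reproduce the byte-exact output of cdo.
    """
    lines = []
    for start in range(0, len(values), nelem):
        chunk = values[start:start + nelem]
        # Assumption: values on a line are separated by a single space.
        lines.append(" ".join(fmt % v for v in chunk))
    return lines


# Seven made-up flux values, so the last line holds a single element.
sample = [0.0, 1.5e-12, 2.25e-11, 3.0e-10, 4.0e-9, 5.0e-8, 6.0e-7]
rows = format_rows(sample)
for row in rows:
    print(row)
```

A file written this way can be read back by splitting each line on whitespace and converting the pieces with float().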
1.2 Checking global totals of emissions
This task requires a two-step process. First, we compute the global sum
of the emissions by weighting each grid cell's flux with the
corresponding gridbox area:
cdo fldsum -mul infile -selvar,gridbox_area infile tmpfile
This gives the global totals in units of kg(species)/s. Note that infile
must be specified twice! In order to obtain the correct annual total, we
now need to multiply each monthly value by the number of days in that month
(and scale the values to obtain Tg/year):
cdo yearsum -muldpm -mulc,86400.e-9 tmpfile outfile
In order to view the result, you can use cdo info, cdo output, or ncdump
outfile.
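The arithmetic behind the second step can be checked by hand. The Python sketch below assumes twelve monthly global totals in kg/s are already available as a list; the function name annual_total_tg and the leap_year switch are illustrative choices and not part of any of the tools described here.

```python
DAYS_PER_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
SECONDS_PER_DAY = 86400.0
KG_PER_TG = 1.0e9


def annual_total_tg(monthly_kg_per_s, leap_year=False):
    """Convert twelve monthly mean fluxes (kg/s) into one annual total (Tg/year).

    Hypothetical helper: each month is weighted by its number of days,
    converted to seconds, and the sum is scaled from kg to Tg.
    """
    if len(monthly_kg_per_s) != 12:
        raise ValueError("expected twelve monthly values")
    days = list(DAYS_PER_MONTH)
    if leap_year:
        days[1] = 29
    return sum(flux * d * SECONDS_PER_DAY / KG_PER_TG
               for flux, d in zip(monthly_kg_per_s, days))


# A constant flux of 1 Tg per day makes the result easy to verify.
one_tg_per_day = [KG_PER_TG / SECONDS_PER_DAY] * 12
print(annual_total_tg(one_tg_per_day))
print(annual_total_tg(one_tg_per_day, leap_year=True))
```

With a constant flux of 1 Tg per day the annual total equals the number of days in the year, which gives a quick plausibility check for the scaling factor 86400.e-9.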
1.3 Regridding
The CDO tools use the SCRIP library for grid transformations and they
can handle a large variety of different grids. Some standard grids are
predefined and can be accessed by name; other grids can be defined via
template files or via special grid definition files in text form. A
typical cdo command for regridding a RETRO emissions file onto a regular
grid with 128 longitude and 64 latitude boxes would be:
cdo remapcon,r128x64 RETRO_anthro_co_2000.0.5x0.5.nc out.nc
Make sure to use remapcon in order to preserve the total emission flux.
Note that the gridbox_area variable will no longer contain valid gridbox
areas after regridding (interpolating gridbox areas just doesn't make
sense); see section 4 for replacement templates. In order to interpolate
onto a gaussian grid, use:
cdo remapcon,t42grid RETRO_anthro_co_2000.0.5x0.5.nc out.nc
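Why a conservative method matters can be illustrated with a deliberately simple case. The Python sketch below is not the SCRIP algorithm used by cdo; it only merges adjacent latitude bands by an integer factor and sets the new flux to the area-weighted mean, which is the property that keeps the sum of flux times area unchanged. All function names and the synthetic flux profile are assumptions made for the example.

```python
import math


def band_areas(nlat):
    """Areas of `nlat` equal-angle latitude bands on a unit sphere."""
    edges = [math.radians(-90.0 + 180.0 * i / nlat) for i in range(nlat + 1)]
    return [2.0 * math.pi * (math.sin(edges[i + 1]) - math.sin(edges[i]))
            for i in range(nlat)]


def coarsen_conservative(flux, area, factor):
    """Merge `factor` adjacent cells into one (hypothetical helper).

    The coarse flux is the area-weighted mean of the fine fluxes, so
    sum(flux * area) is the same before and after.
    """
    new_flux, new_area = [], []
    for start in range(0, len(flux), factor):
        f = flux[start:start + factor]
        a = area[start:start + factor]
        total_area = sum(a)
        new_area.append(total_area)
        new_flux.append(sum(fi * ai for fi, ai in zip(f, a)) / total_area)
    return new_flux, new_area


# Synthetic flux per unit area on 0.5 degree bands, peaking at the equator.
fine_area = band_areas(360)
fine_flux = [1.0e-10 * (1.0 + math.cos(math.radians(-89.75 + 0.5 * i)))
             for i in range(360)]
coarse_flux, coarse_area = coarsen_conservative(fine_flux, fine_area, 4)

total_fine = sum(f * a for f, a in zip(fine_flux, fine_area))
total_coarse = sum(f * a for f, a in zip(coarse_flux, coarse_area))
print(total_fine, total_coarse)
```

The two printed totals agree to rounding error. A plain unweighted interpolation of the flux values carries no such guarantee, which is the reason for choosing remapcon over other remapping operators for emission data.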
2. Netcdf Operators (NCO)
The netCDF Operators, or NCO, are a suite of programs known as
operators. Each operator is a standalone command-line program that
is executed at the UNIX shell level, like ls or mkdir.
The operators take netCDF or HDF4
files as input, then perform a set of operations (e.g., deriving new
data, averaging, hyperslabbing, or metadata manipulation) and produce a
netCDF file as output. The operators are primarily designed to aid
manipulation and analysis of gridded scientific data. The NCOs are
developed as an Open Source project. The homepage at
http://nco.sourceforge.net
includes the program code and binaries, documentation, a mailing list,
and more. Compared to the CDO tool suite, the NCOs are somewhat slower
but perhaps more flexible.
2.1 Converting netcdf to ascii
ncks --print --format "%11.5e %11.5e %11.5e %11.5e %11.5e %11.5e \n" RETRO_anthro_co_2000.0.5x0.5.nc > out.txt
This creates an ascii file formatted in 6 data columns (FORTRAN gurus
would write this as "6(e11.5,1x)"). Any C-style format specifications
can be given in the --format option.
2.2 Checking global totals of emissions
2.2.1 Approximate estimate
ncwa -N -w gridbox_area -a lon,lat,time RETRO_anthro_co_2000.0.5x0.5.nc total.nc
will generate a new netcdf file total.nc containing the global total
emissions as kg/s. You can multiply the value obtained with
ncdump -v emission_flux total.nc
by a factor of 0.002592 (30 days x 86400 s/day x 1.e-9 Tg/kg) in order to
obtain an approximate value of the annual global emission flux in Tg/year.
This implies that all months are weighted equally with 30 days per month,
and the result will generally be a few percent lower than the correct value
(for the example file RETRO_anthro_co_2000.0.5x0.5.nc the difference is
about 2%).
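The size of the error introduced by the 30-day assumption can be estimated with a few lines of Python. The sketch below is illustrative only: the helper name relative_shortfall and the constant test flux are assumptions, and a non-leap year is used by default.

```python
SECONDS_PER_DAY = 86400.0
KG_PER_TG = 1.0e9
DAYS_PER_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

# 30 days per month, converted to seconds and scaled from kg to Tg.
APPROX_FACTOR = 30 * SECONDS_PER_DAY / KG_PER_TG


def relative_shortfall(monthly_kg_per_s, days_per_month=DAYS_PER_MONTH):
    """Fraction by which the 30-day estimate falls below the exact total.

    Hypothetical helper: `monthly_kg_per_s` holds twelve monthly global
    totals in kg/s.
    """
    approx = sum(monthly_kg_per_s) * APPROX_FACTOR
    exact = sum(f * d for f, d in zip(monthly_kg_per_s, days_per_month))
    exact *= SECONDS_PER_DAY / KG_PER_TG
    return (exact - approx) / exact


constant_flux = [100.0] * 12  # made-up value, identical in every month
shortfall = relative_shortfall(constant_flux)
print(APPROX_FACTOR, shortfall)
```

For a flux that is the same in every month the shortfall is 5/365, a little under 1.4%. Emissions that peak in the 31-day months produce a larger difference, and emissions that peak in February a smaller one.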
2.2.2 More precise solution
In order to obtain a proper weighting of the lengths of the individual
months, more processing steps are necessary: First, compute the area
weighted total across longitudes and latitudes as
ncwa -N -w gridbox_area -a lon,lat RETRO_anthro_co_2000.0.5x0.5.nc t1.nc
Then, download the dayspermonth.nc (or dayspermonth_leapyear.nc)
file and merge this with the global totals file via
ncks -A -v dpm dayspermonth.nc t1.nc
Note that this will also change the information contained in the time
variable of the t1.nc file. Next, compute the weighted sum across the
time variable, using dpm as weights. Finally, you can use ncap to obtain
the result in a sensible unit and ncatted to attach the unit to the
emission_flux variable:
ncwa -N -w dpm -a time t1.nc t2.nc
ncap -s "emission_flux=emission_flux*86400./1.e9" t2.nc total.nc
ncatted -a units,emission_flux,c,c,"Tg(species)/year" total.nc
2.3 Regridding
There are no interpolation routines available in the NCO tool suite.
Please use either the CDO tools for this or the IDL routine ncregrid.pro
described below.
3. IDL Routines
IDL, the Interactive Data Language, is a commercial software package for
data analysis and visualisation. A netcdf interface is built-in as well as
many functions and utility routines. Almost anything is possible with IDL,
albeit some things are rather cumbersome.
IDL routines are program files in ascii, which must be run in an IDL shell
environment. The tool package for the RETRO emissions files is included in the
idl_lib_mgs_042008.tgz library package as part of the retro folder.
The relevant individual routines are described below.
3.1 Converting netcdf to ascii
This can be accomplished with the program ncdf_to_ascii.pro as follows:
IDL> ncdf_to_ascii, filename, level=level, variables=variables, outfile=outfile, $
time_offset=time_offset, startdate=startdate, enddate=enddate, $
time_format=time_format
;; time_format = 1 : add mstime for Excel
;; time_format = 2 : don't add mstime, but add hour
3.2 Checking global totals of emissions
The routine retro_regtotals.pro computes global and regional totals of various
emission and model output files (see retro_emissions_stat.pro for an example of
how to call this routine). In order to compute total quantities in regions or
country boundaries, you need the region_mask or country_mask files in the correct
grid resolution. These files can be found in the data directory of the mgs idl
library.
3.3 Regridding
This can be accomplished with the ncregrid.pro routine.
IDL> ncregrid, filename, oldgridname, newgridname, /add_area, /conserve_mass
The available gridnames still need to be documented. You can see all
grid definitions in the IDL routine mgs_rgrid__define.pro (mgs_newobjects folder) in the
GetCatalogueEntry method or via
IDL> grid=obj_new('mgs_rgrid',/select)
IDL> obj_destroy, grid
Be sure to set the conserve_mass flag if you are regridding emission
data files. The add_area keyword will automatically insert a
gridbox_area variable in the resulting netcdf file, which is useful for
obtaining the global totals (see section 3.2).
4. Gridbox_area templates
As described above, the gridbox_area variable will contain wrong values
after regridding with cdo remapcon. Since this variable is needed for
the computation of global totals with any of the tools, we provide some
template files with a gridbox_area variable here for a range of standard
model resolutions. These can be appended to a (regridded) emissions data
file using the NCO command ncks:
ncks -A gbXXX.nc RETRO_anthro_co_2000.XXX.nc
where XXX stands for one of the following resolutions:
- T21 (gaussian grid associated with spectral truncation T21; 64x32)
- T31 (gaussian grid associated with spectral truncation T31; 96x48)
- T42 (gaussian grid associated with spectral truncation T42; 128x64)
- T63 (gaussian grid associated with spectral truncation T63; 192x96)
- T85 (gaussian grid associated with spectral truncation T85; 256x128)
- T106 (gaussian grid associated with spectral truncation T106; 320x160)
- T319 (gaussian grid associated with spectral truncation T319; 640x320)
- r72x45 (regular grid with 72 longitude and 45 latitude coordinates; 5x4 degrees)
- r120x90 (regular grid with 120 longitude and 90 latitude coordinates; 3x2 degrees)
- r144x90 (regular grid with 144 longitude and 90 latitude coordinates; 2.5x2 degrees)
- r180x90 (regular grid with 180 longitude and 90 latitude coordinates; 2x2 degrees)
- r360x180 (regular grid with 360 longitude and 180 latitude coordinates; 1x1 degrees)
- r720x360 (regular grid with 720 longitude and 360 latitude coordinates; 0.5x0.5 degrees)
- create_gridfile.pro IDL routine to generate similar netcdf files for other resolutions
Note: new version of create_gridfiles.pro and new grid definition files
as of 23 April 2008. The previous files contained an erroneous description for spectral grids
and a faulty earth radius parameter.
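For regular grids the gridbox areas follow from spherical geometry, so the template values can be cross-checked independently. The Python sketch below is not the create_gridfile.pro routine; it assumes that cell edges divide the range from -90 to 90 degrees evenly and uses a mean Earth radius of 6371 km, which may differ from the value used in the template files. Gaussian grids have unevenly spaced latitudes and are not covered.

```python
import math

# Assumption: mean Earth radius in metres; the templates may use another value.
EARTH_RADIUS_M = 6.371e6


def regular_gridbox_areas(nlon, nlat, radius=EARTH_RADIUS_M):
    """Area in m^2 of one cell in each latitude row of a regular grid.

    Hypothetical helper. The area of a cell between latitudes south and
    north is radius^2 * dlon * (sin(north) - sin(south)).
    """
    dlon = 2.0 * math.pi / nlon
    areas = []
    for j in range(nlat):
        south = math.radians(-90.0 + 180.0 * j / nlat)
        north = math.radians(-90.0 + 180.0 * (j + 1) / nlat)
        areas.append(radius ** 2 * dlon * (math.sin(north) - math.sin(south)))
    return areas


# 0.5 x 0.5 degree grid: every row holds 720 cells of equal area.
row_areas = regular_gridbox_areas(720, 360)
global_area = 720 * sum(row_areas)
sphere_area = 4.0 * math.pi * EARTH_RADIUS_M ** 2
print(global_area, sphere_area)
```

The sum over all cells reproduces the surface area of the sphere, which is a useful sanity check for any gridbox_area variable regardless of resolution.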