The Lund University Modular Inversion Algorithm (LUMIA) is a python package for performing atmospheric transport inversions.

Release 2020.8 is described in https://www.geosci-model-dev-discuss.net/gmd-2019-227/ and can be downloaded here. However, we recommend getting the latest commit from GitHub instead:

git clone --branch master https://github.com/lumia-dev/lumia.git

This documentation focuses on the current version of the code.

Folder structure

The folder structure of LUMIA is the following:

  • the lumia folder contains the lumia python module (i.e. what you get when doing import lumia in python)
  • the transport folder contains the transport python module (that you can import using import transport in python), which contains the pseudo-transport model used in our LUMIA simulations (see documentation)
  • the docs folder contains a more extensive documentation
  • the run folder contains example scripts and configuration files.
  • the gridtools.py file is a standalone module (accessed via import gridtools)

The other files and folders are either related to optional functionality (icosPortalAccess; src), required by the LUMIA web page (CNAME, mkdocs.yml), or needed for the structure of the python package (setup.py, LICENSE). The Makefile is provided for information rather than for actual use.

lumia, LUMIA, transport and transport

The LUMIA git repository contains two python packages (lumia and transport). Furthermore, the lumia package contains several modules called transport. This can be very confusing. Therefore, throughout the documentation:

  • the transport package refers to the top-level transport package (i.e. what is imported via import transport)
  • the transport module refers to the lumia.models.footprints.transport module.
  • "the lumia package" refers to lumia (what is accessed via import lumia)
  • LUMIA refers to the entire project.

Installation

LUMIA is written in python, and depends on many other scientific packages. We recommend installing it in a miniconda virtual environment, with at least the cartopy package installed:

# Create a conda environment for your LUMIA project (replace `my_proj` with the name you want to give it)
conda create -n my_proj cartopy

# Activate the environment:
conda activate my_proj

# Clone lumia from its git repository (replace `my_folder` with the name of the folder you want LUMIA to be installed in; the folder should not exist beforehand)
git clone --branch master https://github.com/lumia-dev/lumia.git my_folder

# Install the LUMIA python library inside your virtual environment (replace `my_folder` with the name of the folder into which you have cloned LUMIA).
pip install -e my_folder

Dos and Don'ts

Testing the installation

Once you have installed lumia in your python environment (i.e. once you have gone through the installation instructions above), the lumia module will be accessible on your system. Try for example:

conda activate my_proj  # Make sure you are in the right python environment!
cd /tmp                 # Move to another folder, basically anywhere where your lumia files are not
python -m lumia         # Run python with the `lumia` module

This should produce an error such as:

/home/myself/miniconda3/envs/my_proj/bin/python: No module named lumia.__main__; 'lumia' is a package and cannot be directly executed

This is good! It means that python has found lumia (but doesn't know what to do with it, which is another issue).

If you instead get an error such as:

/usr/bin/python3: No module named lumia

then python cannot find the lumia module. Either you are not in the right python environment (read the conda documentation if you are not familiar with it), or an error happened during the installation of lumia: did you use pip install -e /path/to/lumia as instructed above? Did it return an error (for example because a dependency could not be installed)?
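The same check can be done from inside python, which also shows which interpreter is actually being used. The snippet below is a minimal sketch that relies only on the standard library; the helper name `module_available` is introduced here for illustration and is not part of LUMIA.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be located
    in the current python environment (without importing it)."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # The interpreter path tells you which environment is active.
    print("interpreter:", sys.executable)
    for name in ("lumia", "transport"):
        status = "found" if module_available(name) else "NOT found"
        print(f"{name}: {status}")
```

If the interpreter path does not point inside your conda environment, activate the environment first and run the check again.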

Recommended workflow

The run folder contains example scripts and configuration files. You can use them as examples of how to get started (in addition to reading the existing documentation). You can also put your own scripts and data in that folder.

We can recommend a few alternative workflows, depending on what you plan to do with LUMIA:

  • You can put your scripts and data directly under the run directory (e.g. /home/myself/lumia/run). This is a good option if you are just starting with LUMIA, or if you plan to develop it further.
  • You can install lumia in one folder (e.g. /home/myself/libraries/lumia), and put your scripts in a completely different folder (e.g. /home/myself/projects/my_fancy_project): if you have installed it correctly, the lumia python package is available from anywhere on your system (provided that you have activated the python environment in which it is installed). This is a good option if you plan to use LUMIA from several different projects.
  • If you plan to actively develop LUMIA, you could also have separate installations (e.g. /home/myself/lumia_stable and /home/myself/lumia_dev), each installed in a different python environment (e.g. my_old_env and my_new_env).

Documentation summary

LUMIA implements the inversion as a combination of (semi-)independent modules, each implementing one of the components of an inversion setup. The links below point to the documentation of these modules:

Theoretical summary

LUMIA is (primarily) a library for performing atmospheric inversions: observations of the atmospheric composition are used to improve a prior estimate of the emissions (or fluxes) of one (or several) atmospheric tracers (e.g. CO\(_2\)).

The inversion relies on an atmospheric chemistry-transport model (CTM) to link fluxes of a tracer in and out of the atmosphere with observed atmospheric concentration. This can be formalized as

\[y + \varepsilon_{y} = H(x) + \varepsilon_{H}\]

where \(y\) is an ensemble of observations, \(H\) is a CTM, and \(x\) is a vector containing parameters controlling the CTM (i.e. \(x\) is the control vector). The uncertainty terms \(\varepsilon_{y}\) and \(\varepsilon_{H}\) represent the measurement error and the model error, respectively.

The aim of the inversion is to determine the control vector \(x\) that leads to the optimal fit of the model \(H\) to the observations \(y\). This is formalized as finding the vector \(x\) that minimises the cost function

\[J(x) = \frac{1}{2}\left(x-x_b\right)^TB^{-1}\left(x-x_b\right) + \frac{1}{2}\left(H(x) - y\right)^TR^{-1}\left(H(x) - y\right)\]

where \(x_b\) is a prior estimate of the control vector \(x\) (that we are trying to determine). The error covariance matrices \(B\) and \(R\) contain the estimated uncertainties on \(x_b\) and \(y - H(x)\), respectively.

The optimal control vector \(\hat{x}\) is determined using an iterative conjugate gradient approach.
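To make the cost function concrete, the sketch below minimises \(J(x)\) for a toy problem with numpy. It is not LUMIA's optimizer: it assumes that \(H\) is linear (a small random matrix), that \(B\) and \(R\) are diagonal, and that the observations are noise-free. All names and values are made up for illustration. Under these assumptions, setting the gradient of \(J\) to zero gives a linear system that a textbook conjugate gradient loop can solve.

```python
import numpy as np

rng = np.random.default_rng(0)

n_ctl, n_obs = 4, 6
H = rng.normal(size=(n_obs, n_ctl))       # toy linear stand-in for the CTM
x_true = np.array([1.0, 2.0, 0.5, 1.5])   # "truth" used to build the observations
x_b = np.ones(n_ctl)                      # prior control vector
y = H @ x_true                            # synthetic, noise-free observations

B_inv = np.eye(n_ctl) / 0.5**2            # inverse prior error covariance
R_inv = np.eye(n_obs) / 0.1**2            # inverse observation error covariance


def cost(x):
    """J(x) = 1/2 (x-x_b)^T B^-1 (x-x_b) + 1/2 (Hx-y)^T R^-1 (Hx-y)."""
    dx = x - x_b
    dy = H @ x - y
    return 0.5 * dx @ B_inv @ dx + 0.5 * dy @ R_inv @ dy


def gradient(x):
    """Gradient of J with respect to x (valid for linear H)."""
    return B_inv @ (x - x_b) + H.T @ R_inv @ (H @ x - y)


def conjugate_gradient(A, b, x0, rtol=1e-10, max_iter=100):
    """Solve A x = b for a symmetric positive-definite matrix A."""
    x = x0.copy()
    r = b - A @ x                 # residual
    p = r.copy()                  # search direction
    rs_old = r @ r
    threshold = rtol * np.linalg.norm(b)
    for _ in range(max_iter):
        if np.sqrt(rs_old) <= threshold:
            break
        Ap = A @ p
        alpha = rs_old / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x


# For linear H, grad J(x) = 0 is equivalent to A x = b with:
A = B_inv + H.T @ R_inv @ H
b = B_inv @ x_b + H.T @ R_inv @ y
x_hat = conjugate_gradient(A, b, x0=x_b)

print("prior cost    :", cost(x_b))
print("posterior cost:", cost(x_hat))
```

In a real inversion \(H\) is far too large to be stored as a matrix, so only products with \(H\) and its transpose (the adjoint) are evaluated, but the structure of the iteration is the same.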

While the aim of the inversion is typically to estimate the emissions of a tracer, the control vector \(x\) rarely consists (directly) of the emissions: it can for instance contain scaling factors or offsets at a lower resolution than the emissions themselves, it may contain only a subset of the emissions, and it can also include additional terms such as an estimate of the boundary condition, bias correction terms, etc. The emissions \(E\) are therefore obtained from the control vector through a mapping operator \(M\):

\[ E = M(x) \]

  • lumia.observations implements the observation vector (\(y\)) and the corresponding uncertainties (\(R\));
  • lumia.models handles the interface between LUMIA and the actual CTM (\(H\));
  • lumia.prior implements the prior control vector \(x_b\) and its uncertainty matrix \(B\);
  • lumia.optimizer implements the optimization algorithm;
  • lumia.data implements the emissions (\(E\)) themselves;
  • lumia.mapping implements the mapping operator (\(M\)), that is used to convert between model data and control vector.
  • lumia.utils contains various utilities, used by several of the other lumia modules. The Utilities page also describes the gridtools module, which is distributed alongside lumia.
  • transport acts as a CTM (\(H\)) in our LUMIA inversions.
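As an illustration of what a mapping operator \(M\) can look like, the sketch below expands a control vector of coarse-resolution scaling factors onto a finer emission grid. It does not use the lumia.mapping API: the function `map_to_emissions`, the grid sizes and the values are all hypothetical.

```python
import numpy as np

# Prior emissions on a fine 4 x 6 grid (arbitrary units, made-up values).
prior_emissions = np.arange(24, dtype=float).reshape(4, 6)

# Control vector: one scaling factor per block of 2 x 3 grid cells,
# i.e. 4 control variables for 24 emission values.
x = np.array([1.0, 0.5, 2.0, 1.5])


def map_to_emissions(x, prior, block=(2, 3)):
    """Toy mapping operator M: expand the coarse scaling factors to the
    fine grid and apply them to the prior emissions."""
    n_rows = prior.shape[0] // block[0]
    n_cols = prior.shape[1] // block[1]
    coarse = x.reshape(n_rows, n_cols)
    # np.kron repeats each scaling factor over its whole block of cells.
    fine = np.kron(coarse, np.ones(block))
    return fine * prior


E = map_to_emissions(x, prior_emissions)
print(E.shape)
```

With this choice, a control vector of ones returns the prior emissions unchanged, which makes it a natural value for \(x_b\).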

Most of these modules can be used independently. For instance, the lumia.observations module defines an Observations class that can be used to read, write and analyze the observation files used in LUMIA (including LUMIA results).

Example scripts showing a full inversion are provided under the run directory, and presented in more detail in the Tutorial.

Finally, some top-level objects (classes and functions) are described under the LUMIA overview page. That page also contains more detailed information on the coding strategy of LUMIA.

LUMIA Development strategy

LUMIA was designed as a modular system: it should remain as easy as possible to replace the default CTM, and the individual components (prior, observations, optimizer, etc.) should remain as independent from each other as is reasonable.

In practice this means that the following technical choices were made:

  • The LUMIA modules contain python classes to handle the different components of the inversions (e.g. an Observations class to handle the observations, a PriorConstraints class to construct and store the prior uncertainty matrix, etc.).
  • All modules implement or support alternative versions of these classes. For instance, the "Observations" class is defined in the lumia.observations.observations module (i.e. the lumia/observations/observations.py file):
    • This approach makes space for alternative implementations (e.g. one could implement an Observations class as part of a lumia.observations.satellite_obs module).
    • It can, however, make import paths lengthy (from lumia.observations.observations import Observations, from lumia.models.footprints.data import Data, etc.). To circumvent this, many "shortcuts" have been defined in the __init__.py files of the modules (enabling e.g. from lumia.observations import Observations), and even in the top-level __init__.py file (lumia/__init__.py). These are documented in the relevant sections (e.g. in Observations) and, for the top-level objects, in the section below.
    • In several of the modules, there is a protocol.py file, which essentially contains templates for the various classes. In theory, as long as alternative classes follow the templates defined in the protocol.py files, they should not lead to major compatibility issues.
  • Although it was developed alongside LUMIA, the code that acts as a CTM (transport) remains a separate python package and is run as a subprocess. This forces us to limit the degree of interdependency between the two codes, and should therefore make it easier to implement an alternative CTM.
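The two ideas above (class templates and a CTM run as a subprocess) can be sketched with the standard library alone. The example below is only an illustration: the `ForwardModel` template, both model classes and the one-line "external model" are hypothetical, and the content of LUMIA's own protocol.py files is not shown here. It uses typing.Protocol as one possible way to express such a template.

```python
import json
import subprocess
import sys
from typing import Protocol, Sequence, runtime_checkable


@runtime_checkable
class ForwardModel(Protocol):
    """Hypothetical template: any class with a matching `run` method qualifies."""

    def run(self, fluxes: Sequence[float]) -> list[float]: ...


class InProcessModel:
    """Toy model evaluated directly in the calling process."""

    def run(self, fluxes: Sequence[float]) -> list[float]:
        return [2.0 * f for f in fluxes]


# Stand-in for an external model executable: it reads a JSON list on stdin
# and writes the "simulated" values as JSON on stdout.
CHILD_SCRIPT = (
    "import json, sys; "
    "print(json.dumps([2.0 * f for f in json.load(sys.stdin)]))"
)


class SubprocessModel:
    """Same interface, but the computation happens in a separate process."""

    def run(self, fluxes: Sequence[float]) -> list[float]:
        result = subprocess.run(
            [sys.executable, "-c", CHILD_SCRIPT],
            input=json.dumps(list(fluxes)),
            capture_output=True,
            text=True,
            check=True,  # raise if the child process fails
        )
        return json.loads(result.stdout)


def simulate(model: ForwardModel, fluxes: Sequence[float]) -> list[float]:
    """Code written against the template works with either implementation."""
    if not isinstance(model, ForwardModel):
        raise TypeError("model does not follow the ForwardModel template")
    return model.run(fluxes)


if __name__ == "__main__":
    for model in (InProcessModel(), SubprocessModel()):
        print(type(model).__name__, simulate(model, [1.0, 2.5]))
```

Because the two processes only exchange serialized data, the calling code never depends on the internals of the model, which is the kind of loose coupling described above.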