# NO₂ monitoring

TROPOMI (TROPOspheric Monitoring Instrument) (opens new window) is the name of a sensor on board of the Sentinel-5 Precursor (S5P) satellite (opens new window), developed to monitor atmospheric chemistry.

This use case analyses Sentinel 5P imagery, focusing in particular on NO₂ measurements. Compared to other variables measured by TROPOMI, NO₂ is of high interest not only because of its direct relation with environmental health, but also because the main sources are typically known, because it is not transported over long distances and because the total column values measured by TROPOMI are strong indication of the ground level values.

This document describes how to analyze NO₂ data on openEO Platform using the Python, JavaScript and R client. Additionally, we've prepared a basic Jupyter Notebook for Python and a much more advanced R Shiny app that you can run to analyze and visualize the NO₂ data in various ways.

# R Shiny app

In the R Shiny app the NO₂ values can be analysed globally over a full year. The analysis allows the user to set threshold values for the cloud cover. Gaps due to the cloud removal are filled by a linear interpolation. Noise gets removed by computing 30-day smoothed values, using kernel smoothing of the time series.

The R Shiny app allows for three different modes:

  • Time series analysis (min/max/smooted mean) and a comparison against locally measured data (for selected areas only)
  • Map visualization for individual days
  • Animated maps that highlight differences in space and time

Curently, the R shiny app can only run on your local computer but we are looking into offering a hosted version, too.

For more details please visit the homepage of the R Shiny app (opens new window).

# Basic NO₂ analysis in Python, R and JavaScript

openEO Platform has multiple collections that offer Sentinel-5P data, e.g. for NO₂:

SENTINEL_5P_L2 also contains additional data such as CO, O₂, SO₂ and you can also experiment with those. CO is also available on VITO in Collections such as TERRASCOPE_S5P_L3_CO_TD_V1.

In this example we'll use the daily composites from the TERRASCOPE_S5P_L3_NO2_TD_V1 collection, which is available starting from end of April 2018.

# 1. Load a data cube

Attention

This tutorial assumes you have completed the Getting Started guides and are connected and logged in to openEO Platform. Your connection object should be stored in a variable named connection.

First of all, we need to load the data into a datacube. We set the temporal extent to the year 2019 and choose spatial extent, here an area over Münster, Germany.

  • Python
  • JavaScript
  • R
year = 2019
extent = { # Münster
    "type": "Polygon",
    "coordinates": [[
        [7.737228350528245,51.86687168604513],
        [7.507741544165615,51.86687168604513],
        [7.507741544165615,52.05013100121914],
        [7.737228350528245,52.05013100121914],
        [7.737228350528245,51.86687168604513]
    ]]
}
datacube = connection.load_collection(
    "TERRASCOPE_S5P_L3_NO2_TD_V1",
    spatial_extent = extent,
    temporal_extent = [f"{year}-01-01", f"{year}-12-31"]
)

# 2. Fill gaps

The data cube may contain no-data values due to the removal of clouds in the pre-processing of the collection. We'll apply a linear interpolation along the temporal dimension:

  • Python
  • JavaScript
  • R
datacube = datacube.apply_dimension(dimension = "t", process = "array_interpolate_linear")

# 3. Smoothen values (optional)

If you want to smoothen the values to get rid of noise for example, we can run a moving average over a certain amount of days over the temporal dimension. If you want to work on the raw values, you can also omit this step. The moving_average_window variable specifies the smoothing in number of days. You can choose it freely, but it needs to be an odd integer >= 3. In the example below 31 was chosen to smooth the timeseries with a moving average of a full month.

  • Python
  • JavaScript
  • R
moving_average_window = 31
udf = openeo.UDF("""
from pandas import Series
import numpy as np

def apply_timeseries(series: Series, context: dict) -> Series:
    return np.convolve(series, np.ones({n})/{n}, mode='same')
""".format(n = moving_average_window))
datacube = datacube.apply_dimension(dimension = "t", process = udf)

Note

A technical detail here is that we run the moving average as a Python UDF, which is custom Python code. We don't have a pre-defined process in openEO yet that easily allows this and as such we fall back to a UDF. We embed the Python code in a string in this example so that you can easily copy the code, but ideally you'd store it in a file and load the UDF from there.

The Python UDF code (for all client languages!) itself is pretty simple:

from pandas import Series
import numpy as np

N = 31
def apply_timeseries(series: Series, context: dict) -> Series:
  return np.convolve(series, np.ones(N)/N, mode='same')

# 4. What do you want to know?

Now it's time to decide what we actually want to compute and get an insight into.

Currently, the data cube at still has 4 dimensions: The spatial dimensions x and y covering the extent of Münster, the temporal dimension t covering usually about 365 values of the given year, and the band dimension bands with just a single label NO2.

As the bands dimension only has a single label it gets dropped automatically during export, which means we now basically have a series of NO₂ maps, one for each day. We have several options now:

  1. Store the daily maps as netCDF file
  2. Reduce the temporal dimension to get a single map for the year as GeoTIff
  3. Reduce the spatial dimensions to get a timeseries as JSON

# 4.1 netCDF Export

To store daily maps in a netCDF file, we simply need to store the datacube:

  • Python
  • JavaScript
  • R
datacube = datacube.save_result(format = "netCDF")

# 4.2 Map for a year as GeoTiff

To get a map with values for a year we need to reduce the temporal dimensions by reducing the values along the temporal dimension. You can run different reducers such as mean (in the example below), max or min so that we get a single value for each pixel.

Lastly, you need to specify that you want to store the result as GeoTiff (GTiff due to the naming in GDAL).

  • Python
  • JavaScript
  • R
datacube = datacube.reduce_dimension(reducer = "mean", dimension = "t")
datacube = datacube.save_result(format = "GTiff")

# 4.3 Timeseries as JSON

To get a timeseries we need to reduce the spatial dimensions by aggregating them. You can run different aggregation methods such as mean (in the example below), max or min so that we get a single value for each timestep.

Lastly, you need to specify that you want to store the result as JSON.

  • Python
  • JavaScript
  • R
datacube = datacube.aggregate_spatial(geometries = extent, reducer = "mean")
datacube = datacube.save_result(format = "JSON")

# 5. Execute the process

Regardless of which of the options you chose in chapter 4, you can now send the process to the backend and compute the result. From simplicity, we simply execute it synchronously and store the result in memory. For the GeoTiff and netCDF files you may want to store them into files though. The JSON output you could directly work with.

  • Python
  • JavaScript
  • R
results = connection.execute(datacube)

# Result

If you'd visualize the results of running the timeseries analysis for mean, min and max could results in such a chart:

min/max/mean NO2 chart

Last Updated: 11/21/2022, 6:14:49 PM