A spatio-temporal analysis of NO_2 concentrations during the Italian 2020 COVID-19 lockdown

by   Guido Fioravanti, et al.

When a new environmental policy or a specific intervention is taken in order to improve air quality, it is paramount to assess and quantify - in space and time - the effectiveness of the adopted strategy. The lockdown measures taken worldwide in 2020 to reduce the spread of the SARS-CoV- 2 virus can be envisioned as a policy intervention with an indirect effect on air quality. In this paper we propose a statistical spatio-temporal model as a tool for intervention analysis, able to take into account the effect of weather and other confounding factors, as well as the spatial and temporal correlation existing in the data. In particular, we focus here on the 2019/2020 relative change in nitrogen dioxide (NO_2) concentrations in the north of Italy, for the period of March and April during which the lockdown measure was in force. As an output, we provide a collection of weekly continuous maps, describing the spatial pattern of the NO_2 2019/2020 relative changes. We found that during March and April 2020 most of the studied area is characterized by negative relative changes (median values around -25 week of March and the fourth week of April (median values around 5 changes cannot be attributed to a weather effect, it is likely that they are a byproduct of the lockdown measures.



There are no comments yet.


page 19

page 25

page 26


Spatio-temporal models with space-time interaction and their applications to air pollution data

It is of utmost importance to have a clear understanding of the status o...

A hierarchical spatio-temporal model to analyze relative risk variations of COVID-19: a focus on Spain, Italy and Germany

The novel coronavirus disease (COVID-19) has spread rapidly across the w...

Statistical assessment of spatio-temporal impact of lockdown on air pollution using different modelling approaches in India

One of the main contributors to air pollution is particulate matter (PMx...

A general theory for preferential sampling in environmental networks

This paper presents a general model framework for detecting the preferen...

Isometric force pillow: using air pressure to quantify involuntary finger flexion in the presence of hypertonia

Survivors of central nervous system injury commonly present with spastic...

Considerations for developing predictive models of crime and new methods for measuring their accuracy

Developing spatio-temporal crime prediction models, and to a lesser exte...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

In the last decades a large number of initiatives have been taken in European countries in order to reduce air pollution and its adverse effects on human health. This is for example the case of low emission zones or congestion charge introduced in urban areas [see e.g., Fasso2013, HOLMAN2015, Maranzano2020], of more stringent limits on the content of sulfur in marine fuels [e.g., GRANGE2019] or of new air pollution control regulations [e.g., FONT2016, HAN2021], among others [see BURNS2020, for a review]. A crucial issue in air quality management is the assessment of the efficacy of a specific intervention or a new environmental policy. In particular, quantifying the effective changes in pollutant concentrations due to the adopted strategy can be difficult because of the complexity of air pollution dynamics, strongly depending on weather conditions. Moreover, when the intervention is time-delimited and characterizes different areas, it may be interesting to evaluate the variability in space and time of its effect, if any.

The lockdown measures adopted by many countries in 2020 in order to to prevent the spread of the SARS-CoV-2 virus can be considered as a policy intervention, with a possible indirect effect on air quality. In this regard, the main focus is on assessing how the lockdown affected air pollution levels, in particular after controlling for meteorology, long term trend and other confounding factors. The literature on this topic is obviously quite recent but already large, as discussed in the two recent systematic reviews by Gkatzelis2021 and Rana2021.

In this paper, we propose a new statistical modeling approach for assessing and quantifying the effectiveness of a policy intervention on air quality. In particular, our statistical model is defined for daily differences of pollutant concentrations and has a spatio-temporal specification which gives us the possibility to estimate in space and time the relative change in air pollution levels. We show here an application of our modelling strategy for nitrogen dioxide (NO

) concentrations in northern Italy during the months of the first Italian lockdown (March and April 2020) as compared to 2019. In early 2020, Italy was the first European country to adopt strict lockdown measures [Remuzzi2020]. Starting from late February 2020, people were banned from travelling and all the non essential socio-economic activities were stopped. These restrictive measures were initially adopted in some limited municipalities in northern Italy and then were extended to the entire country with the national lockdown in force since March 8, 2020 [Malpede2020, Sanfelici2020]. The restrictions caused, incidentally, a strong decrease in anthropogenic emissions of the main air pollutants, especially for some sectors such as road transport and aviation.

Several environmental studies revealed that restrictions on mobility during the lockdown had a positive effect on NO levels, even if not homogeneously across the considered spatial domains. For example, the broadly-publicized data from the Copernicus Sentinel-5P satellite [Dutta2021, Muhammad2020, Bar2021] recorded for NO a sharp drop, in the range 20–55%, during January – April 2020 compared to 2019 in many cities in China, India, Western Europe and United States. With regards to Italy, Bauwens2020 found the average TROPOMI NO column during the lockdown period in 2020 to be between 38 ( 10%) lower than during the same period in 2019 in Milan. Similarly, Cersosimo2020 reported a general decrease in NO levels over the Po Valley during the lockdown using both satellite and in situ measurements. For the city of Rome and its surroundings, Bassani2021 documented a 2019 vs 2020 reduction of 50% in NO concentrations using surface measurements from urban traffic stations. The NO decrease for the city of Rome was also documented in Kumari2020. Finally, a general decrease in the NO levels for three cities in Tuscany region was described by Donzelli2020, for Reggio Emilia by Marinello2021 and for Naples by Sannino2021. The main weakness of the aforementioned studies consists in their descriptive approach, based on the direct comparison of pollutant concentrations before and after the lockdown or between the lockdown period and the corresponding period of the previous year(s). In other words, these studies make no attempt to adjust for the effect of meteorological conditions which can be adverse or favorable to pollutant dispersion. In this regard, it is worth to note that the first quarter of 2020 experienced exceptional weather conditions with also stronger positive temperatures anomalies over Europe [Barre2020, von2021].

A number of studies approached the problem of assessing the lockdown effects on air quality using chemical transport models (CTM) in order to compare model forecasts under the business-as-usual (BAU) emission scenario with the observed ground-level measurements or with the expected concentrations computed under a COVID-19 pandemic scenario with reduced emissions [see e.g., Barre2020, Menut2020, Piccoli2020, Putaud2020, WANG2021]. In any case, the use of deterministic models poses a number of practical and conceptual difficulties. First, collecting the needed input data (e.g., emission inventories and meteorology data) is far from straightforward. Second, deterministic models are complex to be run and are not able to properly assess the uncertainty of the results.

Another substantial research line is represented by model-based studies. In this context historical measurement data from previous years (or from the pre-lockdown period) are used to run machine learning algorithms

[see e.g., Barre2020, Diemoz2021, Keller2021, KIM2021, Grange2021, Petetin2020, Granella2021]

or to estimate statistical models, as multiple linear regression models

[e.g., BAO2020, Dacre2020, HOERMANN2021], Generalized Additive Models [e.g., EEA:2020, ORDONEZ2020, Solberg2021, Sverre2021] or Autoregressive Integrated Moving Average models [e.g., Tyagi2020]. The fitted model is then used to predict concentrations for 2020 (or for the post-lockdown period) under the BAU (or counterfactual) scenario, i.e., assuming that the lockdown did not take place. In order to correct for the weather effect and temporal trends, meteorological and time variables are included in the model as linear or non-linear effects. The differences between the predictions, derived from the estimated model, and the observed concentrations (i.e., the out-of-sample prediction errors) are then used to evaluate the effect of the lockdown restrictions. Other papers adopt a different modeling approach and define the counterfactual by using data from cities not subject to lockdown restrictions; then difference-in-differences (DID) models are used to estimate the relative change in pollutant concentrations in the treatment group (locked-down cities) compared with the control group (non-locked-down cities) [see e.g., Guojun2020, WangM2021]. An extension of the DID method is proposed in Zheng2021

with the aim of computing the would-be average concentrations without the COVID-19 pandemic by removing the meteorological confounding and accounting for the temporal trend. Another modeling strategy makes use of time series model or dynamic panel data model, where pollutant concentrations, measured in 2020 and possibly in previous years, represent the response variable. The lockdown effect is included as a time-dependent dummy variable in the set of regressors, together with meteorological and time variables

[see e.g., BAO2020, BELOCONI2021, Cameletti2020]. In this case the effectiveness of the lockdown is evaluated by means of the lockdown dummy coefficient and its interaction with time or other regressors.

Whatever the adopted model-based strategy is, pollutant concentrations time series can be analysed separately for each monitoring station or jointly. The second solution leads to more efficient parameter estimates and a better predictive capability because of the larger amount of available data. More importantly, when measurements come from several spatial sites it is convenient to account also for spatial correlation, besides the temporal one, for explaining any residual variability. This is a standard and well-established option in air pollution modeling [see e.g., Finazzi2013, sahu2006, Lee2016] given that nearby monitoring stations are expected to show similar pollutant concentrations values. As far as we know, BELOCONI2021 is the only study which implements a spatio-temporal model for evaluating the effect of COVID-19 lockdown on air quality, while all the other papers fail to consider the spatial correlation. However, in BELOCONI2021 the analysis of the impact of the lockdown is limited at the single point stations and no continuous maps are provided for the entire region of interest.

In this paper, using a spatio-temporal statistical model, we aim to produce spatio-temporal maps showing continuously in space and across time the impact of the lockdown on air pollution. We expect these highly spatially resolved maps to help in assessing if the lockdown effect was homogeneous in the considered area or was more consistent in particular zones. Having this goal in mind, we modeled the 2019/2020 daily differences of NO concentrations (in March and April), rather than jointly modeling the data available for the two years. This makes it possible to focus on the change occurred between 2019 and 2020 in the NO levels, while still accounting for the effect of meteorology. In particular, by including a spatial stochastic component, our model is able to take care of the spatial correlation between observations and generate spatially continuous prediction surfaces, also where no ground-monitoring stations are available and in remote or mountainous areas [Diemoz2021]. The model we propose is applied here to the Italian COVID-19 case study but it is a general modeling approach that can be implemented anytime it is necessary to evaluate the effect of an intervention on an output variable with a spatio-temporal dimension.

The paper is structured as follows: in Section 2 we introduce the study domain and the input data. Then, in Section 3 the proposed spatio-temporal regression model is described; details are also provided with respect to the prediction phase of the analysis. Finally, in Section 4 we describe the main results of our analysis with particular attention to the prediction maps. Section 5 concludes the paper.

2 Data material

2.1 Monitoring sites and air pollution data

In this study we focused on the Italian COVID-19 lockdown period (from March to 30 April , 2020), corresponding to 10 weeks of daily observations. The 2019 and 2020 raw NO hourly data (expressed as ), together with monitoring stations metadata, were extracted from the national database used to store and process the Italian Air quality measurements [SNPA2020]. The hourly data were measured at stations operated by the local Regional Environmental Protection Agencies, following the European standards EN 14211:2012 for NO, NO, and NO [CEN-TC264]. All ground concentrations were fully validated following internal quality assurance and quality control standard procedures.

Daily averages were computed provided that a daily 75% data coverages was achieved (i.e., at least 18 valid hourly records out of 24). The input dataset for our analysis regards 200 monitoring sites with a low proportion of missing data ( per station) and distributed across 8 regions or autonomous provinces in the north of Italy (Valle d’Aosta 4 stations, Piedmont 17, Veneto 32, Lombardy 55, Autonomous province of Trento 5, Friuli Venezia Giulia 12, Autonomous province of Bozen 5, Emilia Romagna 36) plus Tuscany region in the centre of Italy (with 34 stations). The spatial distribution of the selected stations is illustrated in Figure 1. The monitoring stations cover urban (123 stations), suburban (39) and rural (38) areas. At the time of the analysis, we were not able to access to the NO data from Liguria region (the western region filled in gray in the map in Figure 1).

Figure 1: Spatial distribution of the monitoring sites for concentrations (red points) and mesh adopted for the implementation of the spatio-temporal model described in Section 3.

The investigated area is characterized: in the north, by remote and relatively pristine areas with the high peaks of the Alps mountain system; in the southern part, by the mountains and hills of the Apennines range (the ”spine” of the Italian peninsula); in the centre, by the Po Valley. The latter is a large area which includes Lombardy, Emilia-Romagna, Veneto and Piedmont, regions which account for 48.2% of Italy’s GDP [OECD2020]. Interestingly, in early March 2020, these were the regions with the highest incidence of COVID-19 cases111https://www.salute.gov.it/imgs/C_17_pagineAree_5351_0_file.pdf. The Po Valley exhibits intense human pressure and severe pollution levels, with high population density, urban sprawl, intensive agricultural practice and livestock management [Pezzagno2020, Romano2020]. This leads to large amounts of NO emissions from vehicles, methane and ammonia from agricultural activities and particulate matter (PM) from residential heating. The Alps and the Apennines often limit the air flows between the Po valley and the rest of continental Europe causing air pollution stagnation.

Table 1 provides the summary statistics for the 2019 and 2020 NO daily concentrations. A certain variability in the NO values is observed across areas, month and year. Nonetheless, it is apparent that lower NO measurements characterize the 2020 months compared with 2019. For example, the monthly NO mean values are in the range 13 - 35 in March 2019 and 12 - 26 in April 2019, while in March and April 2020 the same values vary between 11 - 24 and 7 - 17 , respectively. This means that for the lockdown months of 2020 the range of the monthly mean concentrations is approximately halved compared with the same months of the previous year. The same situation can be observed also in the median and maximum values.

Area 2019 2020
Mean SD Min Median Max Mean SD Min Median Max
March Bolzano (AP) 34.7 11.5 11 33 72 24.2 11.6 4 22 55
Trento (AP) 30.7 17.8 3 30 92 19.6 12.5 3 18 61
Emilia Romagna 24.8 16.0 ¡1 22 85 16.3 10.6 ¡1 14 57
Friuli Venezia Giulia 21.1 9.9 3 20 53 14.2 9.1 ¡1 13 42
Lombardy 32.1 15.5 2 30 99 22 11.6 2 20 85
Piedmont 27.5 16.3 3 24 92 17.8 9.1 3 16 45
Tuscany 21.6 13.9 ¡1 20 80 15.5 10.7 ¡1 14 62
Valle d’Aosta 13.0 8.5 1 11.5 33 11.1 9.1 1 10.0 50
Veneto 27.1 12.6 3 27 68 17.8 10.1 2 16 49
April Bolzano (AP) 26.2 9.2 7 25 50 16.8 4.8 6 17 29
Trento (AP) 22.7 13.7 3 21.5 58 13.5 6.2 4 13 33
Emilia Romagna 19.3 13.2 ¡1 16 72 10.9 6.6 ¡1 9 40
Friuli Venezia Giulia 14.2 7.8 1 13 37 9.7 4 1 9 25
Lombardy 23.0 13.6 1 20 97 15.6 7.8 1 14 58
Piedmont 19.8 10 3 18 73 11.9 6.1 1 11 31
Tuscany 19.7 13.3 ¡1 17 83 10.1 6.2 1 9 33
Valle d’Aosta 12.0 7.9 ¡1 11 32 6.8 3.5 1 6 14
Veneto 18.3 11.2 2 16 62 11.6 6.3 2 10 35
Table 1:

Summary statistics - mean, standard deviation (SD), minimum (Min), median, and maximum (Max) - for NO

concentrations (in ), for the considered region or autonomous province (AP) and for March and April of years 2019 and 2020.

2.1.1 Daily differences

To match the 2019 and 2020 daily NO concentrations and compute the corresponding daily differences, we excluded the use of the calendar date. The reason for this choice is that we don’t want to match business days in 2020 with weekend days in 2019, and vice versa, given the well-known NO weekly cycle (characterized by lower values during the weekend). Rather, we aligned the 2019 daily measurements of each monitoring site by week number (according to the ISO-8601 standard) and day of the week (Monday, Tuesday, …, Sunday) taking as a reference those of the 2020 daily measurements. A similar approach is described in Ruan2020 to analyze the impact of COVID-19 on electricity consumption in the US. By doing so, we were able to compare the first Monday of the first week in March 2020 with the first Monday of the first week in March 2019, the first Tuesday of the first week in March 2020 with the first Tuesday of the first week in March 2019, and so on. However, even using this approach it can happen that a public national holiday (e.g., Easter Monday or April 25) in 2020 is matched with a working day in 2019, and vice versa.

The parallel plot in Figure 2 graphically shows the 2019 and 2020 NO aligned datasets. Here, the daily data are visually aggregated at the weekly level. The plot nicely reveals the decrease in the NO concentrations during March and April 2020, compared with 2019, especially during the last two weeks of March. At the same time, the positive increments occurring across the weeks in some of the stations suggest that it would be wrong to think that, if the lockdown affected the NO concentrations levels, such an effect was a homogeneous phenomenon both in time and space.

We conclude this section observing that a small fraction (about ) of aligned daily measurements exhibits unusual combinations of extremely high values in one year and extremely low values in the other. These combinations correspond to 2019/2020 relative changes greater than 100% in absolute value, which abnormally lie outside the general pattern seen in our input dataset. These data have no meaning for the purposes of our analysis and were discarded.

Figure 2: Aligned 2019-2020 daily NO concentrations during the weeks of March and April. Each line represents a couple of aligned data of a monitoring site, distinguishing between decrements (blue lines) and increments (green lines) in NO levels. The different government decrees and rules of the Italian lockdown are reported as text in the boxes. Each weekly plot depicts 1400 lines (200 monitoring sites times 7 days). Note that after the alignment process the first week of March contains just one day (Sunday, 200 lines) and the last week of April only two days (Monday and Thursday, 400 lines).

2.2 Spatial and spatio-temporal regressors

For each monitoring site, we retrieved its meteorological conditions and spatial characteristics for a total number of 13 independent explanatory variables (see Table 2), whose selection was driven by expert knowledge [FIORAVANTI2021].

To describe part of the spatial variation of the NO daily differences, three geographical variables (constant in time) were considered: the linear distance from the the nearest major road, the altitude and the percentage of agricultural and arable lands. YEGANEH2018 found that traffic volume and congestion data for all of the individual roads can effectively improve the spatio-temporal modelling of NO concentrations. Unfortunately, this information is not available for the whole investigated domain.

How weather affects NO concentrations is assessed using 10 variables (such as total precipitation, wind speed, relative humidity, surface pressure) retrieved from the ERA5-Land dataset from the Copernicus Climate Data Store (https://climate.copernicus.eu/). ERA5-Land is a regridded version (9 Km grid spacing) of the ERA5 climate reanalysis [Hersbach_et_al_2020] of the European Centre for Medium-Range Weather Forecasts (ECMWF). The use of ERA5 data is documented, among others, by Chan2021 for investigating the effect of COVID-19 pandemic on NO concentrations over Germany.

At this point it is important to stress that the meteorological variables enter the model as daily differences. More precisely, for the meteorological daily data we applied the same alignment procedure used for the NO concentrations and described in Section 2.1.1. Afterwards, we calculated the daily differences which we used as model regressors, as explained in Section 3, where the details of the statistical model are given. It is worth to note that the inclusion in the model of the differenced meteorological variables alleviates potential multicollinearity problems and makes it possible to include in the same model regressors like temperature and surface pressure, which otherwise would be highly correlated.

Data Source Description Unit Spatial Resolution
Minimum Planet Boundary Layer Km Km
Maximum Planet Boundary Layer Km
Total Precipitation mm
Surface Pressure hPa
Average Temperature at 2 meters C
Wind Speed m/s
Previous day Wind Speed m/s
Relative Humidity %
Net Irradiance
Diurnal Temperature Range C
Global Multi-resolution Terrain Elevation Data
Altitude m  1 Km
Linear distance to the nearest primary road m 1 km
Corine Land Cover
Agricultural and arable lands % 1 km
Table 2: Spatial and spatio-temporal predictors included in the model.

3 Bayesian spatio-temporal model

Consider a couple of NO concentrations and temporally aligned according to the procedure described in Section 2.1.1, where denotes the location (with ) and the day () of month (where denotes March) of year 2019 and 2020, respectively.

As the objective of this study is to evaluate the change in NO levels between 2019 and 2020, we first of all defined the daily differences of the log-transformed NO concentrations:


The logarithmic transformation is a common choice in air pollution analysis in order to reduce the typical positive skewness observed in concentrations distributions


As the Italian lockdown fell almost at the end of the winter season, when usually the meteorological conditions favor the dispersion of the pollutants, a downward trend in NO concentrations is expected across March and April. This could have a confounding effect when trying to isolate the impact of COVID-19 lockdown measures on air quality. In order to control for this long-term trend, we decided to model the daily differences separately for the two months (March and April).

The measurement equation is given by


where is the Gaussian measurement error assumed to be independent in space and time. For the sake of simplicity in Equation (2) we omitted the index given that the model structure is the same for the considered months.

The mean level is then defined as the sum of fixed and random effects as follows:


where is the intercept and the linear temporal trend which should account for the short term variation across days in the month. The term is the

-dimensional vector of the purely spatial regressors, while

is the -dimensional vector of the differences computed using the values of the meteorological regressors, as described in Section 2.2. We assume a linear effect for the spatial and spatio-temporal covariates by means of the parameters’ vector and , respectively. In order to test whether or not there was a variation in the Sunday effect during the lockdown restrictions, we included also the dummy variable which is equal to 1 when is Sunday and zero otherwise. The corresponding coefficient represents the additional expected change in the mean difference during Sunday. The term represents a temporally and spatially uncorrelated Gaussian random effect which captures some of the small scale spatial variability. Finally, is the residual space-time correlation for which the following first order autoregressive dynamics was specified:

for and . The innovations

have a zero-mean Gaussian distribution, are uncorrelated in time (i.e.,

if ) while being spatially correlated, i.e., , where is the Euclidean distance between site and and

is the Matérn covariance function with variance

and range . For more details about this separable spatio-temporal covariance structure see for example Cameletti:2011, Cameletti2013.

3.1 Prior specification and model implementation

The model described in Section 3 is developed in the Bayesian framework and is fully specified once prior distributions are set.

Vague Gaussian priors were used for , and for the elements of and . For the remaining parameters Penalized Complexity (PC) priors [simpson2017] were used. PC priors are a relatively new approach to specify weakly informative priors [Lemoine2019] in realistically complex statistical models with the twofold purpose of penalizing model complexity and avoiding overfitting.

For the standard deviation parameters (here and ) PC priors are generally defined as , where

is a quantile of the prior and

is a probability value. In our analysis we set

and for both and . This choice can be motivated considering that the total standard deviation of the observed daily differences of the log-transformed NO concentrations is in each month, so it is very likely that the variance of each component is lower than 1.

The joint PC prior suggested in Fuglstad2019 was used for and . This can be specified through

where we set , , , . Finally, for the autocorrelation parameter we used the PC prior proposed in Sorbye_ar1. This can be specified through , where we set and . All these choices reflect both previous findings [see e.g. Cameletti2013, FIORAVANTI2021] and restrictions to the possible values of and .

Inference was carried out by using the INLA-SPDE approach [Martino2020, Lindgren2015]

, which has been proved to be computationally faster than the Markov chain Monte Carlo (MCMC) approach commonly used for Bayesian inference.

3.2 Prediction and posterior summary statistics

Once the model has been fitted to the observed data, we used Monte Carlo (MC) simulation to generate a large number of samples (say 1000) from the posterior predictive distribution of

for any location in the study area and day :


where is the vector of all the model parameters, whose uncertainty is averaged out given all the observed data . For simplicity the generic sampled value will be denoted by .

To generate prediction maps, we used a raster grid with a spatial resolution of 1 km. For each pixel of this grid, corresponding to the generic location , we simulated 1000 values from Equation (4). While computing the predictions, we set the values of equal to zero for each and . This corresponds to assume that the meteorological conditions are equal in 2019 and 2020 for each location and time point.

We derived the posterior distribution of the relative change of NO concentrations between 2019 and 2020 by using the following transformation:


where takes negative (positive) values if lower (higher) NO concentrations are expected in 2020 compared to 2019, and equal to zero in case of no change.

Finally, we averaged the 1000 MC samples from the daily distributions of

at the week temporal resolution. The result is a collection of 1000 predicted raster maps for each week, from which it was possible to obtain the maps of the posterior mean and of the 2.5% and 97.5% quantiles. These latter two identify the bulk of the posterior distribution of each grid cell and were used to determine the 95% credible interval. When a credible interval does not include the zero, the corresponding relative change statistically differs from zero at the significance level of 0.05.

Finally, it is noteworthy to say that to take into account the weekly cycle of NO concentrations, we generated both mean and quantile maps distinguishing between working days (Monday - Saturday) and weekends (Sunday).

3.3 Implementation and data availability

R software was used for the model implementation by means of the R-INLA package (https://www.r-inla.org). For the manipulation of the raster maps, we used the R raster package (https://cran.r-project.org/web/packages/raster/index.html) and the CDO software (https://code.mpimet.mpg.de/projects/cdo). Input data and part of the code used for this study are available through a dedicated GITHUB repository (https://github.com/progettopulvirus/A_spatiotemporal_analysis_of_NO2_concentrations).

4 Results and discussion

4.1 Model parameters

Figure 3: Posterior distribution of (Intercept), (Day), (Sunday) and of the covariate coefficients and . When the bulk of the posterior distribution lies far from zero it can be concluded that the corresponding parameter is significantly different from zero.

Figure 3 shows the monthly posterior distributions for the fixed effects’ coefficients (, , , and ). Generally speaking, we observe that a significant effect, invariant in sign across the months of March and April, characterizes most of the selected regressors. The shape of the posterior distributions is rather stable in the models for the two months, but some exceptions are apparent. The distribution of the linear trend coefficient is narrower in March than in April, while the opposite happens for the posterior distribution of the intercept . This is in line with the large variability which characterizes the weekly prediction maps of April (see Section 4.3). Finally, the surface pressure coefficient exhibits two almost flat posterior distributions, but this result is not of easy interpretation.

A significant negative linear trend coefficient was found in March, which suggests a decreasing trend for the daily differences of the log-transformed NO concentrations in the month when the lockdown restrictions occurred. A significant negative effect was also found for the Sunday coefficient . We can interpret this result saying that a weekly cycle persists even when we consider the difference of the log-transformed NO concentrations. Surprisingly, neither the linear trend nor the Sunday dummy variable coefficients are significant in April.

With regards to the meteorological parameters, we distinguish those with a positive significant effect (the diurnal temperature range, the average temperature at 2 meters, the relative humidity and surface pressure) and those with a negative significant effect (the min and max planet boundary layer, the net irradiance, the wind speed and the total precipitation). Conversely, all the spatial regressors (elevation, % of agricultural/arable land and distance to major roads) show a positive posterior mean. This could suggest that far from the urbanized centers and the road network the level of the NO daily concentrations in 2020 tends to be equal or greater than the ones in 2019.

Table 3 provides the posterior summary statistics for the remaining model parameters. We observe that the posterior mean of the AR(1) autocorrelation coefficient , the spatial range and the standard deviation have a larger posterior mean in April than in March.

March 0.64 (0.023) 74 (4.0) 0.16 (0.011) 0.21 (0.004) 0.37 (0.013)
April 0.80 (0.021) 97 (5.5) 0.22 (0.015) 0.22 (0.003) 0.42 (0.021)
Table 3: Posterior mean and standard deviation (in parentheses) of the parameters in the models for March and April. : AR(1) coefficient; and : range (in km) and standard deviation of the Matérn spatial covariance function; : standard deviation of the small scale spatial random effect; : standard deviation of the Gaussian measurement error.

4.2 Model validation

(a) Training stage - March
(b) Training stage - April
(c) Validation stage - March
(d) Validation stage - April
Figure 8: Agreement between modelled and measured relative change () in the training (top) and validation stage (bottom). The solid red line is the 1:1 line as a reference.

For validation purposes, we stratified the input monitoring sites according to their type classification (urban, suburban and rural stations). Within each of these groups, we sampled 10% of the stations in order to define a validation dataset. The remainder of the stations (training dataset) was used to fit the model and the fitted model was used to predict the daily differences of the log-transformed NO concentrations (see Equation (1)) on the validation dataset. Both the sampling and the estimation process were repeated three times for each month. Comparing predicted and observed values allows to investigate how the model performes on unseen data, specifically if the model generalizes well or suffers from overfitting/underfitting.

Plots and summary statistics are shown as percentage relative changes (see Equation (5)) for ease of interpretation of the validation results. The scatterplots in Figure 8 shows the distribution of the fitted values versus the observed values. The points spread uniformly along the diagonal line, showing good agreement between observed and modelled data both in the training and validation stage. As expected, modelled relative change for the validation stage shows greater variability than in the training stage. Also the Pearson correlation coefficients support the good model performance with a value of 0.9 for the training stage and of for the validation stage. Finally, the Root Mean Squared Error is and for the training and validation stage, respectively.

4.3 No 2019/2020 relative change maps

Using the predictors described in Section 2.2 as individual raster files, the procedure described in Section 3.2 was implemented to obtain maps of the relative change of NO concentrations between 2019 and 2020, separately for the weeks of March and April.

The maps in Figure 13 and 19 show the spatial pattern of the 2019/2020 relative change of NO concentration in the north/centre of Italy for March and April, respectively. These maps refer to the weekly averaged estimates obtained from working days (Monday - Saturday), while the weekly Sunday estimates are available in the Annex (see Section 6). This distinction between working days and Sunday maps reflects our choice of including a dummy regressor for the Sunday effect in the model and its statistical significance observed in March. For visualization purposes, we limited the data range between -100% and +100% and used a scientifically derived color map [Crameri2020]. Furthermore, the islands from the Tuscan Archipelago are not displayed and Liguria region, where no daily NO concentrations were available, is represented with a white background.

It is worthwhile to stress that the maps presented in this section must be intended as meteorology-normalized maps. This means that they were generated assuming that, in each cell of the raster grid, the daily meteorological conditions are exactly the same in 2019 and 2020. Mathematically, this assumption is equivalent to setting to zero all the meteorological terms in our spatio-temporal model (see Section 3.2).

Figure 13 and 19 reveal a substantial decrease in NO concentrations during March and April 2020 as compared to 2019. A statistically significant reduction persists during the third and fourth working weeks of March across the whole study area: we quantified the interquartile range of the corresponding relative changes distribution to be between -40% and -20%, as shown in the boxplot in the left panel of Fig. 22. Notably, the third and fourth week of March correspond to the stringent phase of the COVID-19 lockdown. The barplots of Figure 23 indicate that Lombardy, Veneto, Friuli Venezia Giulia and Emilia Romagna are the regions where significant reductions in the NO concentrations mostly occurred. Conversely, Tuscany region in the centre and the mountainous regions of Valle d’Aosta and the Autonomus Provinces of Bolzano and Trento are those where most of the estimated changes have no statistical significance.

In April the predicted surfaces show more spatial variability than in March and all the weekly distributions of the relative changes (right panel of Fig. 22) exhibit positive increments up to 100%. These results suggest that the NO concentrations began to recover in April. Surprisingly, the fourth week of April is overwhelmed by a significant increment in the agricultural/arable area of the Po Valley (median increment around 10%), while a spatially homogeneous decrease, like the one observed in the second half of March, occurs only during the third week of the month (interquartile range between -40% and -20%). Note that during most of April 2020 the lockdown restrictions were the same of the third and fourth week of March. Looking at the maps of Figure 19, it is apparent that the urbanized belt of the Po valley shows a persistent significant negative change during almost all the weeks of April. This is confirmed by the barplots of Figure 23. Despite this variegated situation, the weekly distributions of the relative changes still are dominated by negative values with median values around -30%, with the exception of the fourth week.

Generally speaking, the estimates from the Sunday maps (see Section 6) are consistent with those obtained for working days. The most pronounced decline of the NO concentrations occurs during the last two Sundays of March: we find a -40% median relative change in 2020 as compared to the same period in 2019. The same drop in the NO concentrations is quantified for the second week of April (when Easter 2020 occured).

(a) March - First week
(b) March - Second week
(c) March - Third week
(d) March - Fourth week
Figure 13: Weekly maps of the NO 2019/2020 relative change (%) for the working days (Monday - Saturday). Contour lines mark those areas where the relative change is statistically significant at the significance level of 0.05.
(a) April - First week
(b) April - Second week
(c) April - Third week
(d) April - Fourth week
(e) April - Fifth week
Figure 19: Weekly maps of the NO 2019/2020 relative change (%) for the working days (Monday - Saturday). Contour lines mark those areas where the relative change is statistically significant at the significance level of 0.05.
(a) March
(b) April
Figure 22: Distribution of the NO 2019/2020 relative change (%) for the working days (Monday - Saturday) maps.
Figure 23: Percentage of cells in the prediction maps for working days (Monday - Saturday) with positive (green color scale) and negative (blue color scale) 2019/2020 relative changes. Darker colors are used for significant variations.

5 Conclusions

Capturing the spatio-temporal variation in pollutant concentrations is of keen interest for the air pollution management’s community. One of the reasons is that the effectiveness of environmental policies or interventions, or even exceptional events as the COVID-19 lockdown, must be evaluated across the whole study domain, in order to understand if the change is homogeneous in space or peculiar spatial patterns occur. This analysis is made even more difficult by the fact that air pollution dynamics strongly depends on weather conditions; thus, it is necessary to disentangle the effect of the adopted intervention by controlling for the known confounding factors. Chemical transport model can be used for this purpose. They can assess, with a very detailed spatial and temporal resolution, variations due to changes in the emission burden arising from interventions, like the implementation of air quality plans. However, some drawbacks exist: CTMs need very detailed input data (e.g., emission inventories, meteorology data), are complex to be run and rely on advanced IT infrastructure for their simulations. More importantly, they do not provide an estimate of the uncertainty of the final predictions which are also strongly dependent on the initial and boundary conditions. Moving to statistical models, it is straightforward to deal with uncertainty and spatio-temporal correlation structures, while keeping the computational costs at a reasonable level (especially if computationally efficient estimation methods, like the INLA-SPDE approach, are used).

In this paper we propose the use of a statistical Bayesian spatio-temporal model as a novel tool for assessing - in time but more importantly in space - the effectiveness of air quality policy interventions. As a case study, we focused on how the COVID-19 lockdown affected air pollution levels in the north/centre of Italy, by means of the 2019/2020 NO concentrations relative change, the main output of our modeling strategy. Even if our proposal does not explicitly simulate physical and chemical reactions in the atmosphere (like CTMs do), it has two key advantages: it is parsimonious in terms of data needs and, possibly more important, it naturally manages the uncertainty associated to the parameters and map outputs. The proposed method, applied to a real-world experiment, demonstrated its capability to provide a reliable picture of the temporal and spatial patterns of the NO variations (compared to 2019), while accounting for the effect of meteorology.

In this study, the spatio-temporal variation of the NO concentrations in March and April 2020, as compared with the same period in 2019, is illustrated through meteorology-normalized spatially continuous maps at weekly intervals, both for working days (Monday - Saturday) and weekends (Sunday). Our results show that during March and April 2020 the study domain was generally characterized by negative relative changes in the NO concentrations with median values around -25%. Such estimate seems to be reasonably consistent with previous findings about NO levels during the lockdown in Europe. However, the message from our output maps is richer than what a simple number can describe. First, a visual inspection of the maps shows that statistically significant reductions mainly occurred in the urban areas of the Po valley and Tuscany, while not significant variations persisted over the mountainous regions of Valle d’Aosta and the Autonoumous Provinces of Bolzano and Trento. Second, and more interestingly, our weekly analysis of NO highlight that in March 2020 such reduction was synchronous with the lockdown measures adopted, while in April 2020 the concentrations, as compared to 2019, started to recover in some parts of the investigated area. To the authors of this study, this result comes not unexpected as it reflects the behaviour of the input data seen in the parallel plot of Figure 2. At the same time, it comes as a surprise that none of the statistical studies we examined documented an increase in the NO concentrations in April in the studied Italian domain. However, we are aware that previous findings are based on different models, data, and methodologies, so our results are not directly comparable, especially if we consider that all these studies do not provide continuous maps as an output.

It is worthwhile to observe that our modeling approach does not allow us to draw causal inference conclusions. However, given that the maps account for the weather effect, we can conclude that the reductions we observe across regions and weeks can be attributed to a factor different from the meteorology. Given that they occur in the same period of the restrictive measures, it is likely that the COVID-19 lockdown had an effect in the reduction of air pollution. Nevertheless, we can not exclude that other factors, not considered in the model, could have ad an active role in the dynamics of NO.

The model described in this paper takes inspiration from the model introduced in FIORAVANTI2021 for the spatio-temporal assessment of the daily log-transformed PM concentrations in Italy. As here our objective was to describe the relative changes across the two considered years (2019 vs 2020), in this study the target variable is the daily differences of the log-transformed NO concentrations. As far as we know this modeling strategy has never been used before in the context of spatio-temporal models for intervention analysis. Other approaches could have been adopted, still in the context of spatio-temporal modeling. For example, the same problem could have also been tackled by, first, jointly modeling the 2019 and 2020 NO concentrations and then computing the differences between the daily gridded predictions. However, this alternative solution has two important shortcomings: 1) it is not consistent with the AR(1) assumption, as our observations are not consecutive in time but are one year apart (March-April 2019 and 2020); 2) it is more time consuming as the size of the input dataset would be as twice as the size of the input dataset when daily differences are considered. Furthermore, the use of daily differences alleviates multicollinearity among regressors and allows to include more parameters in the regression equation. For this reason we believe that the spatio-temporal model we propose represents a valid solution to analyze pollutant concentrations measured by a network of monitoring stations before and after a given event of interest (e.g., the lockdown intervention). Interestingly, our model allows a straightforward spatial assessment of NO variations through high-resolution estimates, since the posterior predictive distributions of NO differences can virtually be derived for every pixel within the study domain. This is of keen interest for policy makers involved in intervention analysis or epidemiologist assessing population exposure change and gradients, since the assessment at urban hot spot, the between-city variability as well as the comparison among rural and urban context can be achieved at the finest resolution, with a reasonable computational effort. In this regard, our approach could be usefully replicated whenever the effects of an intervention aimed to reduce the pollutant emissions at local or regional scale have to be assessed. Last but not least, the proposed model is simple and can be easily extended to a larger spatial domain. This means that it could be also applied to other pollutants or, more generally, spatio-temporal phenomenon which are continuous in space and for which it is interesting to evaluate the change in consecutive years.

6 Appendix

(a) March - First week
(b) March - Second week
(c) March - Third week
(d) March - Fourth week
(e) March - Fifth week
Figure 29: Weekly maps of the NO 2019/2020 relative change (Sunday). Contour lines mark those areas where the relative change is statistically significant at the significance level of 0.05.
(a) April - First week
(b) April - Second week
(c) April - Third week
(d) April - Fourth week
Figure 34: Weekly maps of the NO 2019/2020 relative change (Sunday). Contour lines mark those areas where the relative change is statistically significant at the significance level of 0.05.
(a) March
(b) April
Figure 37: Distribution of the estimated NO 2019/2020 relative change (%) for the Sunday maps.

7 Acknowledgments

This research has been carried out within the framework of the PULVIRUS partnership agreement between ENEA, ISPRA, SNPA and ISS.

We wish to thank for their helpful contribution to the air quality and meteorological data pre-processing Raffaele Morelli, Walter Perconti and Arianna Trentini. In addition, we wish to thank all the Regional Environmental Protection Agencies (ARPA Valle d’Aosta, ARPA Piemonte, ARPA Lombardia, ARPA Veneto, APPA Bolzano, APPA Trento, ARPA Friuli, ARPAE Emilia Romagna, ARPA Toscana) that manage the air quality networks and provide fully validated NO hourly data.

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author contributions

Guido Fioravanti, Michela Cameletti and Sara Martino conceived the presented model. Guido Fioravanti prepared the data with the support of Giorgio Cattani and performed computations. Michela Cameletti took the lead in writing the manuscript. Sara Martino verified the analytical methods. Giorgio Cattani and Enrico Pisoni contributed to the interpretation of the results. All authors discussed the results and contributed to the final version of the manuscript.

Financial disclosure

None reported.

Conflict of interest

The authors declare no potential conflict of interests.