Recurrent U-net: Deep learning to predict daily summertime ozone in the United States

08/16/2019 ∙ by Tai-Long He, et al. ∙ University of Toronto ∙ University Corporation for Atmospheric Research ∙ NASA ∙ Carnegie Mellon University ∙ USTC

We use a hybrid deep learning model to predict June-July-August (JJA) daily maximum 8-h average (MDA8) surface ozone concentrations in the US. A set of meteorological fields from the ERA-Interim reanalysis as well as monthly mean NO_x emissions from the Community Emissions Data System (CEDS) inventory are selected as predictors. Ozone measurements from the US Environmental Protection Agency (EPA) Air Quality System (AQS) from 1980 to 2009 are used to train the model, whereas data from 2010 to 2014 are used to evaluate the performance of the model. The model captures the daily, seasonal, and interannual variability in MDA8 ozone across the US well. Feature maps show that the model captures teleconnections between MDA8 ozone and the meteorological fields, which are responsible for driving the ozone dynamics. We used the model to evaluate recent trends in NO_x emissions in the US and found that the trend in the EPA emission inventory produced the largest negative bias in MDA8 ozone between 2010 and 2016. The top-down emission trends from the Tropospheric Chemistry Reanalysis (TCR-2), which is based on satellite observations, produced predictions in best agreement with observations. In urban regions, the trend in AQS NO_2 observations provided ozone predictions in agreement with observations, whereas in rural regions the satellite-derived trends produced the best agreement. In both rural and urban regions the EPA trend resulted in the largest negative bias in predicted ozone. Our results suggest that the EPA inventory is overestimating the reductions in NO_x emissions and that the satellite-derived trend reflects the influence of reductions in NO_x emissions as well as changes in background NO_x. Our results demonstrate the significantly greater predictive capability that the deep learning model provides over conventional atmospheric chemical transport models for air quality analyses.




1 Introduction

Tropospheric ozone is a major air pollutant and a greenhouse gas. It is produced photochemically by the oxidation of hydrocarbons in the presence of nitrogen oxides (NOx = NO + NO2). As a consequence, ozone abundances near the surface are at a maximum in summer. Due to its high oxidative capability, high abundances of ozone near the surface are associated with adverse impacts on human health and crop yield. There are significant variations in surface ozone in the United States on both short and long time scales reflecting the influence of meteorology, non-linearity in the ozone chemistry, and changes in the emissions of ozone precursor gases. Atmospheric models used to simulate the distribution of ozone typically do not reproduce the observed long-term trend in tropospheric ozone. Furthermore, these models tend to overestimate summertime surface ozone abundances in the United States. For example, in an evaluation of 16 global models and one hemispheric model, Reidmiller et al. [38] found that the models overestimated summertime maximum daily 8-hour average (MDA8) ozone in the eastern United States by 10–20 ppb. Recently, Travis et al. [48] found that the high-resolution regional version of the GEOS-Chem model has an MDA8 bias of +5 ppb, which is reduced to +1 ppb when the model evaluation is restricted to afternoon hours under dry conditions, with proper accounting for the vertical gradient of ozone in the lowest model layer. The Travis et al. [48] results highlight the challenges of using conventional chemical transport models for air quality applications. Here, we apply a hybrid deep learning model to predict June-July-August (JJA) MDA8 ozone in the United States using meteorological and chemical predictors. Compared to existing atmospheric models, the deep learning approach offers superior predictive capability for summertime ozone, better accounting for the coupling between meteorology and emissions [10].

Stagnant weather conditions and high surface temperatures are favorable for the occurrence of extreme surface ozone episodes in summer. Previous studies have used statistical methods to investigate the relationship between large-scale atmospheric circulation patterns and summertime surface ozone [11, 42]. Recent achievements in deep learning [14] over the past few years show that empirical models are able to learn both spatial and temporal patterns in the input data. The application of deep learning models has achieved great success in computer vision and natural language processing. As discussed in [37], deep learning approaches have the potential to improve our predictive ability and understanding in a wide range of challenges we have in Earth science. There have been several previous studies using deep learning models for classification problems in the Earth Sciences [25, 2, 13, 5]. For example, Liu et al. [31] used a deep convolutional neural network to detect extreme events in a climate dataset. However, not many state-of-the-art deep learning architectures have been employed for predictive applications in the Earth Sciences. In this study, we propose a deep learning model that is able to learn the spatiotemporal dynamics of summertime ozone in the United States. We show that the model captures well both long-term and short-term variability in MDA8 ozone over the United States, and that the model is able to provide predictions where no in situ observations are available. We also utilize the fitted model to conduct a qualitative analysis of the teleconnections between MDA8 and its predictors.

Atmospheric NOx is a key precursor of tropospheric ozone. Due to air pollution regulations, NOx emissions have declined significantly since the 1990s. However, there is uncertainty about the recent trends in NOx emissions in the United States. NOx emission estimates inferred from satellite observations (referred to as top-down estimates) suggest that there has been a slowdown in the reduction rate since 2009, compared to the bottom-up emission inventory reported by the US Environmental Protection Agency (EPA) [20]. However, it has also been suggested [44] that the slowdown in the reduction rate in the satellite-derived emission estimates does not indicate a discrepancy with the EPA emission inventory, but instead is due to the increasing relative influence of non-anthropogenic NOx emissions on atmospheric NOx as captured by the satellite measurements. It was also suggested in [27] that the satellite-derived trends are consistent with the trends in surface observations of NOx in high emission regions and that the discrepancy between the top-down and bottom-up trends is due to non-linearity in the relationship between NOx emissions and the satellite observations of NO2 in rural regions. Here we use the deep learning model to evaluate the recent trends in NOx emissions in the United States. The deep learning model is independent of the chemical errors that are typically found in atmospheric chemical transport models used in this type of evaluation (e.g. [44]). We demonstrate that the model is therefore an ideal tool for air quality assessment studies.

2 Summertime ozone predictors

Large-scale atmospheric circulation patterns, sea surface temperatures (SSTs), and sea level pressure (SLP) have an impact on year-to-year variability of summertime ozone in the eastern United States [42]. The dynamical patterns in these meteorological fields in the previous spring have been shown to have relatively high correlation with summertime MDA8 in the eastern United States, and could be used as predictors in statistical models. Reductions in NOx emissions have driven the negative trend in surface ozone in the United States [6]. Simulations suggest that the 95th percentile of summertime ozone would have slightly increased over this period (1990–2010) in the absence of NOx emission regulations [30]. We have therefore selected the following MDA8 ozone predictors, focusing on the June-July-August (JJA) period: anthropogenic emissions of NOx and non-methane volatile organic compounds (NMVOCs), mean sea level pressure (MSLP), geopotential at the 500 hPa level (Z), downward shortwave radiation (SSRD), sea surface temperature (SST), 2-meter temperature (T2M), and 2-meter dew point (D2M). The NOx emissions are separated into seven emission sectors: agriculture (AGR), the power industry (ENE), the manufacturing industry (IND), residential and commercial (RCO), international shipping (SHP), surface transportation (TRA), and waste disposal (WST). The sector-based NOx emissions provide geospatial information to the neural networks, which helps with the regression and localization of ozone levels.

3 Data

The meteorological data are from the ERA-Interim reanalysis [7] from the European Centre for Medium-Range Weather Forecasts (ECMWF), which have been regridded to a horizontal resolution of . The NOx emissions are from the Community Emissions Data System (CEDS) inventory [18]. All the data are cropped to a regional domain extending from 0N to 72N, and from 180W to E to encompass the North Pacific and the North Atlantic, where strong linkages were found between ocean forcing and summertime climate in the eastern United States [42, 45, 46, 12].

MDA8 ozone was estimated from ozone measurements from the EPA Air Quality System (AQS). The MDA8 ozone values were aggregated to grid boxes. With a 37-year AQS MDA8 record extending from 1980 to 2016, we used JJA data from 1980 to 2009 as the training data set. During the training process, the last 15% of the data were used as a validation data set, in order to prevent overfitting the model. The MDA8 data from 2010 to 2014 were used as the testing set to evaluate the performance of the model.
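
The partitioning described above can be illustrated with a short sketch. It is not the authors' code: it assumes that one sample corresponds to one JJA day and that "the last 15%" refers to the chronologically latest training samples. The names `jja_days` and `split_samples` are hypothetical.

```python
from typing import Dict, List, Tuple

Day = Tuple[int, int, int]  # (year, month, day)


def jja_days(year: int) -> List[Day]:
    """List every June, July, and August day of one year."""
    days_in_month = {6: 30, 7: 31, 8: 31}
    return [(year, m, d) for m in (6, 7, 8) for d in range(1, days_in_month[m] + 1)]


def split_samples(val_fraction: float = 0.15) -> Dict[str, List[Day]]:
    """Split JJA days into training, validation, and testing sets."""
    # Training pool: 1980-2009. Testing set: 2010-2014.
    pool = [day for year in range(1980, 2010) for day in jja_days(year)]
    test = [day for year in range(2010, 2015) for day in jja_days(year)]
    # Assumption: the validation set is the chronologically last 15% of the pool.
    n_val = int(round(val_fraction * len(pool)))
    cut = len(pool) - n_val
    return {"train": pool[:cut], "val": pool[cut:], "test": test}


splits = split_samples()
print({name: len(days) for name, days in splits.items()})
```

Holding out the latest samples, rather than a random subset, keeps the validation data temporally separated from most of the training data, which is one reasonable way to read the description in the text.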

4 A deep learning model to predict MDA8

Most modern deep learning models are built using convolutional neural networks (CNNs) and recurrent neural networks (RNNs). CNNs are the most fundamental model in computer vision and are able to efficiently learn spatial correlations in data. The spatial correlation is captured by convolutional filtering processes in the CNN layers (described in Appendix A). Another operation that is frequently applied is max pooling, which is similar to convolutional filtering, except that the convolution is replaced by a simple max transformation (described in Appendix B) that is used to further reduce data dimensionality and to extract dominant features. CNNs can be used as encoders that project vectors from a high-dimensional input space to a low-dimensional latent space. After each convolutional layer, the data are extracted and compressed to higher dimensional latent vectors, which are also called features or activations. A schematic of the model, which is a fully convolutional network (F-CNN), is given in Fig. 1. The input layer has 13 channels for the ozone predictors. The input information gets compressed by eight convolutional blocks and three max pooling layers to extract the dominant features in the input data.

To capture the dynamics in the data we rely on RNNs (described in Appendix C). These were developed for sequential forecasting problems [41], and without them the dependency on the previous model state cannot be used in the later analysis. The RNN model used in this study is the long short-term memory (LSTM) cell [16], which is used to capture temporal correlations hidden in the data. In this study, the dynamics captured by the LSTM model include both short-term daily variability and long-term trends in MDA8 ozone. The idea of combining the LSTM and CNN was first proposed by Shi et al. [43] for better nowcasting of precipitation. Their model showed that the hybrid deep learning model is capable of capturing short-term dynamics, which is quite challenging in the field of weather forecasting. For the prediction of summertime ozone in the United States, the problem is also challenging due to the coupling of the local meteorological conditions, the non-linear ozone chemistry, and the long-term trends in ozone. Consequently, to better capture the summertime ozone dynamics, we made the model deeper by stacking 3 LSTM cells in series.

After the input information gets compressed by the convolutional blocks, and the dynamics are captured by the LSTM cells in the final latent vector, the compressed information is projected to the output layer via a decoder that consists of a sequence of transposed convolutional layers and upsampling layers (as shown in Fig. 1). Following Ronneberger et al. [39], we added skip connections (as shown in Fig. 1) that forward the high-resolution features extracted by the encoder to the decoder for better localization of the features learned by the deep learning model. High-resolution features extracted from the encoding layers are fused during the decoding process of the latent vector, which helps with the localization of important patterns [39]. Without these operations of feature fusion, the deep learning model will have difficulty in determining the most important regions where predictions should be made. The whole deep learning model has about 55 million trainable parameters, which are the convolutional kernels in the CNNs and the weights and biases in the RNNs and fully connected layers.

The cost function to be optimized is defined as the mean squared error calculated in each grid box as follows:

$$ J = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2 $$

where $\hat{y}_i$ and $y_i$ are the predicted and observed MDA8 ozone for sample $i$, and $N$ is the number of samples. The square of the Pearson correlation coefficient between predictions and observations is also used as an auxiliary metric of model performance,

$$ R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2} $$

where $\bar{y}$ is the mean of observed JJA MDA8. This performance evaluation is only computed in grid boxes where AQS measurements are available. This way the optimization of the model is not influenced by the imperfect observational coverage of the AQS data. The back-propagation algorithm is used to train the neural network (see Appendices A and C), with the ADAM optimization algorithm for faster convergence [23].
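
The restriction of the metrics to grid boxes with AQS measurements can be sketched as follows. This is a minimal illustration, not the authors' implementation: missing observations are represented by `None`, and $R^2$ is computed as one minus the ratio of the residual to the total sum of squares, which is an assumption about the exact form used. The function name `masked_metrics` is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def masked_metrics(pred: Sequence[float],
                   obs: Sequence[Optional[float]]) -> Tuple[float, float]:
    """Return (MSE, R^2) using only the entries where an observation exists."""
    # Keep only prediction/observation pairs with a valid observation.
    pairs = [(p, o) for p, o in zip(pred, obs) if o is not None]
    if not pairs:
        raise ValueError("no observations available")
    n = len(pairs)
    mse = sum((p - o) ** 2 for p, o in pairs) / n
    obs_mean = sum(o for _, o in pairs) / n
    ss_res = sum((o - p) ** 2 for p, o in pairs)
    ss_tot = sum((o - obs_mean) ** 2 for _, o in pairs)
    if ss_tot == 0.0:
        raise ValueError("observations have zero variance")
    return mse, 1.0 - ss_res / ss_tot


# Hypothetical MDA8 values in ppb; the third grid box has no AQS site.
mse, r2 = masked_metrics([50.0, 40.0, 99.0, 62.0], [48.0, 42.0, None, 60.0])
print(mse, r2)
```

Because the unobserved grid box is dropped before any sum is formed, an arbitrary prediction there (99 ppb in the example) has no effect on either metric.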

Figure 1: Deep learning model to predict JJA MDA8. The model consists of an input layer with 13 channels for the ozone predictors, 8 convolution and 3 max pooling layers to extract the dominant features in the data, and 3 stacked LSTM cells to capture the dynamics in the data. Compressed data are then passed to transposed convolution layers for projection to the output layer. The two arrows at the top indicate the skip connections that forward the high-resolution features extracted by the encoder to the decoder for better localization of the features.

5 Evaluation of the model performance

Figure 2: Observed (top left) and predicted (top right) mean JJA MDA8 ozone during 2010-2014. Also shown (bottom left) is the absolute error (in ppb) for the predicted minus observed MDA8 ozone. The boxes indicate the domains for the CONUS (green box), the northeastern (blue box) and southeastern (red box) United States, and the West Coast (yellow box) that are used in the regional analysis. Model predictions are made everywhere in the United States, but the errors are calculated only where the AQS observations are located.
Figure 3: Observed (orange line) and predicted (blue line) daily (top row), 7-day averaged (middle row), and 30-day averaged (bottom row) JJA MDA8 ozone (in ppb) during training of the model (1980-2009). Shown are the time series for the CONUS (first column), the northeast (second column), the southeast (third column), and the west coast (last column). The regional definitions are shown in Fig. 2.
Figure 4: As in Fig. 3, but for the testing data set (during 2010–2014).

After training the model using data from 1980 to 2009, we evaluate its performance using data from the subsequent five years. As shown in Figure 2, the predicted MDA8 ozone concentrations are in good agreement with the AQS ozone observations. However, the mean errors show regional differences. The largest discrepancies are in the central United States and in the Intermountain West, where the model underestimates MDA8 ozone. The model also overestimates ozone in parts of the eastern United States by 2–4 ppb.

We find that the deep learning model is able to capture both the short-term and long-term dynamics of MDA8 ozone well. The time series of observed and predicted daily MDA8 ozone in the training data (for 1980–2009) are plotted in Figure 3 for the four regions shown in Figure 2 (the contiguous United States (CONUS), the northeastern and southeastern United States, and the west coast). For the CONUS, the model accounts for 96% of the variability in the training data. The time series of testing data (for JJA 2010–2014) are plotted in Figure 4 and the error statistics are given in Table 1. The model is able to capture 85%, 86%, and 76% of the variability of MDA8 in the northeastern, southeastern, and western United States, respectively. In contrast, Reidmiller et al. [38] found that atmospheric models captured only 59%, 71%, and 46% of the JJA MDA8 ozone variability in the Northeast, Southeast, and California, respectively.

The mean error is less than 1 ppb in the eastern United States, which is significantly smaller than the 10–20 ppb by which conventional model simulations typically overestimate JJA MDA8 ozone in the eastern US [38]. Across the CONUS, the model underestimates MDA8 ozone by about 1 ppb. The large negative bias on the West Coast (see Table 1) and in the central United States (Fig. 2) could be due to the absence of emissions of NOx from soils in the predictors, which have been shown to be a major source of NOx in these regions [19, 1].

The square of the correlation coefficient for the MDA8 ozone in each grid box is plotted in Fig. 5. The predicted MDA8 ozone over the United States has ubiquitously high correlations with the observations. However, low correlations are found in the Intermountain West, where there are fewer AQS observations. Also, this region is strongly influenced by free troposphere background ozone abundances rather than local or regional precursor emissions [50]. Including wind fields and wildfire emissions as additional predictors may improve the predictability of MDA8 ozone in the Intermountain West, as wildfires and transport from California could have a large impact. The year-to-year variability of surface ozone is also shown to be related to stratospheric intrusions in spring [50, 29] and the emissions of NOx from lightning in summer [50]. Thus, incorporating meteorological fields related to stratospheric intrusions and lightning could potentially provide further improvement. For remote regions like Hawaii, Alaska, and Puerto Rico, shown in Fig. 5, the correlation is limited by the lack of sufficient observations.

We analyzed the extracted information from the fitted deep learning model (see Appendix D) and found that the patterns in the feature map show teleconnections between the meteorological fields, particularly over the Pacific and Atlantic oceans, and MDA8 ozone. Analysis of the NOx emission predictors indicates that the NOx emission sectors provide discriminative regional information. To demonstrate the importance of the meteorological fields for capturing the ozone variability we conducted an experiment in which we trained the model with only the meteorological predictors. The results of this experiment are shown in Fig. 6 and Table 1. Using only the meteorological predictors, the model captures the ozone variability as well as when the NOx emissions are included. For the CONUS, the correlation obtained with only the meteorological predictors is comparable to that obtained with the meteorological and NOx emission predictors. However, without accounting for the reductions in NOx emissions, the model overestimates ozone by about 7 ppb in the eastern US and by 4-5 ppb across the CONUS (as indicated in Table 1). We note that even with this degraded performance, the fidelity of the deep learning model is still better than most atmospheric models. Including NOx emissions as a predictor is critical for capturing the long-term trend in MDA8 ozone.

The oxidation of volatile organic compounds (VOCs), in the presence of NOx, is a source of ozone that is not explicitly accounted for in the deep learning model. This is a particularly important source of ozone in the southeastern US, where there are large biogenic emissions of VOCs. But estimates of biogenic VOCs are highly uncertain. The widely used Model of Emissions of Gases and Aerosols from Nature (MEGAN), for example, overestimates biogenic VOCs in the southeastern United States [32]. As a result, we chose not to include biogenic VOCs as a predictor in the model. However, biogenic VOC emissions are influenced by local meteorological conditions and we expect that some of the ozone variability induced by these emissions will be captured by the meteorological predictors.

In regions of high NOx emissions, ozone production will increase with emissions (biogenic and anthropogenic) of VOCs and decrease with increasing NOx emissions. This VOC-limited regime will not be captured by the model since it does not include VOC emissions as a predictor. However, we note that the main urban regions in the United States, with the exception of some urban cores, are NOx-limited (ozone production increases with increasing NOx emissions) in summer, and that these regions have become more NOx-limited since 2005 as a result of reductions in NOx emissions [8, 21]. Furthermore, at the coarse resolution used here, the deep learning model will not capture these VOC-limited urban regions. Thus, neglecting VOC emissions here should not adversely impact the performance of the model regarding changes in VOC-limited and NOx-limited regions.

Figure 5: Correlation ($R^2$) between the observed and predicted MDA8 ozone in each grid box during the testing period (2010–2014).
Figure 6: As in Fig. 4, but for the experiment in which only the meteorological predictors were used in the DL model.
Meteorological and NOx Predictors Meteorological Predictors
NOx trend Mean Error (ppb) Mean Error (ppb)
Northeastern US
Southeastern US
West coast
Table 1: Regional error statistics for the model evaluation in the period of 2010–2014 for the model configured with the meteorological and NOx emissions predictors and for the experiment using only the meteorological predictors.

6 Trend of anthropogenic NOx emissions over the United States after 2010

Figure 7 shows the trend in the annual mean NOx emissions from the EPA bottom-up inventory as well as from top-down emission estimates from Jiang et al. [20] and the Tropospheric Chemistry Reanalysis (TCR-2) [33, 34, 35]. Compared to Jiang et al., TCR-2 used an updated data assimilation system and improved satellite retrievals. As can be seen, there is good agreement in the NOx emission trend in the different inventories between 2005, when the top-down inventories became available, and 2010. However, after 2010 the top-down inventories suggest a significant slowdown in the rate of reduction of NOx emissions in the United States [20]. Included in Fig. 7 is the trend in surface NO2 observations from the EPA AQS network. The AQS NO2 trend suggests a smaller reduction in NOx emissions than the EPA inventory between 2005 and 2010, but the slowdown is not as pronounced as that observed in the top-down inventories.

Evaluating these emission trends using conventional atmospheric chemical transport models is challenging due to the fact that those models are impacted by deficiencies in the employed chemical mechanisms and dynamical parameterizations. The deep learning model captures the physical and chemical mechanisms between MDA8 and its predictors based on the input in situ and meteorological data only, and is able to mitigate the impact of a majority of sources of error in conventional atmospheric models. Here we use the deep learning model to assess the consistency of the bottom-up and top-down NOx emission inventories with observed MDA8 ozone.

To evaluate the trends in the NOx emissions, we use the trained model to predict MDA8 ozone from 2010 to 2016 using the CEDS NOx emissions scaled by the different annual trends shown in Fig. 7. The CEDS inventory is scaled as follows:

$$ \tilde{E}_{k}(m) = \alpha_{k} \, E_{\mathrm{CEDS}}(m) $$

where $E_{\mathrm{CEDS}}(m)$ is the CEDS emissions in month $m$, $\alpha_{k}$ is the annual scaling factor that captures the trend shown in Fig. 7 for a given inventory $k$, and $\tilde{E}_{k}(m)$ is the resulting scaled NOx emissions used in the model prediction of MDA8 ozone.
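
The scaling step can be sketched in a few lines. The sketch assumes that a set of reference monthly CEDS emissions is multiplied by one factor per year, which is one reading of the description above; the function name `scale_ceds` and all numerical values are hypothetical and are not taken from the inventories.

```python
from typing import Dict, Tuple


def scale_ceds(reference_monthly: Dict[int, float],
               annual_factors: Dict[int, float]) -> Dict[Tuple[int, int], float]:
    """Scale reference monthly emissions by an annual trend factor.

    reference_monthly maps month -> CEDS emission (arbitrary units).
    annual_factors maps year -> scaling factor for one inventory trend.
    """
    return {
        (year, month): factor * emission
        for year, factor in annual_factors.items()
        for month, emission in reference_monthly.items()
    }


# Hypothetical JJA emissions and a hypothetical declining trend.
ceds_jja = {6: 10.0, 7: 12.0, 8: 11.0}
trend = {2010: 1.0, 2011: 0.9, 2012: 0.85}
scaled = scale_ceds(ceds_jja, trend)
print(scaled[(2011, 7)])
```

Applying the same reference emissions with different factor tables (one per inventory) changes only the emission trend seen by the model, so differences in predicted ozone can be attributed to the trend alone.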

The time series of predicted and observed MDA8 ozone are plotted in Fig. 8 and the error statistics are shown in Table 2. The observed AQS NO2 trend results in a mean error across the United States that is statistically indistinguishable from the standard results obtained with the CEDS inventory (in Fig. 4). In contrast, the EPA trend results in a larger bias, which is clearly visible in Fig. 8. The EPA trend also results in the largest root-mean-square errors (RMSE) (see Fig. 8). The TCR-2 trend produces the smallest error relative to the MDA8 ozone observations.

The greater consistency of the top-down trends with the surface ozone observations could be due to the fact that the inversion analyses that produced these emission estimates incorporated satellite observations of tropospheric ozone and NO2, so the inferred NOx emissions reflect the trends in both tropospheric ozone and NO2. We note that in a sensitivity analysis (see Appendix E) in which we retrained the model using data from 1980-2005 and predicted MDA8 ozone for 2005-2016, the predicted ozone based on the bottom-up and top-down trends were consistent with each other and with the observed AQS ozone between 2005 and 2010. After 2010, however, the EPA trend produced the largest negative bias in predicted ozone, whereas the top-down trends were in better agreement with the AQS ozone observations.

It was suggested [44, 27] that the discrepancy between the top-down and bottom-up NOx emission estimates could be due to the fact that the satellite observations of NO2 are more representative of non-anthropogenic NOx in rural regions after 2010. In Table 3 the error statistics for the model predictions for 2010–2016 are aggregated in "urban" and "rural" regions, defined according to whether NOx emissions in a given grid box are greater than or less than a threshold emission flux (in molec cm$^{-2}$ s$^{-1}$), respectively, following Li and Wang [27]. The NOx emissions scaled by the observed AQS NO2 trend produce ozone predictions with the smallest errors and highest correlations in the urban regions. For the rural regions, the best performance is obtained with the top-down NOx trends, which show smaller differences between urban and rural regions. Our results agree with previous studies [44, 27], suggesting that the top-down NOx trends are affected by a combination of anthropogenic NOx emissions and rural NOx conditions. Our results also show that the bottom-up EPA trend is more inconsistent with observed ozone than either the AQS or top-down trend.
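
The urban and rural aggregation can be sketched as below. The emission threshold is not recoverable from this copy of the text, so the value used in the example is purely a placeholder, and the function name `regional_mean_error` is hypothetical. Grid boxes whose emissions equal the threshold are assigned to the rural group here, which is an arbitrary choice.

```python
from typing import Dict, List, Optional, Sequence


def regional_mean_error(pred: Sequence[float],
                        obs: Sequence[Optional[float]],
                        nox_emission: Sequence[float],
                        threshold: float) -> Dict[str, Optional[float]]:
    """Mean error (predicted minus observed) for urban and rural grid boxes."""
    errors: Dict[str, List[float]] = {"urban": [], "rural": []}
    for p, o, e in zip(pred, obs, nox_emission):
        if o is None:
            continue  # no AQS observation in this grid box
        group = "urban" if e > threshold else "rural"
        errors[group].append(p - o)
    # A group with no observed grid boxes has no defined mean error.
    return {g: (sum(v) / len(v) if v else None) for g, v in errors.items()}


# Hypothetical ozone values (ppb) and emissions (arbitrary units).
stats = regional_mean_error(
    pred=[60.0, 55.0, 40.0, 42.0],
    obs=[62.0, 54.0, 41.0, None],
    nox_emission=[5.0, 4.0, 1.0, 0.5],
    threshold=2.0,  # placeholder, not the value used in the paper
)
print(stats)
```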

Figure 7: Relative change (normalized to 2005) in annual mean anthropogenic NOx emissions for the United States from the bottom-up EPA inventory (blue line) and from the top-down inventories from TCR-2 (green line) and Jiang et al. [20] (red line). Also shown is the trend in AQS NO2 measurements (orange line).
Figure 8: Top row: Observed and predicted daily mean (top left) and monthly mean (top right) MDA8 ozone between 2010–2016 (2010–2015 for Jiang et al.). Shown are the AQS ozone observations (black line) and the model predictions based on the NOx emissions scaled by the EPA (blue line), AQS NO2 (orange line), TCR-2 (green line), and the Jiang et al. (red line) trends. Bottom row: The monthly RMSE (bottom left) and monthly mean errors (bottom right) for the MDA8 ozone predictions in the first row.
NOx trend Mean Error (ppb)
Jiang et al.
Table 2: MDA8 ozone error statistics for the CONUS for 2010–2016 (2010–2015 for Jiang et al.).
Urban Rural
NOx trend Mean Error (ppb) Mean Error (ppb)
Jiang et al.
Table 3: Urban and rural MDA8 ozone error statistics for the CONUS for 2010–2016 (2010–2015 for Jiang et al.).

7 Conclusions

We have developed an ozone prediction system based on state-of-the-art deep learning models to predict summertime daily MDA8 ozone in the United States. The model uses 13 predictors, including large-scale meteorological variables and sector-specific anthropogenic emissions of NOx. The model was trained with observed summertime MDA8 ozone data from 1980 to 2009 and tested with data from 2010 to 2014. We found that the model captured the daily variability in MDA8 ozone across the United States well. Regionally, the model has high predictability of ozone in the eastern United States and on the west coast, but low predictability in the Intermountain West. The negative bias is largest on the west coast, whereas the model has a slight positive bias (less than 1 ppb) in the eastern United States. Feature maps show that the deep learning model captures the teleconnections between MDA8 ozone and the meteorological predictors, which are responsible for driving the daily, seasonal, and interannual variations in predicted ozone. The NOx emissions are important for capturing the long-term negative trend in surface ozone in the United States.

We used the model to evaluate recent trends in NOx emissions after 2010 in the context of the model predictions of surface ozone. We found that between 2010 and 2016 the trends in NOx emissions in the bottom-up EPA inventory resulted in the largest underestimate of MDA8 ozone across the United States. In contrast, the trend in NOx emissions consistent with the trend in AQS NO2 observations resulted in summertime MDA8 ozone predictions that were in better agreement with the AQS ozone observations, and were consistent with those obtained with the CEDS bottom-up inventory. The best agreement with the AQS ozone observations was obtained with the top-down NOx emission trends from TCR-2 [33, 34, 35] and Jiang et al. [20]. Examination of the error statistics aggregated into urban and rural regions revealed that in urban regions the AQS trend provided ozone predictions in agreement with observations, whereas in rural regions the top-down trends produced the best agreement with observations. Our results suggest that the top-down trends are capturing changes in anthropogenic and non-anthropogenic NOx after 2010. The EPA trend produced the largest errors in ozone in both urban and rural regions, suggesting that the EPA inventory is overestimating the reduction in NOx emissions after 2010.

This deep learning architecture is generic. It can be utilized to realize other high-dimensional predictions, given the spatial and temporal dynamics in the data. Although we aggregated the MDA8 ozone data to a coarse grid, the model is able to deal with various spatial resolutions. Despite the greater predictive skill of our deep learning approach compared to conventional atmospheric chemical transport models, other modern machine learning algorithms have the potential for even greater gains in performance. For example, deep Gaussian process (GP) models are frequently used in modelling temporal dynamics. To mitigate the impact of limited observational coverage, generative adversarial networks (GANs) could be used to interpolate data gaps using the learned correlations between predictions and predictors. These algorithms could offer superior performance for air quality and other Earth Science applications.

8 Acknowledgments

This work was supported by the Natural Sciences and Engineering Research Council of Canada. Computations were performed on the Graham supercomputer of Compute Ontario and Compute Canada. Part of this work was conducted at the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration (NASA).

Appendix A Convolutional neural networks and error back-propagation

For a set of two-dimensional kernel matrices K, the convolutional filtering process convolves each kernel with the input data, whose entries are indexed along its two major axes. The result of the convolution is then transformed by the activation function to give the output tensor, which is referred to as the feature map and is indexed along its two major axes and by the kernel index.
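As an illustration of the filtering step described above, the following pure-Python sketch computes one feature map per kernel. The function names, the "valid" boundary handling, and the choice of ReLU as the activation function are assumptions made for this example, not details taken from the model itself.

```python
def conv2d_valid(x, kernel):
    """'Valid' 2-D convolution of a nested-list array x with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(x) - kh + 1
    out_w = len(x[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for m in range(out_h):          # index along the first axis of the output
        for n in range(out_w):      # index along the second axis of the output
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    # A true convolution flips the kernel along both axes.
                    total += x[m + i][n + j] * kernel[kh - 1 - i][kw - 1 - j]
            out[m][n] = total
    return out


def relu(z):
    """Entrywise activation (ReLU is an assumption for this sketch)."""
    return [[max(0.0, v) for v in row] for row in z]


def feature_maps(x, kernels):
    """Return one activated feature map per kernel."""
    return [relu(conv2d_valid(x, k)) for k in kernels]


x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernels = [[[1, 0], [0, -1]], [[-1, 0], [0, 1]]]
print(feature_maps(x, kernels))
```

The second kernel is the negative of the first, so its response is negative everywhere and is set to zero by the activation.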

The optimization of CNNs is done using the back-propagation algorithm [40, 24]. To back-propagate error information through CNNs, we need the partial derivatives of the cost function with respect to the convolutional and pooling kernels. These derivatives depend on the entrywise differences between the prediction and the true output, and are obtained by applying the chain rule and the commutativity of the convolution operation. At each learning step, the kernels are updated along the gradients, with a step size set by the learning rate, which can be tuned with a sensitivity test. The optimization of a convolutional layer can therefore be expressed as another convolution operation.
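The point that the kernel gradient is itself a sliding-window operation can be checked numerically. The sketch below is a minimal example under stated assumptions: it uses cross-correlation (no kernel flip) instead of a flipped convolution to keep the indexing simple, a squared-error cost, no activation function, and hypothetical function names and toy values.

```python
def correlate2d_valid(x, k):
    """'Valid' sliding-window product of x with k (cross-correlation, no flip)."""
    kh, kw = len(k), len(k[0])
    return [[sum(x[m + a][n + b] * k[a][b] for a in range(kh) for b in range(kw))
             for n in range(len(x[0]) - kw + 1)]
            for m in range(len(x) - kh + 1)]


def loss(x, k, target):
    """Squared-error cost (an assumption for this sketch)."""
    y = correlate2d_valid(x, k)
    return 0.5 * sum((y[m][n] - target[m][n]) ** 2
                     for m in range(len(y)) for n in range(len(y[0])))


def kernel_gradient(x, k, target):
    """Gradient of the cost with respect to the kernel."""
    y = correlate2d_valid(x, k)
    # Entrywise differences between prediction and true output.
    delta = [[y[m][n] - target[m][n] for n in range(len(y[0]))]
             for m in range(len(y))]
    # The gradient is the input slid against the error map,
    # i.e. the same operation as the forward pass.
    return correlate2d_valid(x, delta)


def sgd_step(k, grad, lr):
    """Move the kernel against the gradient with learning rate lr."""
    return [[k[a][b] - lr * grad[a][b] for b in range(len(k[0]))]
            for a in range(len(k))]


x = [[1, 2, 0], [0, 1, 3], [2, 1, 1]]
k = [[0.5, -0.2], [0.1, 0.3]]
target = [[1, 0], [0, 1]]
grad = kernel_gradient(x, k, target)
k_new = sgd_step(k, grad, lr=0.01)
print(loss(x, k, target), loss(x, k_new, target))
```

For a sufficiently small learning rate the second printed cost is lower than the first, which is the behaviour a sensitivity test on the learning rate would look for.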

Appendix B Max pooling layers

The max pooling process replaces each window of the input, whose height and width are the dimensions of the max filter, with the maximum value found in that window.
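A minimal pure-Python sketch of max pooling follows. It assumes non-overlapping windows (stride equal to the filter size) and drops any incomplete window at the edges; the function and argument names are hypothetical.

```python
def max_pool2d(x, pool_h, pool_w):
    """Max pooling over non-overlapping pool_h x pool_w windows."""
    out_h = len(x) // pool_h   # incomplete windows at the edge are dropped
    out_w = len(x[0]) // pool_w
    return [[max(x[m * pool_h + p][n * pool_w + q]
                 for p in range(pool_h) for q in range(pool_w))
             for n in range(out_w)]
            for m in range(out_h)]


x = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
print(max_pool2d(x, 2, 2))  # [[6, 8], [14, 16]]
```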

Appendix C Recurrent neural networks and error back-propagation

The flow of information in the LSTM cell is filtered by three gates: the forget, input, and output gates. At each time step, these gates estimate prior errors based on the previous model state and control the amount of information from new observations to be stored in the cell state.

The feed-forward process of the LSTM network at each time step can be expressed as:


where the network takes an input vector and returns an output vector at each time step, and the prior and posterior model states were designed specifically to characterize the latent state of the dynamics.
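For reference, the standard LSTM formulation [16] that is consistent with this description can be written as below. The symbols are assumptions chosen for this sketch: $x_t$ and $h_t$ are the input and output vectors, $\tilde{c}_t$ and $c_t$ are the prior and posterior model states, $f_t$, $i_t$ and $o_t$ are the forget, input and output rates, $[h_{t-1}, x_t]$ is the augmented state, $\sigma$ is the sigmoid function, and $\odot$ is the entrywise product.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

In this form, the fifth line is the update of the posterior model state and the sixth line is the final posterior analysis.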

The first three equations represent the computations made at the forget, input, and output gates, respectively. The input vector at the current time step is concatenated with the output vector from the previous analysis to form an augmented state, which is then transformed by the corresponding weight and bias term at each gate. The gates are normalized by the sigmoid activation function to lie between 0 and 1, where 0 means that a gate blocks everything and 1 means that it passes everything. The forget rate and input rate can be thought of as the estimated errors associated with the previous posterior model state and the prior model state at the current time step, respectively. The new posterior model state is then computed by solving Equation (12). The final posterior analysis is computed by activating the posterior model state and weighting it by the output rate. Here all the indices are omitted, and all operations are entrywise.
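The gate computations described above can be sketched as a single forward step in pure Python. This follows the standard LSTM cell [16], with tanh assumed as the activation of the model states; the function names, the `params` dictionary and its keys are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, b, z):
    """Apply weight matrix W (list of rows) and bias b to vector z."""
    return [sum(w * zj for w, zj in zip(row, z)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One feed-forward LSTM step; vectors are plain lists."""
    z = h_prev + x_t  # list concatenation gives the augmented state
    f = [sigmoid(v) for v in affine(params["Wf"], params["bf"], z)]  # forget rate
    i = [sigmoid(v) for v in affine(params["Wi"], params["bi"], z)]  # input rate
    o = [sigmoid(v) for v in affine(params["Wo"], params["bo"], z)]  # output rate
    # Prior model state at the current time step.
    c_prior = [math.tanh(v) for v in affine(params["Wc"], params["bc"], z)]
    # Posterior model state: weighted sum of previous posterior and current prior.
    c = [ft * cp + it * ct for ft, cp, it, ct in zip(f, c_prev, i, c_prior)]
    # Final posterior analysis, weighted by the output rate.
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


# Toy example: hidden size 1, input size 1, all weights and biases zero,
# so every gate sits at 0.5 and the prior model state is 0.
zero = {"Wf": [[0.0, 0.0]], "bf": [0.0], "Wi": [[0.0, 0.0]], "bi": [0.0],
        "Wo": [[0.0, 0.0]], "bo": [0.0], "Wc": [[0.0, 0.0]], "bc": [0.0]}
h, c = lstm_step([1.0], [0.0], [2.0], zero)
print(h, c)
```

With all gates at 0.5, the new posterior state is half of the previous one, which shows directly how the forget rate scales the information carried forward.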

To embed the LSTM cells in other neural networks, the back-propagation of error gradients has to be formulated. The training algorithm used for LSTM units is called back-propagation through time (BPTT). It is driven by the output error at each time step: the training starts from the cost function calculated at the last time step and propagates backwards. The optimization runs iteratively along the whole data sequence until the cost function is minimized.
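To make the backward sweep concrete, here is a minimal sketch of BPTT. It deliberately swaps the LSTM for a scalar vanilla recurrent unit, $h_t = \tanh(w h_{t-1} + u x_t)$, with a squared-error cost at the last time step, so it illustrates the idea only and is not the LSTM training procedure itself. All names and values are hypothetical.

```python
import math


def rnn_forward(xs, w, u, h0=0.0):
    """Return the list of states [h_0, h_1, ..., h_T] of a scalar recurrent unit."""
    hs = [h0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs


def final_loss(xs, y, w, u):
    """Squared-error cost evaluated at the last time step."""
    return 0.5 * (rnn_forward(xs, w, u)[-1] - y) ** 2


def bptt(xs, y, w, u):
    """Gradients of the final cost with respect to w and u."""
    hs = rnn_forward(xs, w, u)
    dh = hs[-1] - y                       # output error at the last time step
    dw = du = 0.0
    for t in range(len(xs), 0, -1):       # propagate backwards in time
        da = dh * (1.0 - hs[t] ** 2)      # back through the tanh activation
        dw += da * hs[t - 1]              # contribution of step t to dL/dw
        du += da * xs[t - 1]              # contribution of step t to dL/du
        dh = da * w                       # error passed to the previous state
    return dw, du


xs, y, w, u = [0.5, -1.0, 0.25], 0.3, 0.8, -0.4
print(bptt(xs, y, w, u))
```

The same weights are reused at every time step, which is why their gradients are accumulated over the whole sequence before an update is made.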

Appendix D Analysis on teleconnections between ozone predictability and large-scale circulation patterns

The convolutional kernels in each layer of the deep learning model, which are optimized along the gradients of the cost function, control the amount of information passed into subsequent layers. Although the features extracted by the convolutional layers are unitless and difficult to relate directly to physical parameters, they can be used as a high-dimensional representation of hidden correlations with the model output. We plot in Fig. 9 the feature maps from the Conv2, Conv4, Conv6, and Conv8 convolution layers in the encoder part of the deep learning model illustrated in Fig. 1. Throughout the process of information compression, consistently strong features are located over the continent, corresponding to the local influence of NOx emissions. In Fig. 9 (a) and (b), some activated features can also be found in the northeastern Pacific Ocean, which represent teleconnections between large-scale circulation patterns and MDA8 ozone [42, 45, 46, 12].

Figure 9: Feature maps from the 2nd, 4th, 6th, and 8th convolutional layers as illustrated in Fig. 4, averaged over all training data from 1980 to 2009.

Feature maps from the first convolutional layer are of particular interest because they represent the sensitivity of the predicted JJA MDA8 ozone and can be separated by input predictor. These features can be used to analyze the discriminative sensitivity of the deep learning model to each ozone predictor. Inspired by [49, 51], we compute channel-wise activation maps (ChAM) to analyze the significance of each ozone predictor for ozone predictability. The ChAM is defined as



where the indices run over the ozone predictors and the convolutional kernels, and the operator denotes the convolution operation. We compute ChAM using the testing data set and plot the average feature maps for the meteorological predictors in Fig. 10. We find that for Z, MSLP, SSRD and SST, the central Pacific Ocean and the Atlantic Ocean are the most discriminative regions for ozone predictability. For D2M and T2M, the largest features are mainly located over the southwest and the Gulf Coast. Remarkably, even though SST is only defined over the ocean, the convolution operation has a smearing effect and the model can still capture the continental teleconnections between MDA8 ozone and SST. All meteorological predictors show sensitivity around Hawaii and in the Atlantic, which is consistent with a previous study of teleconnections between MDA8 ozone and both SST and MSLP [42].

Figure 10: Averaged feature maps for the meteorological predictors, calculated from the training data set.

Unlike the meteorological predictors, the activations for NOx emissions are very local, which is likely due to the short lifetime of NOx near the surface. Fig. 11 shows the ChAM for the NOx emission sectors. The NOx emissions related to agriculture and waste handling mainly affect ozone predictability in the southern US, although on average they have weaker activations in the deep learning model than the other sectors. Overall, ENE, IND and TRA are the most important contributors to ozone predictability.

Figure 11: Average activation maps for NOx emissions, calculated from the training data set. The seven NOx emission sectors are as follows: (a) agriculture (AGR), (b) the power industry (ENE), (c) manufacturing (IND), (d) residential and commercial (RCO), (e) international shipping (SHP), (f) surface transportation (TRA), and (g) waste disposal (WST).

Appendix E Restricting training period from 1980-2009 to 1980-2005

To evaluate the model for 2005-2016, we retrained it on data from 1980 to 2005, keeping all other experimental settings the same as in Section 6, and tested it on 2005-2016. The time series of the predicted and observed MDA8 ozone are plotted in Fig. 12, and the error statistics for 2005-2009 and 2010-2016 are given in Tables 4 and 5, respectively. Between 2005 and 2010, the MDA8 ozone predictions based on the different NOx trends all show good consistency over the US. However, after 2010, the bottom-up trends in NOx resulted in an underestimation of MDA8 ozone relative to that from the top-down trends. The divergence is clearly visible in the time series of the monthly mean errors in Fig. 12, with the EPA-based trend producing the largest RMSE and negative bias after 2010. The degraded performance for the 2010-2016 period relative to the results in Section 6 is due to the reduced length of the training period.
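The two error statistics referred to here can be computed as in the sketch below. It assumes that the error is defined as prediction minus observation, so that an underestimate gives a negative mean error; the function names and the sample values are purely illustrative and are not taken from the results.

```python
import math


def mean_error(pred, obs):
    """Mean error (bias) in the units of the data, e.g. ppb."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)


def rmse(pred, obs):
    """Root-mean-square error in the units of the data."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


# Made-up MDA8 ozone values (ppb) for illustration only.
predicted = [40.0, 42.0, 38.0]
observed = [44.0, 42.0, 40.0]
print(mean_error(predicted, observed))  # -2.0, i.e. a negative bias
print(rmse(predicted, observed))
```

The mean error shows the sign of a systematic bias, while the RMSE also penalizes errors that cancel in the mean, which is why both are reported.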

Figure 12: Top row: Observed and predicted daily mean (top left) and monthly mean (top right) MDA8 ozone for 2005–2016 (2005–2015 for Jiang et al.). Shown are the AQS ozone observations (black line) and the model predictions based on the NOx emissions scaled by the EPA (blue line), AQS NO2 (orange line), TCR-2 (green line), and Jiang et al. (red line) trends. Bottom row: The monthly RMSE (bottom left) and monthly mean errors (bottom right) for the MDA8 ozone predictions in the top row.
NOx trend Mean Error (ppb)
Jiang et al.
Table 4: MDA8 ozone error statistics for the CONUS for 2005–2009.
NOx trend Mean Error (ppb)
Jiang et al.
Table 5: MDA8 ozone error statistics for the CONUS for 2010–2016 (2010–2015 for Jiang et al.).


  • [1] Almaraz, M., E. Bai, C. Wang, J. Trousdell, S. Conley, I. Faloona, B. Z. Houlton (2018) Agriculture is a major source of NOx pollution in California. Sci. Adv. 4, eaao3477.
  • [2] Benediktsson, J. A., Swain, P. H. & Ersoy, O. K. Neural network approaches versus statistical methods in classification of multisource remote sensing data. IEEE Trans. Geosci. Remote Sens. 28, 540–552 (1990).
  • [3] Bloomer BJ, Stehr JW, Piety CA, Salawitch RJ, Dickerson RR (2009) Observed relationships of ozone air pollution with temperature and emissions. Geophys Res Lett 36:L09803.
  • [4] Bond NA, Cronin MF, Freeland H, Mantua N (2015) Causes and impacts of the 2014 warm anomaly in the NE Pacific. Geophys Res Lett 42(9):3414-3420.
  • [5] Camps-Valls, G., Tuia, D., Bruzzone, L. & Benediktsson, J. A. Advances in hyperspectral image classification: Earth monitoring with statistical learning methods. IEEE Signal Process. Mag. 31, 45–54 (2014).
  • [6] Cooper OR, Gao RS, Tarasick D, Leblanc T, Sweeney C (2012) Long-term ozone trends at rural ozone monitoring sites across the United States, 1990–2010. J Geophys Res 117:D22307.
  • [7] Dee, D. P., et al. (2011), The ERA-Interim reanalysis: configuration and performance of the data assimilation system. Q.J.R. Meteorol. Soc., 137: 553–597. doi: 10.1002/qj.828
  • [8] Duncan, B. N., Yoshida, Y., Olsen, J. R., Sillman, S., Martin, R. V., Lamsal, L., …Crawford, J. H. (2010). Application of OMI observations to a space-based indicator of NOx and VOC controls on surface O3 formation. Atmospheric Environment, 44(18), 2213-2223.
  • [9] Fiore AM, et al. (2009) Multimodel estimates of intercontinental source-receptor relationships for ozone pollution. J Geophys Res 114:D04301.
  • [10] Fiore, A. M., Naik, V., and Leibensperger, E. M.: Air Quality and Climate Connections, J. Air Waste Manage., 65, 645–685, doi:10.1080/10962247.2015.1040526, 2015.
  • [11] Gardner MW, Dorling SR (2000) Meteorologically adjusted trends in UK daily maximum surface ozone concentrations. Atmos Environ 34(2):171–176.
  • [12] Gill A (1980) Some simple solutions for heat‐induced tropical circulation. Q. J. R. Meteorol Soc 106(449):447-462.
  • [13] Gomez-Chova, L., Tuia, D., Moser, G. & Camps-Valls, G. Multimodal classification of remote sensing images: a review and future directions. Proc. IEEE 103, 1560–1584 (2015).
  • [14] Goodfellow I., et al. (2016) Deep Learning. MIT Press.
  • [15] Hahnloser, R.; Sarpeshkar, R.; Mahowald, M. A.; Douglas, R. J.; Seung, H. S. (2000). "Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit". Nature. 405: 947–951. doi:10.1038/35016072.
  • [16] Hochreiter S., and Jürgen Schmidhuber (1997). "Long short-term memory". Neural Computation. 9 (8): 1735–1780. doi:10.1162/neco.1997.9.8.1735. PMID 9377276.
  • [17] Hochreiter, S., Y. Bengio, P. Frasconi, and J. Schmidhuber. Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. In S. C. Kremer and J. F. Kolen, editors, A Field Guide to Dynamical Recurrent Neural Networks. IEEE Press, 2001.
  • [18] Hoesly, R. M., Smith, S. J., Feng, L., Klimont, Z., Janssens-Maenhout, G., Pitkanen, T., Seibert, J. J., Vu, L., Andres, R. J., Bolt, R. M., Bond, T. C., Dawidowski, L., Kholod, N., Kurokawa, J.-I., Li, M., Liu, L., Lu, Z., Moura, M. C. P., O’Rourke, P. R., and Zhang, Q.: Historical (1750–2014) anthropogenic emissions of reactive gases and aerosols from the Community Emissions Data System (CEDS), Geosci. Model Dev., 11, 369-408, 2018.
  • [19] Jaegle, L., L. Steinberger, R. V. Martin, K. Chance (2005) Global partitioning of NOx sources using satellite observations: Relative roles of fossil fuel combustion, biomass burning and soil emissions. Faraday Discuss. 130, 407-423.
  • [20] Jiang, Z., B. McDonald, H. Worden, J. Worden, K. Miyazaki, Z. Qu, D. K. Henze, D. B. A. Jones, A. F. Arellano, E. V. Fischer, L. Zhu, and K. F. Boersma (2018). Unexpected slowdown of US pollutant emission reduction in the past decade. Proceedings of the National Academy of Sciences, 115, doi:10.1073/pnas.1801191115.
  • [21] Jin, X., Fiore, A. M., Murray, L. T., Valin, L. C., Lamsal, L. N., Duncan, B., …Tonnesen, G. S. (2017). Evaluating a space-based indicator of surface ozone-NOx-VOC sensitivity over midlatitude source regions and application to decadal trends. Journal of Geophysical Research: Atmospheres, 122, 10,439-10,461.
  • [22] Johnstone JA, Mantua NJ (2014) Atmospheric controls on northeast Pacific temperature variability and change, 1900-2012. Proc Natl Acad Sci USA 111(40):14360-14365.
  • [23] Kingma, D. P., & Ba, J. 2014, arXiv e-prints , arXiv:1412.6980.
  • [24] LeCun, Y., B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel (1989). Backpropagation Applied to Handwritten Zip Code Recognition; AT&T Bell Laboratories.
  • [25] Lee, J., Weger, R. C., Sengupta, S. K. & Welch, R. M. A neural network approach to cloud classification. IEEE Trans. Geosci. Remote Sens. 28, 846–855 (1990).
  • [26] Li, J., Wang, Y., & Qu, H. (2019). Dependence of summertime surface ozone on NOx and VOC emissions over the United States: Peak time and value. Geophysical Research Letters, 46, 3540-3550.
  • [27] Li, J. and Wang, Y. (2019), Inferring the anthropogenic NOx emission trend over the United States during 2003-2017 from satellite observations: Was there a flattening of the emission trend after the Great Recession?, Atmos. Chem. Phys. Discuss., in review.
  • [28] Lin, M., Horowitz, L. W., Cooper, O. R., Tarasick, D., Conley, S., Iraci, L. T., Johnson, B., Leblanc, T., Petropavlovskikh, I., and Yates, E. L.: Revisiting the evidence of increasing springtime ozone mixing ratios in the free troposphere over western North America, Geophys. Res. Lett., 42, 8719–8728, doi:10.1002/2015GL065311, 2015b.
  • [29] Lin, M., A. M. Fiore, L. W. Horowitz, A. O. Langford, S. J. Oltmans, D. Tarasick, and H. E. Rieder (2015), Climate variability modulates western U.S. ozone air quality in spring via deep stratospheric intrusions, Nat. Commun., 6(7105), doi:10.1038/ncomms8105.
  • [30] Lin, M., Horowitz, L. W., Payton, R., Fiore, A. M., and Tonnesen, G.: US surface ozone trends and extremes from 1980 to 2014: quantifying the roles of rising Asian emissions, domestic controls, wildfires, and climate, Atmos. Chem. Phys., 17, 2943-2970, 2017.
  • [31] Liu, Y. et al. Application of deep convolutional neural networks for detecting extreme weather in climate datasets. In ABDA’16-International Conference on Advances in Big Data Analytics 81–88 (2016).
  • [32] Millet, D. B., D. J. Jacob, K. F. Boersma, T. M. Fu, T. P. Kurosu, K. Chance, C. L. Heald, and A. Guenther (2008), Spatial distribution of isoprene emissions from North America derived from formaldehyde column measurements by the OMI satellite sensor, J. Geophys. Res., 113, D02307, doi:10.1029/2007JD008950.
  • [33] Kanaya, Y., Miyazaki, K., Taketani, F., Miyakawa, T., Takashima, H., Komazaki, Y., Pan, X., Kato, S., Sudo, K., Sekiya, T., Inoue, J., Sato, K., and Oshima, K.: Ozone and carbon monoxide observations over open oceans on R/V Mirai from S to N during 2012 to 2017: testing global chemical reanalysis in terms of Arctic processes, low ozone levels at low latitudes, and pollution transport, Atmos. Chem. Phys., 19, 7233-7254, 2019.
  • [34] Miyazaki, K., Bowman, K. W., Yumimoto, K., Walker, T., and Sudo, K.: Evaluation of a multi-model, multi-constituent assimilation framework for tropospheric chemical reanalysis, Atmos. Chem. Phys. Discuss., in review, 2019.
  • [35] Miyazaki, K., et al.,: An updated tropospheric chemistry reanalysis for the years 2005–2017: TCR-2, Atmos. Chem. Phys., in prep.
  • [36] Rasmussen D. J., et al. (2012) Surface ozone-temperature relationships in the eastern US: A monthly climatology for evaluating chemistry-climate models. Atmos Environ 47:142–153.
  • [37] M. Reichstein, et al., Deep learning and process understanding for data-driven Earth system science, Nature 566, 195–204 (2019).
  • [38] Reidmiller, D. R., Fiore, A. M., Jaffe, D. A., Bergmann, D., Cuvelier, C., Dentener, F. J., Duncan, B. N., Folberth, G., Gauss, M., Gong, S., Hess, P., Jonson, J. E., Keating, T., Lupu, A., Marmer, E., Park, R., Schultz, M. G., Shindell, D. T., Szopa, S., Vivanco, M. G., Wild, O., and Zuber, A.: The influence of foreign vs. North American emissions on surface ozone in the US, Atmos. Chem. Phys., 9, 5027-5042, 2009.
  • [39] Ronneberger, O., Fischer, P., & Brox, T. 2015, arXiv e-prints , arXiv:1505.04597.
  • [40] Rumelhart, David E.; Hinton, Geoffrey E.; Williams, Ronald J. (1986). "Learning representations by back-propagating errors". Nature. 323 (6088): 533–536.
  • [41] Rumelhart, D. E., G. E. Hinton, and R. J. Williams. 1988. Learning internal representations by error propagation. In Neurocomputing: foundations of research, James A. Anderson and Edward Rosenfeld (Eds.). MIT Press, Cambridge, MA, USA 673-695.
  • [42] Shen, L., and Mickley, L. J. (2017) Seasonal prediction of US summertime ozone using statistical analysis of large scale climate patterns. Proc Natl Acad Sci USA 114(10):2491-2496.
  • [43] X. Shi, Z. Chen, H. Wang, D.-Y. Yeung, W.-k. Wong, and W.-c. Woo. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. In Advances in Neural Information Processing Systems 28. 2015.
  • [44] Silvern, R. F., Jacob, D. J., Mickley, L. J., Sulprizio, M. P., Travis, K. R., Marais, E. A., Cohen, R. C., Laughner, J. L., Choi, S., Joiner, J., and Lamsal, L. N.: Using satellite observations of tropospheric NO2 columns to infer long-term trends in US NOx emissions: the importance of accounting for the free tropospheric NO2 background, Atmos. Chem. Phys. Discuss., in review, 2019.
  • [45] Sutton RT, Hodson DL (2005) Atlantic Ocean forcing of North American and European summer climate. Science 309(5731):115-118.
  • [46] Sutton RT, Hodson DL (2007) Climate response to basin-scale warming and cooling of the North Atlantic Ocean. J Clim 20(5):891-907.
  • [47] Travis K. R., et al. (2016) Why do models overestimate surface ozone in the Southeast United States? Atmos Chem Phys 16(21):13561-13577.
  • [48] Travis, K. R. and Jacob, D. J. (2019), Systematic bias in evaluating chemical transport models with maximum daily 8-hour average (MDA8) surface ozone for air quality applications, Geosci. Model Dev. Discuss., in review, 2019.
  • [49] Zeiler, M. D., & Fergus, R. 2013, arXiv e-prints , arXiv:1311.2901.
  • [50] Zhang, L., Jacob, D. J., Yue, X., Downey, N. V., Wood, D. A., and Blewitt, D., Sources contributing to background surface ozone in the US Intermountain West, Atmos. Chem. Phys., 14, 5295-5309, 2014.
  • [51] Zhou, B., Khosla, A., Lapedriza, A., et al. 2015, arXiv e-prints , arXiv:1512.04150.