### TROPOMI_LHM

None

view repo

Methane is a strong greenhouse gas, with a higher radiative forcing per unit mass and shorter atmospheric lifetime than carbon dioxide. The remote sensing of methane in regions of industrial activity is a key step toward the accurate monitoring of emissions that drive climate change. Whilst the TROPOspheric Monitoring Instrument (TROPOMI) on board the Sentinal-5P satellite is capable of providing daily global measurement of methane columns, data are often compromised by cloud cover. Here, we develop a statistical model which uses nitrogen dioxide concentration data from TROPOMI to accurately predict values of methane columns, expanding the average daily spatial coverage of observations of the Permian Basin from 16 addition of predicted methane abundances at locations where direct observations are not available will support inversion methods for estimating methane emission rates at shorter timescales than is currently possible.

READ FULL TEXT VIEW PDFMethane is a strong greenhouse gas, with a higher radiative forcing per unit mass and shorter atmospheric lifetime than carbon dioxide. The remote sensing of methane in regions of industrial activity is a key step toward the accurate monitoring of emissions that drive climate change. Whilst the TROPOspheric Monitoring Instrument (TROPOMI) on board the Sentinal-5P satellite is capable of providing daily global measurement of methane columns, data are often compromised by cloud cover. Here, we develop a statistical model which uses nitrogen dioxide concentration data from TROPOMI to accurately predict values of methane columns, expanding the average daily spatial coverage of observations of the Permian Basin from 16% to 88% in the year 2019. The addition of predicted methane abundances at locations where direct observations are not available will support inversion methods for estimating methane emission rates at shorter timescales than is currently possible.

Methane and carbon dioxide are the two dominant anthropogenic greenhouse gasses responsible for warming Earth’s atmosphere above its pre-industrial era temperature [IPCC2013]. The current average atmospheric concentrations of these gasses are 1,880 parts per billion by volume (ppbv) for methane [NOAA] (an increase of over 1,000 ppbv in the past 250 years as a result of human activity [IPCC2021]) and 413 parts per million by volume (ppmv) for carbon dioxide [NOAA_CO2]; the mass of carbon dioxide in the atmosphere now is roughly 600 times the mass of methane. However, methane is a dramatically stronger absorber of thermal radiation than carbon dioxide, and despite its much lower concentration is responsible for trapping about 58% of the heat that carbon dioxide does when the effect of methane’s inevitable decay products are also included [Balcombe2018].

Satellite-borne remote sensing and monitoring of greenhouse gas emissions is playing an increasingly important role in assessing mankind’s impact on the climate [Jacob2016, Palmer2020], as direct measurements can displace or complement inventory estimates [OGMP2]. Measurements of methane column concentrations from space began in 2003 with the SCanning Imaging Absorption SpectroMeter for Atmospheric CHartographY (SCIAMACHY) on board ENVISAT, an ESA mission which terminated in 2012 [Bovensmann1999]. SCIAMACHY was succeeded by the Greenhouse Gases Observing Satellite (GOSAT, 2009-present) and GOSAT2 (2018-present) [Kuze2009, Glumb2014], operated by JAXA. Both GOSAT and GOSAT2 have improved capabilities with pixel resolutions of 10x10 km and global coverage in 3 days when compared to SCIAMACHY, which required 6 days for global coverage with a ground pixel resolution of 30x60 km. In 2016 the private enterprise GHGSat-D instrument was launched, with a greatly improved pixel resolution over GOSAT of 50x50 m for targeted viewing of selected methane sources [Jacob2016]. Next generation instruments such as the Advanced Hyperspectral Imager (AHSI) launched on board China’s GaoFen5 satellite in 2018 provide methane retrievals with a pixel resolution down to 30 metres [AHSI2019], but such missions are sporadically operated for targeted areas and do not provide the same level of spatial coverage as GOSAT [Itziar2021]. Leading the field in regional observation is the TROPOspheric Monitoring Instrument (TROPOMI) on board the ESA’s Sentinal-5 Precursor satellite. Launched in 2017, TROPOMI delivers daily global coverage, observing full methane columns with a ground pixel resolution of 7x7 km [Veefkind2012, Butz2011]. With early data successfully compared with GOSAT [Hu2018], TROPOMI is the cornerstone of the ESA’s commitment to monitoring national pledges towards emission reductions under the Paris Climate Accord [Clery2019].

Remote sensing of methane emissions using satellites is attractive because it can be frequent, recurring and non-intrusive to operations on the ground. Satellites therefore have the potential to detect intermittent methane sources or emissions released during abnormal operating conditions [Alvarez2018]. Observations from SCIAMACHY [Schneising2014] and GOSAT [Turner2016] have successfully provided top-down estimates of regional methane emission rates. More recently, observations from TROPOMI have provided top-down estimates of methane emission rates from the Permian Basin (shown in Fig. 1, extending from western Texas to southeastern New Mexico) of 2.7 teragrams per year over a 12-month period from March 2018 to March 2019 [Zhang2020].

Although the TROPOMI instrument covers the entire surface of the Earth at least once a day, the actual amount of methane concentration data from TROPOMI is limited by a variety of factors [Veefkind2012]. Cloud cover and other meteorological conditions often hamper the retrieval of methane column data, and so does the presence of aerosols in the atmosphere [CH4_ATBD]. Consequently, sparse data over regions of interest must be averaged over months or years to obtain suitable coverage for emission budgets [Schneising2014, Turner2016, Zhang2020]. In contrast, TROPOMI observations of nitrogen dioxide are not subject to the same difficulties posed by cloud masks [NO2_ATBD]. Large excesses of nitrogen dioxide are often linked to regions of rapid urban expansion and industrial activity [Duncan2016].

Oil and gas production is often associated with nitrogen dioxide emissions as well. For example, lit methane flare stacks emit combusted natural gas which contains nitrogen dioxide [Elvidge2018, Elvidge2016]; it is also known that the combustion efficiency in flares is not equal to 100%, leading to the co-emission of methane into the atmosphere [Zhang2020, Gouw2020]. Similarly, gas-fuelled compressors emit nitrogen dioxide during operation but they may also emit methane leaking through their seals. A final example are the pumps and storage tanks which are an integral part of oil distribution networks: pumps can emit nitrogen dioxide as a result of fuel combustion, whereas methane can leak from thief hatches in storage tanks. Indeed, a recent study [Gouw2020] has shown that there is a correlation between the nitrogen dioxide and methane concentrations measured by TROPOMI over the Permian Basin.

In this work, we compensate the missing direct methane data from TROPOMI by using nitrogen dioxide as a proxy or predictor of methane column density with the proportionality inferred from sample locations where confident measurements exist for both species [Gouw2020]. We develop a Bayesian model to infer missing methane column data based on co-located column values for nitrogen dioxide, which expands the spatial coverage of methane observations from TROPOMI. We use the Permian Basin as a case study; as the most productive basin in the United States, it produces more than 18,000 million cubic feet of natural gas and nearly 5 million barrels of oil per day [Permian2021]. We fit our model to TROPOMI observations, test its predictive ability and investigate how the inclusion of such estimates expands the daily spatial coverage of methane data. We examine the implications of including inferred values in emission budgets by calculating the above-background mass of methane observed in the Permian Basin over the course of the year 2019, and find the inclusion of inferred methane values results in kilotonnes more of excess methane observed per day. The spatial augmentation of TROPOMI methane observations will support inverse methods for estimating methane emission rates on shorter timescales than currently possible, which will be invaluable as policymakers begin to require recent and up-to-date methane emission estimates for industrial regions.

We fit our model to a year’s observations over the Permian Basin from 2019, using co-located observations contained within a marked study region shown in Fig. 1. TROPOMI provides one observation per day per location. Our model is a linear hierarchical model where the relationship between observations of nitrogen dioxide and methane on a given day are related to one another by a linear form with slope parameter and intercept parameter , plus another daily parameter to account for intrinsic scatter around the linear trend. The daily parameter has units of ppbv and roughly corresponds to the abundance of methane in the study region on day that is not associated with lit, burning flare stacks. The daily parameter has units of ppbv / (mmol m) as we scale all observations of nitrogen dioxide columns into mmol m.

We specify a multivariate normal hierarchical prior on values of and such that

(1) |

where , and

. The hyperparameter

controls the degree of correlation between and , with . The hyperparameter roughly corresponds to the average background level of methane in the study region across the year 2019, and the hyperparameter is the deviation that controls the spread of around , with , and similarly for and .We fit two models, each to a specific subset of data. First, we fit a “data-rich” model to observations from days that produce numerous highly correlated co-located observations of methane and nitrogen dioxide in the study region. Afterwards, we fit a “data-poor” model to observations from all other days where there are at least two co-located observations of methane and nitrogen dioxide in the study region. The difference between the two models is that in the data-rich model, we stipulate a uniform prior independently for each of the hyperparameters in order for the model to learn their posteriors predominantly from the data, and in the data-poor model we provide the information learned from the data-rich model to the hyperparameters as prior information. We do this so that when the data-poor model is being fit either to sparse or less-correlated observations we are making full use of the information that can be obtained from data-rich days. When fitting all models, we monitor the effective sample size (ESS) and of all parameters in addition to the energy Bayesian fraction of missing information (E-BFMI) to ensure convergence and efficient sampling [Betancourt2016, Vehtari2021]. The data-rich model was fit with an ESS of 212.9 and the data-poor model was fit with an ESS of 860.0.

We examine the joint posterior distribution of the hyperparameters of the data-rich model, and represent median values of , , , and in panel b of Fig. 2. We find that the correlation between and

on data-rich days for the year 2019 has a median value of -0.18 with a 68% central credible interval of (-0.30, -0.05). We would expect a value of

due to the positive correlation between the amount of methane and nitrogen introduced to the atmosphere from flare stack combustion chemistry [Sonibare2004, Deetz2017, Ismail2016, Umukoro2017], i.e., positive correlation in the data implies negative correlation between the slope and intercept. However, it is important to note that this result was obtained without stipulating any prior constraint thatmust be negative. Multivariate normal hierarchical prior specification and uniform hyperpriors for hyperparameters in the data-rich model allows us to learn the degree to which the daily parameters are correlated, and leverage this information as a prior of appropriate strength in the data-poor model.

Other model hyperparameters that we discuss at this stage are and . These two model parameters have physical meaning, where can be thought of as the ‘mean’ background of for the year 2019, and as the extent to which the daily intercept varies around from day to day. We find to have a median value of 1861.32 ppbv with a 68% central credible interval of (1859.52, 1863.15), and to have a median value of 14.44 ppbv with a 68% central credible interval of (13.25, 15.73). For the year 2019, the globally-averaged monthly mean value of marine surface methane ranges from 1858 to 1875 ppbv [NOAA], which reinforces the physical interpretability of our model.

To assess the predictive ability of our models, we performed hold-out testing by refitting our models to a random selection of 80% of observations in the study region on each day, using the remaining 20% of TROPOMI methane observations for comparison against model predictions at those locations. Both the data-rich and the data-poor model fit to the 80% datasets with satisfactory E-BFMI, ESS and . After generating predictions from the fitted model and comparing to co-located withheld observations across all days, we calculate values of reduced chi-squared

for each day (i.e., chi-squared per degree of freedom), the resulting distribution of which is shown in panel

b of Fig. 3. We also calculate residuals between model predictions and withheld observations across all days, which have a mean of 0.15 ppbv and standard deviation 12.09 ppbv, correlated with Pearson

(shown in panel a of Fig. 3). For comparison, first results from TROPOMI were initially tested against co-located GOSAT observations with a mean difference of 13.6 ppbv and standard deviation 19.6 ppbv, correlated with Pearson [Hu2018]. Panel a of Fig. 3 also demonstrates the tendency of the model to underestimate methane from high values of observed nitrogen dioxide and overestimate methane at low values of nitrogen dioxide. This is a result of the regression dilution caused by uncertain TROPOMI observations of nitrogen dioxide [Hutcheon2010, Andreon2013], seen in Panel a of Fig. 2.After re-fitting the data-rich model to 100% of available observations we examine posterior estimates of daily parameters as time series. We plot median values of the estimate for the slope parameter in Fig. 4 along with a time series of active flare stacks in the study region on data-rich days, identified from VIIRS Nightfire [Elvidge2013]. We do not find any correlation between estimated and the total number of identified flare stacks inside the study region on a given day. If the entire extent of the study region were dominated by identical lit flare stacks burning identical natural gas, then in atmospherically still conditions we would expect that the estimated value of would be close to the methane-nitrogen dioxide ratio expected from the balance of combustion reactions [Sonibare2004, Ismail2016, Deetz2017, Umukoro2017]. Conversely, we would expect an estimated value of of zero in a region void of sources of anthropogenic emission. The fact that we do not see a positive correlation between and lit flare stack count suggests that there are sources of methane in or around the study region that are not contributing nitrogen dioxide, such as well head completions, leaks, compressor stations or unlit flare stacks. Some flare stacks may be operating in an abnormal manner [Rella2015, Omara2016, Robertson2017], and recent work suggests that super-emitting individual flares may be accountable for the majority of methane emissions in the Permian Basin [Itziar2021]. In addition to methane flaring, in recent years drilling activity has been found to be an equally important producer of nitric oxide and nitrogen dioxide in the Permian [Dix_2020], and so it could be the case that may more strongly correlate with combined production statistics that include activity beyond flaring. Further work would need to be done to understand the extent to which depends on, or correlates with, factors like production data, the price of oil or seasonality, and would require an investigation of a longer time series which is beyond the scope of this paper.

We augment daily observations over the study region by including model predictions at locations where direct observations are not available from TROPOMI. Predictions are only made from co-located TROPOMI observations of nitrogen dioxide with an associated quality assurance value greater than or equal to 0.75. An example of this procedure is shown in Fig. 5. Prior to augmentation using predictions from the model, we calculate that the mean value of daily pixel coverage within the study region for the year 2019 is . After adding in model predictions at appropriate pixel locations, the mean value of daily pixel coverage rises to . A time series demonstrating this rise in spatial coverage is shown in panel a of Fig. 7. Increasing the spatial coverage over regions of interest like the Permain Basin can allow for the aggregation of data on shorter timescales than previously used in performing methane emission estimates, which will be key for accurately monitoring greenhouse gasses.

The cities of Odessa and Midland are contained within the study region and have populations that exceed 100,000 people. Cities are known to emit large quantities of nitrogen dioxide [Duncan2016], and may emit co-located methane at a rate that differs from the rural surrounding oil and gas producing regions. To examine the extent to which including these cities in the study region may affect predictions of methane from co-located nitrogen dioxide elsewhere in the study region, we define a new sub-region within the study region that contains the two cities. For the year 2019 we bin observations of methane and nitrogen dioxide from the study region into comparative histograms (shown in Fig. 6), separating observations over the cities from observations over the surrounding rural areas. We also include in this figure a distribution of predictions of methane columns made over the city region from the final fitted model. We find that when compared using a two-sample Kolmogorov-Smirnov test, distributions of urban and rural observations of nitrogen dioxide differ significantly with a p-value less than 0.001. Similarly, distributions of urban and rural observations of methane differ significantly with a p-value less than 0.001, and lastly, distributions of observations and predictions of methane in the urban region differ significantly with a p-value less than 0.001. Though the results of the holdout model fitting suggest satisfactory performance when comparing model predictions to observations, future work would need to determine the extent to which urban areas might impact predictions of methane columns in nearby industrial regions.

By subtracting a nominal background level of methane from TROPOMI observations and model predictions, we can calculate the above-background mass of methane contained above the study region on a given day. We use the global monthly marine mean surface value of methane as a far-field reference background [NOAA].

We convert all methane columns within the study region on each day to above-background masses and integrate over the spatial extent to calculate the above-background mass. This involves converting all values of methane (which either returned from TROPOMI or predicted from the model as dry-air column averaged mixing ratios) into column densities, for which we need a grid of dry-air column densities at all pixel locations. We calculate a grid of dry air column density to convert both TROPOMI observations and model predictions to masses in a consistent manner [ERA5], since dry air column densities are only provided with the TROPOMI Level 2 CH Data Product at the location of observed pixels [CH4_ATBD]. Values of dry air column density from our grid are correlated with values returned with the TROPOMI Level 2 Data Product with a Pearson R of 0.97, and have a mean residual of -1,242.48 kg m with standard deviation 1,836.32 kg m. We use the root mean square deviation of 2,217.16 kg m for error propagation in the calculation of mass of methane contained with a pixel in the study region.

We find that over the course of the year 2019 the average daily above-background mass is kilotonnes. We re-calculate this quantity including predictions from the model and find a new daily mean of above-background mass of kilotonnes. These two quantities are plotted in panel c of Fig. 7. The choice of a reference background from the far field does allow for negative values of above-background methane levels, but the point of the calculation is to demonstrate the affect of including predictions, and so the choice of background isn’t strictly important. Previous work has demonstrated that time-integrated TROPOMI observations of methane can be used to create top-down estimates of methane emission rates [Zhang2020]. While we do not currently carry out any inverse analysis, we demonstrate that the inclusion of predictions of methane from co-located observations of nitrogen dioxide increases the above-background value of methane loading in the study region by kilotonnes per day on average, which could indicate that the inclusion of predictions reveals methane hotspots that were previously unobserved. By increasing the spatial coverage of observations, we can in future work attempt to increase the temporal resolution of methane emission rates in the Permian.

The remote sensing of atmospheric methane from space is crucial for mitigating anthropogenic climate change. Using the Permian Basin as a case study, we show that observations of nitrogen dioxide can be used as predictors for methane at locations for which meteorological factors prevent TROPOMI from making direct observations of methane from space. We validate our Bayesian model using a variety of metrics and examine its predictive ability against withheld observations. The lack of correlation between the daily slope parameter of our model and the number of active flare stacks in the study region on each day suggests that there are sources contributing methane without associated nitrogen dioxide, which could be indicative of the presence of unlit or malfunctioning flares. When using predicted values of methane observations to augment daily observations over the Permian, we find that spatial coverage is increased to nearly full coverage on most days, and the observed mass of potential methane hotspots is increased on average by kilotonnes per day. Using a similar algorithm to the work shown here could aid in the inversion of satellite observations and estimation of methane emission rates on timescales much shorter than previously achieved.

The data that support this study are available from the Copernicus Open Access Hub (for TROPOMI and Level 2 Products), the Copernicus Climate Change Service (for ERA5 reanalysis data), the Earth Observation Group at the Colorado School of Mines (for VIIRS Nightfire) and the Global Monitoring Laboratory at the National Oceanic and Atmospheric Association (for mean global marine surface methane).

All code used in this study to analyse data and generate figures can be found at the corresponding author’s Github repository titled TROPOMI_LHM (“Linear Hierarchical Model”).

C.R. acknowledges financial support from Shell Research Ltd. through the Cambridge Centre for Doctoral Training in Data Intensive Science.

C.R. retrieved all data, wrote all code and conducted the analysis of this work. O.S. supervised the project and guided the analysis. K.M., P.J. and M.J. guided the construction of the Bayesian model. R.I. and B.H. contributed discussions of the data, satellites and industrial context.

We report no competing interests.

For each day in 2019 we retrieved TROPOMI observations of methane and nitrogen dioxide over the study region as shown in Fig. 1. Before conducting any analysis, we reduce the data by ignoring pixels that do not pass the recommended quality assurance (qa) value. For the TROPOMI Level 2 Data Product, the threshold qa factor for recommended usage is 0.5 or greater, and for the TROPOMI Level 2

Data Product, the threshold qa factor for recommended usage is 0.75 or greater. We then classified individual days as being either rich or poor in remaining data, with data-rich days being those with

co-located observations of methane and nitrogen dioxide in the study region correlated with Pearson’s , and data-poor days being all other days with. Co-located observations are determined by linearly interpolating the grid of

onto the grid of . We leave the TROPOMI Level 2 Data Product in units of ppbv, but we scale the TROPOMI Level 2 Data Product from mol m to mmol m.We next developed a fully Bayesian linear hierarchical model to fit solely to observations from data-rich days. Data-rich days are indexed via , and co-located observations of methane and nitrogen dioxide within the study region on day are indexed via .

We relate observed values of methane and nitrogen dioxide to their true latent values via

(2) | ||||

(3) |

where and are the TROPOMI-provided measurement standard deviations on and respectively. We also relate latent values of methane to latent values of nitrogen dioxide via

(4) |

where we have now introduced the daily model parameters , and . On a given day , and are respectively the y-intercept and slope of the line of best fit relating methane to nitrogen dioxide, while is the standard deviation of the scatter around the mean relation.

We stipulate an improper flat prior on latent values of nitrogen dioxide observations in order to combine equations (2), (3) and (4) into the single model equation

(5) |

Writing this equation in this way is desirable because it relates the observed pixel value of methane to the observed pixel value of nitrogen dioxide entirely in terms of the associated pixel precisions and the by-day model parameters , and .

We add hyperparameters to our model by including a multivariate prior distribution for and , shown in equation (1). We include this multivariate prior to allow for the possibilty of learning the extent to which and are correlated. Although we believe it is likely that and are negatively correlated, we do not encode this belief in the prior. We instead assume a uniform flat prior on the domain on each of , , and , and assume a uniform flat prior on the domain on both and . This lets us learn the extent to which and are correlated entirely from the posterior inference. Equations (5) and (1) are the likelihood and prior of our data-rich model respectively. We write our model in Stan [Stan] via the interface CmdStanPy [CmdStanPy]

and sample the posterior of our model using Stan’s default Markov Chain Monte Carlo algorithm NUTS (the No U-Turn Sampler, which is a variant of Hamiltonian Monte Carlo). When fitting our model we specify the algorithm to draw samples from the posterior using four separate Markov chains, each with 500 burn-in iterations with a further 1000 retained draws, combining for a total of 4000 draws from the posterior for each of our model parameters. Specifying 1000 post burn-in draws per chain easily allows for a sufficient ESS.

We developed a separate Bayesian model to fit to days that we identified as being poor in data. Data-poor days were classified as those that were not data-rich and have co-located observations of methane and nitrogen dioxide in the study region. The point of developing a second model to fit to data-poor days after we have already fit a model to data-rich days is to incorporate information learned from the data-rich model into the data-poor model as an informed prior.

As in the data-rich model, the likelihood of the data-poor model is given by equation (5), and the multivariate prior on values of and is given by equation (1). However, when fitting the hierarchical model to all data-rich days simultaneously, we’ve learned information about what sort of values the model hyperparameters “should” take. To account for this information we’ve learned, we add a prior on the hyperparameters with

(6) |

We have

as the mean vector of the 4000 posterior draws of (

, , , , ) from fitting the data-rich model to data-rich days. is the 5x5 covariance matrix of , , , and , constructed using their 4000 posterior draws from the data-rich model. We fit the data-poor model to all data-poor days again using Stan and the NUTS MCMC algorithm.When fitting the data-rich and data-poor models to actual observations, the MCMC sampler was confronted with an abundance of divergent transitions. To remove these transitions, we reparameterise our models into effectively equivalent mathematical representations that present a simpler posterior geometry [Omiros2007, Stan]. We do so making use of the Cholesky decompositions of our covariance matrices [Betancourt2013].

In order to decrease fitting time of the data-rich model, we replace the per-pixel precisions returned with the TROPOMI data products with daily averages, defined by

(7) | ||||

(8) |

This yields a new likelihood equation given by

(9) |

In order to determine how this affects the predictive accuracy of the data-rich model, we fit a data-rich model using per-pixel precisions and a data-rich model using daily averages of precision to data-rich days for the month of January 2019, and estimate the difference between the two models’ expected log pointwise predictive density (ELPD) using the widely available information criterion (WAIC) [Watanabe2010, Vehtari2017]. We estimate that the difference in ELPD of the two models in this case is

. As the difference in ELPD is within a standard error of zero we continue using daily averages of TROPOMI pixel precision in order to decrease the fitting time of our data-rich and data-poor models.

We can predict and precision from a fitted model using an observation of nitrogen dioxide and its associated precision as input. To obtain predictions, we first sample potential values of from the model values via where the subscript denotes a random draw with replacement of a set of parameter values from one of the 4000 sets in the posterior chain. We then have and .

We perform dropout testing in order test the predictive ability of our model. For each data-rich day we ignore a random selection of 20% of the co-located observations of methane and nitrogen dioxide and fit the model to the remaining 80% of the data. We use Stan to encode our model and sample the posterior using NUTS, with four independent chains each with 500 burn-in draws and 1000 retained sampled draws for a total of 4000 draws from the posterior.

After the data-rich model has been fit to the retained 80% of the data, we can compare observed values of methane from the held-out subset of data to predictions from the model and summarise how well the model has fit each day using a reduced chi-squared statistic. The reduced chi-squared statistic on day is calculated via

(10) | ||||

(11) | ||||

(12) |

where again is the number of co-located observations of methane and nitrogen dioxide in the study region on day . After examining the resulting distribution of calculated values of , we fit the model to the entire set of observations in the year 2019, without withholding any subset of the data. This fitted model is the model that we use for predicting values of methane for our final results.

We next retrieved VIIRS Nightfire observations of lit methane flare stacks for each day in 2019. Nightfire datasets provide the location coordinates, estimated temperature and source size of identified flares [Elvidge2013]. We reduce the daily datasets down to flare stacks identified within the study region regardless of estimated temperature. Daily tallies of lit flare stacks were collected for eventual comparison against time series of daily fitted model parameters.

The TROPOMI Level 2 Data Product provides the dry-air column average mixing ratio of methane in units of ppbv, defined as the ratio between the column density of methane to the column density of dry air at each pixel location [CH4_ATBD]. To convert the dry-air column average mixing ratio of methane back to a column density, we simply multiply by the value of the dry air column density at the pixel location, which we calculate ourselves from ERA5 data. Although values of dry air column density are supplied at TROPOMI pixel locations where a retrieval was successfully performed, we calculate our own grid from ERA5 data in order to avoid any discrepancy between values of dry air column density at the locations of “original” observed methane pixel values and the locations of model predictions.

We began by retrieving hourly spatial grids of surface pressure and vertical column of water vapour that cover the entire extent of the study region for each day in 2019 [ERA5]. is given in [Pa] and given in [kg m]. From the value we can calculate the total column density of air via where m s is gravity. We then calculate the total vertical column of dry air via .

When calculating at the location of either an observation or model prediction of methane, we interpolate the hourly spatial grids of ERA5 data to the grid of TROPOMI methane observations, and then linearly interpolate in time between the two adjacent hourly grids according to the pixel scanline [CH4_ATBD].

We were interested in observing how the application of our predictive algorithm altered the amount of observed methane over the Permian Basin for the year 2019. By virtue of observing more spatial area, the algorithm would result in an increase of total observed mass of methane To counteract this effect, we subtract a nominal background level of methane from each pixel and then describe how the above-background mass of methane in the study region changes after the application of our predictive algorithm. We linearly interpolate the global monthly marine mean surface value of methane as a far-field reference background, provided by the Global Monitoring Laboratory (GML) at the National Oceanic and Atmospheric Association (NOAA) [NOAA].

The number of moles of methane above the GML/NOAA background contained in a pixel is calculated via where is the pixel area in square metres, B is the mean global surface value of methane in ppbv, and is either the TROPOMI-observed or model-predicted pixel value of in ppbv. We then convert from moles to tonnes by multiplying by the molar mass of methane and scaling into tonnes. The precision on the moles of above-background methane contained in a pixel is calculated via where is the standard deviation uncertainty on the value of and is the standard deviation uncertainty on the value of B. We take to be the root mean squared error on the residuals between our ERA5-calculated value of and the value returned with the TROPOMI Level 2 Data Product at all pixel locations possible on all data-rich days in the year 2019.

We calculate the total methane mass above background on a given day in the study region via where is the number of pixels in the study region where we have either or . In order to obtain a conservative estimate, we only include predicted values of methane in the summation that are predicted from values of with . The precision of total mol CH is given by .

Comments

There are no comments yet.