The severe impact of COVID-19 on global economic conditions cannot be understated, with UK real gross domestic product (GDP) contracting by a record 20% in the second quarter (Q2) of 2020 alone. Timely estimates of economic activity through the current turbulent period and forecasts of the impact of government virus mitigation strategies will be crucial for evaluating the costs and benefits of different policy interventions.
This article introduces new methods to forecast Purchasing Managers’ Indices (PMIs) during the COVID-19 pandemic and quantifies the potential impact of government virus mitigation strategies. PMIs are survey-based leading indicators of the direction of economic activity. We choose to model PMIs as they are used extensively for the nowcasting of slower-moving headline measures of economic activity, such as GDP. See, for example, d2012survey. Our analysis builds on the class of generalised network autoregressive (GNAR) models (knight2016; knight2019generalised), which model multivariate time series in a highly parsimonious fashion by associating the dependencies between constituent series with an underlying network structure. Hence, GNAR models can characterise a rich diversity of dependence structures, whilst incurring a lower risk of overfitting when compared to classical vector autoregression (VAR) models, translating into superior forecasting performance. Indeed, our work below demonstrates that GNAR models outperform VAR models by 33% based on mean-squared forecasting error for forecasting PMIs from historical PMI data.
A goal here is to model PMIs in terms of historical PMIs, PMIs from neighbouring countries via our network model and (i) stringency measures of government intervention policies enacted to mitigate the spread of COVID-19 and (ii) the monthly number of confirmed COVID-19 deaths per million. Incorporating information about past PMIs from neighbouring countries via a network structure improves forecasting as it is extra relevant information. Our network model uses countries’ PMIs on the network nodes (vertices) and the strength of link between two countries (nodes) increases with increasing quantities of exports between the two. Figure 1 (below) shows an example of one of the networks we use. In addition, the incorporation of multiple exogenous time-dependent node-specific regressors is both new to GNAR modelling and practically essential for properly understanding the potential influence of (i) and (ii). Our new extension is named GNARX, where X indicates incorporation of fully time-dependent exogenous variables, which is in line with similar naming conventions, such as VAR to VARX.
We structure our paper as follows. Section 2
introduces the new GNARX specification and characterises it as a specially-constrained linear regression problem. Asymptotic properties of the corresponding estimators follow from this. Section3 provides details of our multivariate time series and network data and discusses two experiments: the first compares out-of-sample forecasting performance of GNAR models (i.e. without using external regressors) with a VAR model, and the second evaluates the GNARX model estimates when the stringency indices and COVID-19 death rates are incorporated. Section 4 uses our new GNARX modelling ability to analyse UK economic conditions under three different scenarios pertaining to either tightening, easing or unchanged intervention policies over the next six months. Section 5 concludes with a brief discussion of the experimental results and avenues for future research.
2 The GNARX mathematical model and notation
Definition. Let our multivariate real-valued time series be observations collected on variable at time , where is the number of variables. We sometimes write the multivariate series as the vector time series , where denotes transpose, for . Now let be a graph consisting of a set of vertices/nodes, , and an edge set denoted by . Let denote the presence of an undirected edge between nodes and and denote the existence of a directed edge from to , for . For network time series a graph vertex corresponds to time series variable . Let be the -th real-valued stationary exogenous node-specific regressor series for node at time , for , the number of such regressors.
We introduce the GNARX model, where , by
where the parameters for all , is the autoregressive order of the model and also maximum order of neighbour time lags, (the th element of ) denotes the maximum lag of the th exogenous regressor involved in the model, and (the th element of , which we sometimes write as ) is the maximum stage of neighbour dependence for time lag . For example, implies that nodes are dependent on the first lags of its first and second stage neighbours in the network . The set is the -th stage neighbourhood set of node defined by
for , , where . The in (1) are network connection weights, typically set as the inverse of some prior notion of distance between nodes, as in knight2019generalised using a similar approach to that used in the lifting scheme described by jansen2009multiscale. Sometimes we consider the global- model where , i.e. use the same sequence for each node. Consequently, the local- model is where the sequence is different for each node .
The GNARX model cannot be captured by the standard GNAR model of knight2019generalised by incorporating the exogenous regressors into , as and may differ. The network vector autoregression model proposed by zhu2017network does incorporate dependence on node-specific covariates, but unlike model (1), they do not vary with time and have limitations on other parameters/quantities in the network. Full specifications for both of these models can be found in the Supplementary Material Section 1.
GNARX models share the same advantages with respect to missing data as GNAR models, especially when compared to vector autoregression-based approaches. This is because, e.g., VAR models try to estimate parameters that link one specific variable to another. If, for example, one of those variables disappears for a significant block of time, then it can be difficult to estimate the parameter associated with any of its pairings. By contrast, in a GNARX model, a variable pair is just a nearest neighbour and every such pair in the data contributes to the ‘nearest neighbour’ parameter, one of the . So, even if one node suffers from significant missingness, there will be usually many other pairs so that the respective parameter is still well-estimated. More details on this, particularly on how this works for th-stage neighbours, can be found in Section 2.6 of knight2019generalised.
GNARX Parsimony. The GNARX model with a single exogenous regressor () contains a total of parameters. By comparison, the corresponding vector autoregression model with exogenous variables (VARX) containing lags of the modelled series , and lags of the exogenous series , can be given in the reduced form by
which contains parameters. The parsimonious nature of GNARX compared to VARX is clear when parameter growth is for the former and for the latter. Indeed, it has been suggested by bernanke2005measuring that vector autoregressions are not often used for macroeconomic datasets that contain more than six to eight time series. The parsimony of GNARX, combined with the flexibility to model dynamic networks representing multiple covariates, ensures our methodology is readily applicable to many applications involving the forecasting of multivariate time series.
GNARX Stationarity. Under the assumption of stationarity of for all and parameter constraints
the GNARX process is stationary. We prove this in Section 2 of the Supplementary Material. For simplicity, we will assume for the remainder of this section only, replacing with and with . Extending the following results to multiple external node-specific regressors is straightforward.
Precise model definition and theoretical statistical properties. With the assumption, the GNARX model can be expressed as a vector autoregression of the form (3), but where coefficient matrices are restricted so that , , and . In matrix form, the model can be written as , where , , , , , and . Additional regressors can be incorporated by concatenating the corresponding coefficient and data matrices to and , respectively.
Following lutkepohl2005new, the coefficient restrictions on that result from the GNARX assumptions are , where is the model matrix, defined in the Supplementary Material Section 4, , with GNAR parameters that we are interested in given by and all elementary vectors are of length . With the global alpha assumption, for all , we have and .
Given the VAR matrix representation and linear coefficient restrictions from above, lutkepohl2005new proposes the feasible generalised least squares (FGLS) GNAR parameter estimator to be
where the maximum likelihood estimator of is
and where satisfies and
is the ordinary least squares estimator of. In the case of missing observations, the corresponding residuals are set to zero. Under the conditions of stationarity of and , consistency and asymptotic normality are proved in Section 2 of the Supplementary Material.
3 Purchasing Managers’ Indices: the data and forecasting experiments
3.1 Input data
3.1.1 Purchasing Managers’ Indices
IHS Markit Composite Purchasing Managers’ Indices (PMIs) are computed by aggregating the views of senior purchasing executives from around 400 companies on a wide range of economic variables pertinent to the manufacturing and services sectors. Composite PMIs can take values between zero and , with a reading above indicating economic expansion. All PMI series have been seasonally adjusted using the X-12 ARIMA method; full details of the survey methodology are available at https://ihsmarkit.com/products/pmi.html. Currently, monthly composite PMI data are available for thirteen major economies, which are all included in our analyses. The longest series are from the UK, Italy and Germany, which start from Jan-98, and the shortest is from Australia, which starts from May-16.
PMI series are diffusion indices, which treat a fixed set of component economic indicators effectively as dummy variables that represent whether the trend in the component is positive or negative. For PMIs, these component series are the survey responses of individual purchasing managers. It is therefore reasonable to assume that diffusion indices are stationary, given the well-documented cyclical nature of macroeconomic time series, seezarnowitz1991business. Moreover, Augmented Dickey-Fuller (ADF) unit root tests on the in-sample period between Jan-98 and Dec-17 result in -values above for all series, indicating stationarity for the time being, against the alternative hypothesis of first-order non-stationarity. Broadly, the series also do not reject the null of second-order stationarity according to the wavelet-based tests of Nason13; Nason:locits.
3.1.2 Network construction
GNAR and GNARX models require us to specify a network structure to associate with the PMI dataset. The time series for a particular country’s PMI corresponds to a single vertex in the network — one vertex for each country. Rather than attempting to learn the network structure indirectly from the time series as in leeming:phd or knight2019generalised, we suggest that a construction based on the flow of exports between countries that provides a reasonable reflection of the association between component series. This rests on the assumption that a country’s business confidence is partly dependent on the level of economic activity of its major export partners. Since (i) model orders are chosen by BIC optimisation, (ii) network parameters are highly significant and (iii) forecast performance is improved when compared to models that ignore network information, these all suggest that our choice of networks is reasonable. Evaluating performance by forecast accuracy is honest and exacting.
Our experiments evaluate two different networks:
Fully-connected trade network. Each country is connected to every other country. Edge weight is set to be the export quantity between countries and . The weights are then normalized so that for all . In this case, for all .
Nearest neighbour trade network. (Figure 1) Each country has a single outgoing edge to the country that it exports the most too.
Export data was obtained from the UN Comtrade International Trade Statistics Database (https://comtrade.un.org/data/), using the latest available observations.
3.1.3 Exogenous variables
The GNARX model’s advantage is that it permits the multivariate time series to depend on vertex-specific external time series regressors ; two series in our examples. The first, , are differenced country stringency indices (hale2020oxford) for country , which aim to quantify the severity of COVID-19 mitigation policies. These are composite indices that take values between zero and , with higher values reflecting stricter measures. The stringency indices are mainly related to containment policies, such as the closing of schools, workplaces and public events. The full list of constituent indicators is available at https://github.com/OxCGRT/covid-policy-tracker/blob/master/documentation. Values of the stringency indices before the COVID-19 outbreak are automatically set to zero, rather than treated as missing.
The second regressor, is the first difference of country new COVID-19 death rates, which may impact economic conditions via channels beyond the previously discussed interventions. For example, consumers may extrapolate a rise in the confirmed number of COVID-19 deaths into the future, leading to consumer precautionary saving in anticipation of greater future lockdown measures. As higher death rates would likely also be associated with stricter mitigation policies, omitting death rates would likely lead to model errors being correlated with the stringency index regressor, resulting in inconsistent parameter estimates. Our experiments use the daily number of confirmed COVID-19 deaths per million in a given country, averaged for each month (owidcoronavirus), available at https://ourworldindata.org/covid-deaths. The sample correlation between the differenced stringency indices and the differenced COVID-19 death rate of the same country is only , suggesting that multicollinearity is not a significant problem. The correlation only rises modestly to for the non-differenced series.
Figure 2 shows the UK PMI and the two exogenous variables starting from Jan 2019, where all series have been standardised to have zero mean and unit variance so that they can be compared on the same axes. The most prominent feature of the UK PMI series is the large six and eight standard deviation declines during Mar and Apr 2020.
This fall in purchasing manager confidence coincides with the first UK lockdown (rise in stringency index), but also reflects broader worsening of global macroeconomic conditions caused by the COVID-19 pandemic, see, e.g.,
shows the UK PMI and the two exogenous variables starting from Jan 2019, where all series have been standardised to have zero mean and unit variance so that they can be compared on the same axes. The most prominent feature of the UK PMI series is the large six and eight standard deviation declines during Mar and Apr 2020. This fall in purchasing manager confidence coincides with the first UK lockdown (rise in stringency index), but also reflects broader worsening of global macroeconomic conditions caused by the COVID-19 pandemic, see, e.g.,long2020world.
3.2 GNAR forecasting performance
Here, model order is selected to minimise the Bayesian Information Criterion (BIC) subject to an order limit of . For computational feasibility, we adopt a stagewise approach, where is first selected to minimise BIC, followed by for . This allows us to capture network effects lasting up to a year in duration. For the exogenous regressors in later experiments, we instead impose a lower order limit of three months, due to the lack of nonzero observations. In that case, for are optimised at the same time as . All PMI observations up to Dec-17 are treated as the in-sample period for model order selection and estimation of both GNAR and VAR models. We refer readers to the Supplementary Material Section 7 for the results if model orders were instead selected to minimise forecasting errors in a subset of the in-sample period. Finally, we provide simulation evidence for the consistency of the model selection procedure using stagewise BIC optimisation for GNARX in the Supplementary Material Section 8, which shows similar performance to ‘global search’ BIC.
Unlike GNAR, VAR models do not easily handle missing values. Specifically, GNAR models deal with missingness by modifying connection weights (knight2019generalised, Section 2.6) and because of this advantage our GNAR model can make use of the entire unbalanced panel starting from Jan-98. If Australia were included, then the in-sample period for the VAR model would contain only 20 months and, for this reason, we exclude Australia from this analysis.
Table 1 shows the composite PMI forecasting performance of four different stagewise-BIC selected GNAR models, a VAR model, a naive forecast (predicting next from current) and univariate AR models. The latter involve a separate AR model estimated for each country, with order determined by BIC optimisation. For all but three countries, AR was selected. However, for fair comparison with the GNAR and VAR models where autoregressive order is the same for every series, AR was used for all countries. It is worth noting that AR is identical to GNAR with local-. We have used mean-squared forecast error (MSFE) in our article, which computes the empirical average of the squared differences between our point forecasts and the eventual out-turn. The lowest forecast error is attained by the local- GNAR model with the fully-connected network.
Furthermore, the mean-squared forecast error of all four GNAR models are substantially lower than that of the VAR model, indicating overfitting in the latter. The local- GNAR model contains only thirteen parameters, whilst the VAR() model contains parameters. Our results indicate that when a simple model is appropriate, BIC optimisation for VAR may still lead to overfitting, but the same does not hold for GNAR models. Indeed, only three of the four GNAR models outperformed the naive forecast and one of the two local- GNAR models outperformed the univariate AR models, reflecting the difficulty in forecasting economic indicators using only past values. On the other hand, the best GNAR model does improve the forecast performance and a few percent improvement can translate into substantial economic value when one is dealing with GDP-like magnitudes, see Section 4.
Table 2 shows estimates for the local- GNAR model, with -values computed using HC2 robust standard errors, which are heteroskedasticity-consistent estimators first proposed by mackinnon1985some that adjust residuals using the projection matrix leverage values. These were calculated with the sandwich package of zeileis2004econometric. In this experiment, the use of robust standard errors produces more conservative -values.
Figure 3 plots the UK out-of-sample one-step-ahead rolling forecasts for this best-performing model, along with those from the VAR model (see Supplementary Material, Section 6, for other countries). The VAR forecasts appear to underestimate the initial PMI decline at the start of the crisis period, but substantially overestimate the recovery. This leads to substantially higher mean-squared forecast error during the COVID-19 crisis period starting from Feb-20, relative to GNAR.
Overall, BIC choice selects for higher autoregressive order in the global- GNAR models compared to the local- variant, which also results in higher out-of-sample mean-squared forecast error. This suggests that the PMIs’ autoregressive dynamics differs between countries, although the autoregressive parameters in Table 2 are similar. Moreover, BIC choice of the local- models also favours exploiting the network structure implied by export flows, whether by using the fully connected or the nearest neighbour network: for both of the best-performing GNAR models.
3.3 GNARX forecasting performance with external regressors
Given the shortness of the exogenous regressors, it is not feasible now to evaluate out-of-sample forecasting performance for the GNARX models, so we switch to in-sample forecast evaluation. For example, predicting ‘October’ based on exogenous variables that only existed since early Spring, i.e. few observations, is not sensible. Here, we estimate model parameters on the whole series and then, given a time point, we use the estimates, but use observations only from previous time points. In future, with more data, we will be able to switch to out-of-sample forecasting performance evaluation.
We can now reintroduce the shorter Australian PMI series, using the same method for handling missing observations as in Section 3.2. On our relatively short time series with missing data, vector autoregressions with exogenous variables (VARX) as described in, for example, tsay2013multivariate, which contain parameters, are infeasible. Recall that some VAR models may be estimated as separate linear regressions for each country. A simple VARX(1,0) model would involve the estimation of 13 coefficients corresponding to the stringency indices and 13 corresponding to the death rates in each country. However, our dataset contains fewer than 13 nonzero observations for each exogenous variable of each country, clearly leading to a parameter identification problem.
Table 3 shows in-sample mean-squared forecasting errors for our stagewise-BIC selected GNARX models using the fully-connected network (results for the nearest neighbour network and no network were always worse and reported in the Supplementary Material, Section 11). The inclusion of exogenous variables results in substantially lower BIC and an almost halving of the mean-squared forecast error across all settings compared to GNAR without external regressors. The lowest BIC model is the global- GNARX, with estimates shown in Table 4 and -values calculated as in Table 1.
Table 4 shows that the 1st and 5th positive autoregressive coefficients are significant. As before, current-period PMIs are also significantly positively dependent on previous-period PMIs of neighbours weighted by export volumes. On the other hand, the negative and parameter estimates are smaller in magnitude.
The zero-lag coefficient for COVID-19 intervention stringency indices is highly significant, matching our expectation of a strong coincident negative impact of containment policies on business confidence. A country’s COVID-19 death rate also has a significant negative contemporaneous impact on the PMI, indicating that the pandemic might be influencing business confidence through channels beyond policy interventions.
Figure 4 shows the global- GNARX UK fits over the most recent five-year period, to illustrate the quality of fit during the COVID-19 period (other countries’ charts are given in the Supplementary Material, Section 10). The inclusion of stringency indices and COVID-19 death rates allows the GNARX model to anticipate both the initial composite PMI decline at the start of the crisis and the subsequent recovery, leading to a much lower mean-squared forecasting error than the corresponding GNAR model.
4 UK scenario analysis
We now use the fitted the global- GNARX model from the previous section to produce six-month forecasts of the UK composite PMI series under different assumptions on the evolution of the UK stringency index. For other examples of conditional forecasting on multivariate economic time series, we refer readers to the seminal paper by waggoner1999conditional and the more recent summary provided by banbura2015conditional, which describe conditional forecasting techniques and applications for VAR models. We would note that these describe more difficult problem than ours, as we condition on exogenous variables. Figure 5 provides forecasts of the UK composite PMI under three different policy intervention scenarios for the six month period from Nov-20 to Apr-21, assuming no change in the stringency indices of other countries from Aug-20 or in COVID-19 death rates:
Easing COVID-19 intervention measures. The stringency index decreases linearly from the Oct-20 value of to between Nov-20 and Apr-21.
No change in COVID-19 intervention measures. The stringency index is constant at the Oct-20 value of between Nov-20 and Apr-21.
Tightening in COVID-19 intervention measures. The stringency index increases linearly from the Oct-20 value of to between Nov-20 and Apr-21.
As might be expected, the GNARX model predicts significantly higher UK composite PMI under assumption of the UK stringency index falling to zero, holding death rates constant. Using bootstrap prediction intervals, computed using the algorithm introduced in the Supplementary Material Section 12, the expected PMI under the easing scenario exceeds the th percentile of the PMI under the tightening and constant stringency scenarios after only one month. The Apr-21 prediction of PMI is in the easing scenario (which would be a record high) and
in the tightening scenario. In all cases, the estimated sampling distribution of the PMI is negatively skewed.
We now estimate how these composite PMI paths would translate into headline quarter-on-quarter real GDP growth, using data sourced from https://www.ons.gov.uk/economy/grossdomesticproductgdp/timeseries/ihyq/qna. At the time of writing, 2020 Q3 is the latest available observation of UK GDP. Given that the GDP series is sampled quarterly, while the PMI series is sampled monthly, we assume a mixed-frequency linear relationship between the GDP growth and PMI series, using the MIDAS regression model (ghysels2004midas). Analysis of both the unrestricted MIDAS model and the restricted MIDAS model, using exponential Almon lag polynomial weights, suggest that quarterly GDP is primarily dependent on the “” lag of composite PMI. E.g., 2020 Q3 GDP depends only on the Jul-20 PMI observation in our simplified model.
Table 5 shows conditional forecasts for 2020 Q4, 2021 Q1 and 2021 Q2 UK GDP growth using the PMI predictions produced by GNARX for each of the three COVID-19 intervention scenarios, assuming only the lag of composite PMI is relevant. Our results suggest a % difference in 2021 Q1 GDP growth and a % difference in 2021 Q2 GDP growth between the easing and tightening scenarios.
Table 5’s bootstrap prediction intervals take into account both the uncertainties in the GDP regression model and the underlying PMI forecasts.
Lending support to the strongly positive GDP growth path in our easing restriction scenario, the most recent (at the time of writing) monetary Monetary Policy Report forecasts a material recovery in UK GDP in the event of an easing in Covid-related restrictions. The authors suggest that the economic impact of such easing would be driven by a substantial increase in consumer spending, and a rise in business investment to a lesser extent as sales rise and uncertainty diminishes.
We showed how to apply network time series models to multivariate PMI time series for many countries and how forecast performance could be improved by incorporating network information. We introduced the GNARX model, which permits exogenous variables to be incorporated into GNAR models for the first time and improves forecasting results further still by using COVID-19 intervention stringency indices and COVID-19 death rates. Parameter estimates suggest a highly significant negative impact of harsher intervention measures and COVID-19 deaths on composite PMI and suggests that this effect can be transmitted to neighbours through the export channel.
GNAR(X) model order choice using BIC selects highly parsimonious models that result in excellent forecasting performance, considering the inherent difficulties in forecasting economic indicators. On the other hand, the BIC-chosen VAR model appears to overfit, leading to relatively high forecast errors. Future work might compare the GNAR forecasting performance to that of BVAR models (giannone2015prior).
Our network was constructed by considering what looked ‘sensible’ rather than optimality with respect to forecasting performance and further development work is required. Some preliminary ideas have been suggested relating to network construction in different fields, see leeming:phd; knight2019generalised and references therein.
Our three different scenarios earlier all operated under the simplifying assumption that policy interventions would be modified independently of the COVID-19 death rate. Future work may relax this unrealistic assumption, and incorporate other variables to capture attempts by governments to mitigate the negative impact of containment measures through fiscal stimulus.
We have used mean-squared forecasting error throughout our work as it is a valid measure, well-understood and meaningful. There are, of course, many other forecast quality metrics that could be used. For example, the actual proportion of forecasts whose out-turn performed well with respect to (or within) previously established forecasting intervals, see ehm16 for more examples and discussion.
Network time series modelling is young and much remains unexplored. For GNARX models, an obvious extension might permit exogenous regressor coefficients to vary by node. Regarding the COVID-19 experiments discussed above, this would allow economies to react differently to a given stringency level of policy interventions. Intuitively, economies where a larger proportion of employees are able to work remotely should be less vulnerable to strict lockdown measures (papanikolaou2020working).
We are also conducting research where more than one network time series interacts. So, for example, one can already see suggestions in the media on how countries’ lockdown policies influence each other. For example, the second UK lockdown on 5th November was made politically easier because of recent lockdowns in major European neighbours. Hence, the stringency index time series (and the COVID-19 deaths) could also be represented by network time series and there seems to be no reason why the PMI and stringency index networks would have to be the same. Finally, further extensions to network time series might be considered, such as time-varying parameters that can be used to create a locally stationary network time series process, or permitting edge-distances to change and be modelled (e.g. as exports vary between countries over time).
This article was prepared under the auspices of the Imperial College COVID-19 Response Team (Economics Team). We are grateful to Dr Katharina Hauck, Deputy Director of the Abdul Latif Jameel Institute for Disease and Emergency Analytics (J-IDEA), School of Public Health, Imperial College, London for enthusiastic encouragement and helpful comments on an earlier version of this article.