1 Introduction
Time-varying parameter vector autoregressions (TVP-VARs), developed by Cogley and Sargent (2001, 2005) and Primiceri (2005), have become workhorse models in empirical macroeconomics. These models are flexible and can capture many different forms of structural instabilities and the evolving nonlinear relationships between the dependent variables. Moreover, they often forecast substantially better than their homoskedastic or constant-coefficient counterparts, as shown in papers such as Clark (2011), D'Agostino, Gambetti, and Giannone (2013), Koop and Korobilis (2013), Clark and Ravazzolo (2015) and Cross and Poon (2016). In empirical work, however, their applications are mostly limited to modeling small systems involving only a few variables because of the computational burden and overparameterization concerns.
On the other hand, large VARs that use richer information have become increasingly popular due to their better forecast performance and more sensible impulse-response analysis, as demonstrated in the influential paper by Banbura, Giannone, and Reichlin (2010). There is now a rapidly expanding literature that uses large VARs for forecasting and structural analysis. Prominent examples include Carriero, Kapetanios, and Marcellino (2009), Koop (2013), Banbura, Giannone, Modugno, and Reichlin (2013), Carriero, Clark, and Marcellino (2015), Ellahie and Ricco (2017) and Morley and Wong (2019). Since there is a large body of empirical evidence that demonstrates the importance of accommodating time-varying structures in small systems, there has been much interest in recent years in building TVP-VARs for large datasets. While there are a few proposals to build large constant-coefficient VARs with stochastic volatility (see, e.g., Carriero, Clark, and Marcellino, 2016, 2019; Kastner and Huber, 2018; Chan, 2020, 2021), the literature on large VARs with time-varying coefficients remains relatively scarce.
We propose a class of models we call hybrid TVP-VARs—VARs in which some equations have time-varying coefficients, whereas the coefficients are constant in others. More precisely, we develop an efficient Bayesian shrinkage and sparsification method that automatically decides, for each equation, (i) whether the VAR coefficients are constant or time-varying, and (ii) whether the parameters of the contemporaneous relations among variables are constant or time-varying. Given the importance of time-varying volatility, all equations feature stochastic volatility. Our framework nests many popular VARs as special cases, ranging from a constant-coefficient VAR with stochastic volatility on one end of the spectrum to the flexible but highly parameterized TVP-VARs of Cogley and Sargent (2005) and Primiceri (2005) on the other end. More importantly, our framework also includes many hybrid TVP-VARs in between the extremes, allowing for a more nuanced modeling approach of the time-varying structures.
To formulate these large hybrid TVP-VARs, we use a reparameterization of the standard TVP-VAR in Primiceri (2005). Specifically, we rewrite the TVP-VAR in the structural form, in which the time-varying error covariance matrices are diagonal. Hence, we can treat the structural TVP-VAR as a system of unrelated TVP regressions and estimate them one by one. This reduces the dimension of the problem and can substantially speed up computations. This approach is similar to the equation-by-equation estimation approach in
Carriero, Clark, and Marcellino (2019) that is designed for the reduced-form parameterization. But since under our parameterization there is no need to obtain the 'orthogonalized' shocks at each iteration as in Carriero, Clark, and Marcellino (2019), the proposed approach is substantially faster. Moreover, under our parameterization the estimation can be parallelized to further speed up computations. This structural-form parameterization, however, raises the issue of variable ordering: the assumed order of the variables might affect the model estimates compared to a standard reduced-form TVP-VAR. We investigate this issue empirically and find that the variability of the estimates from this structural-form parameterization is comparable to that of the TVP-VAR of Primiceri (2005).

Next, we adapt the non-centered parameterization of the state space model in Frühwirth-Schnatter and Wagner (2010) to our structural TVP-VAR representation. Further, for each equation we introduce two indicator variables: one determines whether the VAR coefficients are time-varying or constant, while the other controls whether the elements of the impact matrix are time-varying or not. Hence, each value of the vector of indicators $\gamma$ characterizes a hybrid TVP-VAR with a particular form of time variation, and with $n$ endogenous variables there is a large number of such configurations. By treating these indicators as parameters to be estimated, we allow the data to determine the appropriate time-varying structures, in contrast to typical setups where time variation is assumed. The proposed approach therefore is not only flexible—it includes many state-of-the-art models routinely used in applied work as special cases—it also induces parsimony to ameliorate overparameterization concerns. This data-driven hybrid TVP-VAR can also be interpreted as a Bayesian model average of hybrid TVP-VARs with different forms of time variation, where the weights are determined by the posterior model probabilities $p(\gamma \,|\, \mathbf{Y})$. It follows that forecasts from such a model can be viewed as a forecast combination of a wide variety of hybrid TVP-VARs.

The estimation is done using Markov chain Monte Carlo (MCMC) methods. Hence, in contrast to earlier attempts to build large TVP-VARs, our approach is fully Bayesian and exact—it simulates from the exact posterior distribution. There are, however, a few challenges in the estimation. First, the dimension of the model is large and there are thousands of latent state processes—time-varying coefficients and stochastic volatilities—to simulate. To overcome this challenge, in addition to using the equation-by-equation estimation approach described earlier, we also adopt the precision sampler of Chan and Jeliazkov (2009) to draw both the time-invariant and time-varying VAR coefficients, as well as the stochastic volatilities. In our high-dimensional setting the precision sampler substantially reduces the computational cost compared to conventional Kalman-filter-based smoothers. A second challenge in the estimation is that the indicators and the latent states enter the likelihood multiplicatively. Consequently, it is vital to sample them jointly; otherwise the Markov chain is likely to get stuck. We therefore develop algorithms to sample the indicators and the latent states jointly.
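To give a concrete sense of the precision sampler, the following sketch (in Python with NumPy/SciPy, used here purely for illustration) draws from a Gaussian whose precision matrix is banded, as arises when stacking random-walk state equations. In practice one would use banded or sparse Cholesky routines so the cost grows linearly in the number of time periods; the dense version below only illustrates the algebra.

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def precision_sampler(K, b, rng):
    """Draw x ~ N(K^{-1} b, K^{-1}) using the Cholesky factor of the
    precision matrix K, following the logic of the precision sampler."""
    C = cholesky(K, lower=True)                  # K = C C'
    w = solve_triangular(C, b, lower=True)       # forward substitution
    mu = solve_triangular(C.T, w, lower=False)   # mean mu = K^{-1} b
    z = rng.standard_normal(len(b))
    # C'^{-1} z has covariance (C C')^{-1} = K^{-1}
    return mu + solve_triangular(C.T, z, lower=False)

# toy example: posterior precision of a random-walk state over T periods
T = 5
H = np.eye(T) - np.eye(T, k=-1)                  # first-difference matrix
K = H.T @ H + np.eye(T)                          # prior + likelihood precision
draw = precision_sampler(K, np.ones(T), np.random.default_rng(0))
```

Because the Cholesky factor of a banded precision matrix is itself banded, both the factorization and the triangular solves avoid the cubic cost of dense Kalman-smoother recursions.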
Using US datasets of different dimensions, we find evidence that the VAR coefficients and elements of the impact matrix in some, but not all, equations are time-varying. In particular, in a formal Bayesian model comparison exercise, we show that there is overwhelming support for the (data-driven) hybrid TVP-VAR relative to a few standard benchmarks, including a constant-coefficient VAR with stochastic volatility and a full-fledged TVP-VAR in which all the VAR coefficients and error covariances are time-varying. We further illustrate the usefulness of the hybrid TVP-VAR with a forecasting exercise that involves 20 US quarterly macroeconomic and financial variables. We show that the proposed model forecasts better than many benchmarks. These results suggest that using a data-driven approach to discover the time-varying structures—rather than imposing either constant coefficients or time-varying parameters—is empirically beneficial.
This paper contributes to the budding literature on developing large TVP-VARs. Earlier papers include Koop and Korobilis (2013, 2018), who propose fast methods to approximate the posterior distributions of large TVP-VARs. Banbura and van Vlodrop (2018) and Götz and Hauzenberger (2018) consider large VARs with only time-varying intercepts. Chan, Eisenstat, and Strachan (2020) model the time-varying coefficients using a factor-like reduced-rank structure, whereas Huber, Koop, and Onorante (2019) develop a method that first shrinks the time-varying coefficients and then sets the small values to zero. As mentioned above, our estimation approach is exact and fully Bayesian, and the modeling framework is more flexible than many of those in earlier papers. There is also a growing literature on alternative, non-likelihood-based approaches. Examples include Giraitis, Kapetanios, and Price (2013) and Petrova (2019), which allow for the estimation of large TVP-VARs without imposing the Cholesky-type stochastic volatility, and hence avoid the ordering issue. Nevertheless, one main advantage of the likelihood-based approach taken in this paper is that it is flexible and modular. In particular, it is straightforward to incorporate additional useful features into the proposed hybrid model, such as more sophisticated static and dynamic shrinkage priors for VARs (Prüser, 2021; Chan, 2021) or more flexible error distributions to deal with outliers (Carriero, Clark, Marcellino, and Mertens, 2021; Bobeica and Hartwig, 2021).

The rest of the paper is organized as follows. We first introduce the proposed modeling framework in Section 2. In particular, we discuss how we combine a reparameterization of the reduced-form TVP-VAR and the non-centered parameterization of the state space model to develop the hybrid TVP-VARs. We then describe the shrinkage priors and the posterior sampler in Section 3. It is followed by a Monte Carlo study in Section 4 that demonstrates that the proposed methodology works well and can select the correct time-varying or time-invariant structure. The empirical application is discussed in detail in Section 5. Lastly, Section 6 concludes and briefly discusses some future research directions.
2 Hybrid TVP-VARs
We first introduce a class of models we call hybrid time-varying parameter VARs: VARs in which some equations have time-varying coefficients, whereas coefficients in other equations remain constant. To that end, let $y_t = (y_{1,t}, \ldots, y_{n,t})'$ be an $n \times 1$ vector of endogenous variables at time $t$. The TVP-VAR of Primiceri (2005) can be reparameterized in the following structural form:

$$B_{0,t} y_t = b_t + B_{1,t} y_{t-1} + \cdots + B_{p,t} y_{t-p} + \epsilon_t^{y}, \quad \epsilon_t^{y} \sim N(0, \Sigma_t), \qquad (1)$$

where $b_t$ is an $n \times 1$ vector of time-varying intercepts, $B_{1,t}, \ldots, B_{p,t}$ are $n \times n$ VAR coefficient matrices, $B_{0,t}$ is an $n \times n$ lower triangular matrix with ones on the diagonal and $\Sigma_t = \mathrm{diag}(\mathrm{e}^{h_{1,t}}, \ldots, \mathrm{e}^{h_{n,t}})$. The laws of motion of the VAR coefficients and log-volatilities will be specified below. Since the system in (1) is written in the structural form, the covariance matrix $\Sigma_t$ is diagonal by construction. Consequently, we can estimate this recursive system equation by equation without loss of efficiency.
We note that Carriero, Clark, and Marcellino (2019) pioneer a similar equation-by-equation estimation approach for a large reduced-form constant-coefficient VAR with stochastic volatility. The main advantage of the structural-form representation is that it allows us to rewrite the VAR as unrelated regressions, and it leads to a more efficient sampling scheme. The main drawback of this representation, however, is that the implied reduced-form estimates depend on how the variables are ordered in the system. We will investigate the extent to which these estimates depend on the ordering in Section 5.2.
2.1 An Equation-by-Equation Representation
It is convenient to introduce some notation. Let $b_{i,t}$ denote the $i$th element of $b_t$ and let $B_{j,i\cdot,t}$ represent the $i$th row of $B_{j,t}$. Then $\beta_{i,t} = (b_{i,t}, B_{1,i\cdot,t}, \ldots, B_{p,i\cdot,t})'$ collects the intercept and VAR coefficients of the $i$th equation and is of dimension $k_{\beta} \times 1$ with $k_{\beta} = np + 1$. Moreover, let $\alpha_{i,t}$ denote the free elements in the $i$th row of the contemporaneous impact matrix $B_{0,t}$ for $i = 2, \ldots, n$. That is, $\alpha_{i,t}$ is of dimension $k_{\alpha_i} \times 1$ with $k_{\alpha_i} = i - 1$. Then, the $i$th equation of the system in (1) can be rewritten as:

$$y_{i,t} = w_{i,t}' \alpha_{i,t} + x_t' \beta_{i,t} + \epsilon_{i,t}^{y}, \quad \epsilon_{i,t}^{y} \sim N(0, \mathrm{e}^{h_{i,t}}),$$

where $w_{i,t} = (-y_{1,t}, \ldots, -y_{i-1,t})'$ and $x_t = (1, y_{t-1}', \ldots, y_{t-p}')'$. Note that $w_{i,t}$ depends on the contemporaneous variables $y_{1,t}, \ldots, y_{i-1,t}$. But since the system is triangular, when we perform the change of variables from $\epsilon_t^{y}$ to $y_t$ to obtain the likelihood function, the density function remains Gaussian.
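The change-of-variables argument rests on the impact matrix being unit lower triangular, so its determinant, and hence the Jacobian of the transformation, is one. A quick numerical illustration (the particular entries below are arbitrary):

```python
import numpy as np

# a hypothetical 3-variable unit-lower-triangular impact matrix B_{0,t}
B0 = np.array([[1.0, 0.0, 0.0],
               [0.7, 1.0, 0.0],
               [-0.3, 0.2, 1.0]])

# Jacobian of the change of variables from the structural errors to y_t
jac = abs(np.linalg.det(B0))
```

Because the determinant of any triangular matrix is the product of its diagonal entries, the Jacobian equals one regardless of the below-diagonal values, so the Gaussian likelihood carries over unchanged.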
If we let $z_{i,t} = (w_{i,t}', x_t')'$ and $\theta_{i,t} = (\alpha_{i,t}', \beta_{i,t}')'$, we can further simplify the $i$th equation as:

$$y_{i,t} = z_{i,t}' \theta_{i,t} + \epsilon_{i,t}^{y}, \quad \epsilon_{i,t}^{y} \sim N(0, \mathrm{e}^{h_{i,t}}), \qquad (2)$$

where $\theta_{i,t}$ is of dimension $k_{\theta_i} = k_{\beta} + i - 1$. Hence, we have rewritten the TVP-VAR in (1) as $n$ unrelated regressions. Finally, the coefficients and log-volatilities are assumed to evolve as independent random walks:

$$\beta_{i,t} = \beta_{i,t-1} + \mathbf{u}_{i,t}^{\beta}, \quad \mathbf{u}_{i,t}^{\beta} \sim N(0, \Sigma_{\beta_i}), \qquad (3)$$

$$\alpha_{i,t} = \alpha_{i,t-1} + \mathbf{u}_{i,t}^{\alpha}, \quad \mathbf{u}_{i,t}^{\alpha} \sim N(0, \Sigma_{\alpha_i}), \qquad (4)$$

$$h_{i,t} = h_{i,t-1} + u_{i,t}^{h}, \quad u_{i,t}^{h} \sim N(0, \sigma_{h,i}^2), \qquad (5)$$

where the initial conditions $\beta_{i,0}$, $\alpha_{i,0}$ and $h_{i,0}$ are treated as unknown parameters to be estimated. The system in (2)–(5) specifies a reparameterization of a standard TVP-VAR in which all equations have time-varying parameters and stochastic volatility.
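As a concrete illustration of the equation-by-equation representation, the sketch below simulates a single structural-form TVP regression of the form (2)–(5), with stand-in regressors and hypothetical innovation variances (Python/NumPy is used here purely for illustration):

```python
import numpy as np

def simulate_tvp_equation(T, k, sig2_theta, sig2_h, rng):
    """Simulate y_t = z_t' theta_t + eps_t, eps_t ~ N(0, exp(h_t)),
    where theta_t and h_t follow independent random walks."""
    theta = np.zeros(k)                     # initial condition theta_0 = 0
    h = 0.0                                 # initial log-volatility h_0 = 0
    y = np.empty(T)
    for t in range(T):
        theta = theta + np.sqrt(sig2_theta) * rng.standard_normal(k)
        h = h + np.sqrt(sig2_h) * rng.standard_normal()
        z = rng.standard_normal(k)          # stand-in regressors z_t
        y[t] = z @ theta + np.exp(h / 2) * rng.standard_normal()
    return y

y = simulate_tvp_equation(T=200, k=3, sig2_theta=0.01, sig2_h=0.05,
                          rng=np.random.default_rng(1))
```

The full system simply runs $n$ such regressions, with the contemporaneous terms of earlier equations entering the regressor vector of later ones.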
Note that the innovations in (3)–(5) are assumed to be independent across equations. This assumption is partly motivated by the concern of a proliferation of correlation parameters, especially when $n$ is large, if the correlations of the innovations are unrestricted. In addition, for $\mathbf{u}_{i,t}^{\beta}$ and $\mathbf{u}_{i,t}^{\alpha}$, this independence assumption is important for extending the setup later so that we can turn on and off the time variation in both sets of state equations. In contrast, it is feasible to allow the innovations to the log-volatilities, $u_{i,t}^{h}$, to be correlated across equations (with a slight increase in computational cost). In preliminary work we considered such an extension. While the estimation results suggest that the correlation parameters are sizable, this extension leads to only very modest forecast gains (see Appendix D for details). Therefore, in what follows we maintain the independence assumption in (3)–(5) as the baseline.
2.2 The Non-Centered Parameterization
Next, we introduce a framework that allows the model to determine in a data-driven fashion whether the VAR coefficients and the contemporaneous relations among the endogenous variables in each equation are time-varying or constant. For that purpose, we adapt the non-centered parameterization of Frühwirth-Schnatter and Wagner (2010) to our hybrid TVP-VARs. More specifically, for $i = 1, \ldots, n$ we consider the following model:

$$y_{i,t} = w_{i,t}' \alpha_i + \gamma_{i,2}\, w_{i,t}' \Sigma_{\alpha_i}^{1/2} \tilde{\alpha}_{i,t} + x_t' \beta_i + \gamma_{i,1}\, x_t' \Sigma_{\beta_i}^{1/2} \tilde{\beta}_{i,t} + \epsilon_{i,t}^{y}, \quad \epsilon_{i,t}^{y} \sim N(0, \mathrm{e}^{h_{i,t}}), \qquad (6)$$

$$\tilde{\beta}_{i,t} = \tilde{\beta}_{i,t-1} + \mathbf{u}_{i,t}^{\tilde{\beta}}, \quad \mathbf{u}_{i,t}^{\tilde{\beta}} \sim N(0, I_{k_{\beta}}), \qquad (7)$$

$$\tilde{\alpha}_{i,t} = \tilde{\alpha}_{i,t-1} + \mathbf{u}_{i,t}^{\tilde{\alpha}}, \quad \mathbf{u}_{i,t}^{\tilde{\alpha}} \sim N(0, I_{k_{\alpha_i}}), \qquad (8)$$

$$h_{i,t} = h_{i,t-1} + u_{i,t}^{h}, \quad u_{i,t}^{h} \sim N(0, \sigma_{h,i}^2), \qquad (9)$$

where $\tilde{\beta}_{i,0} = 0$ and $\tilde{\alpha}_{i,0} = 0$. Here $\gamma_{i,1}$ and $\gamma_{i,2}$ are indicator variables that take values of either 0 or 1.
The model in (6)–(9) includes a wide variety of popular VAR specifications. For example, assuming that all indicators take the value of 1, the above model is just a reparameterization of the TVP-VAR in (2)–(5). To see that, define $\beta_{i,t} = \beta_i + \Sigma_{\beta_i}^{1/2} \tilde{\beta}_{i,t}$ and $\alpha_{i,t} = \alpha_i + \Sigma_{\alpha_i}^{1/2} \tilde{\alpha}_{i,t}$. Then, when $\gamma_{i,1} = \gamma_{i,2} = 1$, it is clear that (6) becomes (2). In addition, we have

$$\beta_{i,t} - \beta_{i,t-1} = \Sigma_{\beta_i}^{1/2} (\tilde{\beta}_{i,t} - \tilde{\beta}_{i,t-1}) \sim N(0, \Sigma_{\beta_i}), \quad \alpha_{i,t} - \alpha_{i,t-1} = \Sigma_{\alpha_i}^{1/2} (\tilde{\alpha}_{i,t} - \tilde{\alpha}_{i,t-1}) \sim N(0, \Sigma_{\alpha_i}).$$

Hence, $\beta_{i,t}$ and $\alpha_{i,t}$ follow the same random walk processes as in (3) and (4), respectively. We have therefore shown that when $\gamma_{i,1} = \gamma_{i,2} = 1$ for all $i$, the proposed model reduces to a TVP-VAR with stochastic volatility.
For the intermediate case where $\gamma_{i,1} = 1$ and $\gamma_{i,2} = 0$ for all $i$, the proposed model reduces to a structural-form reparameterization of the model in Cogley and Sargent (2005), i.e., a TVP-VAR with stochastic volatility in which the contemporaneous relations among the endogenous variables are restricted to be constant. In the extreme case where $\gamma_{i,1} = \gamma_{i,2} = 0$ for all $i$, the proposed model becomes a constant-coefficient VAR with stochastic volatility—a reparameterization of the specification in Carriero, Clark, and Marcellino (2019). More generally, by allowing the indicators $\gamma_{i,1}$ and $\gamma_{i,2}$ to take different values across equations, we can have a VAR in which only some equations have time-varying parameters. Note that it is straightforward to include a few additional indicators to allow for more flexible forms of time variation. For example, one can replace $\gamma_{i,1}$ with two indicators that separately control the time variation in the elements of $\beta_{i,t}$ corresponding to coefficients on own lags and on lags of other variables, respectively. The posterior simulator in Section 3.2 can be modified to handle this case, with a slight increase in computation time.
These indicators are not fixed but are estimated from the data. More precisely, we specify that each $\gamma_{i,1}$ follows an independent Bernoulli distribution with success probability $p_{i,1}$. Similarly for $\gamma_{i,2}$: $\gamma_{i,2} \sim \mathcal{B}(p_{i,2})$. These success probabilities, $p_{i,1}$ and $p_{i,2}$, are in turn treated as parameters to be estimated. In contrast to typical setups where time variation in parameters is assumed (e.g., Cogley and Sargent, 2001, 2005; Primiceri, 2005), here the proposed model puts positive probabilities on simpler models in which the VAR coefficients and the contemporaneous relations among the variables are constant. The values of the indicators are determined by the data, and these time-varying features are turned on only when they are warranted. The proposed model therefore is not only flexible in the sense that it includes a wide variety of specifications popular in applied work as special cases; it also induces parsimony to combat overparameterization concerns.

2.3 An Exploration of the Model Space
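Because the Bernoulli and beta distributions are conjugate, the conditional update of each success probability is a standard beta draw once the indicators are in hand. A minimal sketch, assuming for illustration a beta prior with hypothetical hyperparameters `a0` and `b0` and a common probability shared by a group of indicators:

```python
import numpy as np

def update_success_prob(gammas, a0, b0, rng):
    """Conjugate update: p | gamma ~ Beta(a0 + #{gamma = 1}, b0 + #{gamma = 0})."""
    s = int(np.sum(gammas))
    return rng.beta(a0 + s, b0 + len(gammas) - s)

# hypothetical indicator draws from one MCMC iteration
gammas = np.array([1, 0, 1, 1])
p_draw = update_success_prob(gammas, a0=1.0, b0=1.0,
                             rng=np.random.default_rng(2))
```

The same update applies with a single indicator per probability; the count then involves just one Bernoulli observation.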
The proposed hybrid TVP-VAR can also be viewed as a Bayesian model average of a wide variety of TVP-VARs with different forms of time variation. To see that, let $\gamma$ denote the vector of indicator variables with elements $\gamma_{i,1}$ and $\gamma_{i,2}$. Note that each value of $\gamma$ corresponds to a particular TVP-VAR in which the time variation of the $i$th equation is characterized by $(\gamma_{i,1}, \gamma_{i,2})$. For example, $\gamma = \mathbf{0}$ corresponds to a constant-coefficient VAR with stochastic volatility. Then, the posterior distribution of any model parameters under the proposed model can be represented as the posterior average with respect to $p(\gamma \,|\, \mathbf{Y})$, i.e., the posterior model probabilities of the collection of TVP-VARs with different forms of time variation, where $\mathbf{Y}$ denotes the data. For example, the joint posterior distribution of the time-varying VAR coefficients and free elements of the contemporaneous impact matrix can be represented as a mixture over all configurations of $\gamma$, weighted by $p(\gamma \,|\, \mathbf{Y})$.

For a small VAR (and under an additional simplifying assumption), Chan and Eisenstat (2018b) estimate all such TVP-VARs and the corresponding posterior model probabilities. For larger systems, this approach of computing and sampling from the posterior of every possible model is clearly infeasible. In contrast, by including the model indicator $\gamma$ in the estimation, we simultaneously explore the parameter space and the model space. This latter approach is convenient and computationally feasible for large systems.
It is also instructive to investigate how the value of the model indicator $\gamma$ is determined by the data. To fix ideas, suppose we wish to compare two TVP-VARs, represented as $\gamma = \gamma_a$ and $\gamma = \gamma_b$. Let $p(\mathbf{Y} \,|\, \gamma)$ denote the marginal likelihood under model $\gamma$, i.e.,

$$p(\mathbf{Y} \,|\, \gamma) = \int p(\mathbf{Y} \,|\, \gamma, \boldsymbol{\psi})\, p(\boldsymbol{\psi})\, \mathrm{d}\boldsymbol{\psi}, \qquad (10)$$

where $\boldsymbol{\psi}$ is the collection of model-specific time-invariant parameters and time-varying states (in our setting these parameters and states are common across models $\gamma_a$ and $\gamma_b$), $p(\mathbf{Y} \,|\, \gamma, \boldsymbol{\psi})$ is the (complete-data) likelihood and $p(\boldsymbol{\psi})$ is the prior density. Then, the posterior odds ratio in favor of model $\gamma_a$ against model $\gamma_b$ is given by:

$$\frac{p(\gamma_a \,|\, \mathbf{Y})}{p(\gamma_b \,|\, \mathbf{Y})} = \frac{p(\gamma_a)}{p(\gamma_b)} \times \frac{p(\mathbf{Y} \,|\, \gamma_a)}{p(\mathbf{Y} \,|\, \gamma_b)},$$

where $p(\gamma_a)/p(\gamma_b)$ is the prior odds ratio. It follows that if both models are equally probable a priori, i.e., $p(\gamma_a) = p(\gamma_b)$, the posterior odds ratio between the two models is then equal to the ratio of the two marginal likelihoods, or the Bayes factor. More generally, under the assumption that each TVP-VAR has the same prior probability, the value of the model indicator $\gamma$ is determined by the marginal likelihood $p(\mathbf{Y} \,|\, \gamma)$. That is, if the TVP-VAR represented by a particular value of $\gamma$ forecasts the data better (as measured by one-step-ahead density forecasts), that value will have a higher weight.

3 Priors and Bayesian Estimation
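In practice marginal likelihoods are computed on the log scale, so the posterior odds ratio is obtained by exponentiating a difference of log marginal likelihoods. A small sketch with hypothetical values:

```python
import numpy as np

def posterior_odds(log_ml_a, log_ml_b, prior_odds=1.0):
    """Posterior odds of model a against model b:
    prior odds times the Bayes factor exp(log_ml_a - log_ml_b)."""
    return prior_odds * np.exp(log_ml_a - log_ml_b)

# hypothetical log marginal likelihoods for two indicator configurations;
# equal prior probabilities give prior odds of one
po = posterior_odds(-1234.5, -1236.8)
```

Working with differences of logs avoids the underflow that would occur if the marginal likelihoods themselves were exponentiated first.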
In this section we first describe in detail the priors on the time-invariant parameters. We then outline the posterior simulator used to estimate the model described in (6)–(9).
3.1 Priors
For notational convenience, stack $\tilde{\beta}_{i,t}$, $\tilde{\alpha}_{i,t}$ and $h_{i,t}$ over $t = 1, \ldots, T$ to obtain $\tilde{\beta}_i$, $\tilde{\alpha}_i$ and $h_i$, and collect the indicators, initial conditions and success probabilities over $i = 1, \ldots, n$. In our model, the time-invariant parameters are the initial conditions $\beta_i$, $\alpha_i$ and $h_{i,0}$, the state innovation covariance matrices $\Sigma_{\beta_i}$ and $\Sigma_{\alpha_i}$, the log-volatility variances $\sigma_{h,i}^2$, and the success probabilities $p_{i,1}$ and $p_{i,2}$. Below we give the details of the priors on these time-invariant parameters.
Since $\beta = (\beta_1', \ldots, \beta_n')'$, the vector of initial conditions of the VAR coefficients, is high-dimensional when $n$ is large, appropriate shrinkage is crucial. We assume a Minnesota-type prior on $\beta$ along the lines of Sims and Zha (1998); see also Doan, Litterman, and Sims (1984), Litterman (1986) and Kadiyala and Karlsson (1997). We refer the readers to Koop and Korobilis (2010), Del Negro and Schorfheide (2012) and Karlsson (2013) for a textbook discussion of the Minnesota prior. More specifically, consider $\beta \sim N(\mathbf{m}_{\beta}, \mathbf{V}_{\beta})$, where the prior mean $\mathbf{m}_{\beta}$ is set to be zero when the variables are in growth rates to induce shrinkage, and the prior covariance matrix $\mathbf{V}_{\beta}$ is block-diagonal with blocks $\mathbf{V}_{\beta_i}$—here $\mathbf{V}_{\beta_i}$ is the prior covariance matrix for $\beta_i$. Each $\mathbf{V}_{\beta_i}$ is in turn assumed to be diagonal, with the diagonal element corresponding to the coefficient on lag $l$ of variable $j$ set to be:

$$\begin{cases} \dfrac{\kappa_1}{l^2}, & \text{for the coefficient on the own lag } (j = i), \\[6pt] \dfrac{\kappa_2\, s_i^2}{l^2\, s_j^2}, & \text{for the coefficient on a lag of another variable } (j \neq i), \end{cases}$$

and $\kappa_4 s_i^2$ for the intercept, where $s_j^2$ denotes the sample variance of the residuals from regressing $y_{j,t}$ on its own lags. The prior variances of the elements of the impact matrix are governed by an analogous hyperparameter $\kappa_3$. Hence, the prior covariance matrix depends on four hyperparameters—$\kappa_1$, $\kappa_2$, $\kappa_3$ and $\kappa_4$—that control the degree of shrinkage for different types of coefficients. For simplicity, we fix $\kappa_3$ and $\kappa_4$ at values that imply moderate shrinkage for the coefficients on the contemporaneous variables and no shrinkage for the intercepts.

The remaining two hyperparameters are $\kappa_1$ and $\kappa_2$, which control the overall shrinkage strength for coefficients on own lags and those on lags of other variables, respectively. Departing from Sims and Zha (1998), here we allow $\kappa_1$ and $\kappa_2$ to be different, as one might expect that coefficients on lags of other variables would be on average smaller than those on own lags. In fact, Carriero, Clark, and Marcellino (2015) and Chan (2021) find empirical evidence in support of this so-called cross-variable shrinkage. In addition, we treat $\kappa_1$ and $\kappa_2$ as unknown parameters to be estimated rather than fixing them at subjective values. This is motivated by a few recent papers, such as Carriero, Clark, and Marcellino (2015) and Giannone, Lenza, and Primiceri (2015), which show that by selecting this type of overall shrinkage hyperparameter in a data-based fashion, one can substantially improve the forecast performance of the resulting VAR. In addition, this data-based Minnesota prior is also found to forecast better than many recently introduced adaptive shrinkage priors such as the normal-gamma prior, the Dirichlet-Laplace prior and the horseshoe prior. For example, this is demonstrated in a comprehensive forecasting exercise in Cross, Hou, and Poon (2020).
We assume gamma priors for the hyperparameters $\kappa_1$ and $\kappa_2$, with the hyperparameters of these gamma distributions chosen so that the prior modes are at zero, which provides global shrinkage. The prior means of $\kappa_1$ and $\kappa_2$ match the fixed values used in Carriero, Clark, and Marcellino (2015), with the prior mean of $\kappa_1$ equal to 0.04. Next, following Frühwirth-Schnatter and Wagner (2010), the square roots of the diagonal elements of $\Sigma_{\beta_i}$ and $\Sigma_{\alpha_i}$ are independently distributed as mean-zero normal random variables. We assume each $\sigma_{h,i}^2$ follows a conventional inverse-gamma prior. The success probabilities $p_{i,1}$ and $p_{i,2}$ are assumed to have beta distributions. Finally, the elements of the initial condition $h_{i,0}$ are assumed to be Gaussian.

3.2 The Posterior Simulator
We now turn to the estimation of the model in (6)–(9) given the priors described in the previous section. There are a few challenges in the estimation. First, when $\gamma_{i,1} = 0$ the time-varying component drops out of the likelihood and the conditional distribution of the implied time-varying coefficients becomes degenerate, making their sampling nonstandard (similarly when $\gamma_{i,2} = 0$). To sidestep this problem, we will use the parameterization in terms of $\tilde{\beta}_{i,t}$ and $\tilde{\alpha}_{i,t}$. Then, given the posterior draws of $\tilde{\beta}_{i,t}$, $\tilde{\alpha}_{i,t}$ and other parameters, we can recover the posterior draws of $\beta_{i,t}$ and $\alpha_{i,t}$ using the definitions $\beta_{i,t} = \beta_i + \Sigma_{\beta_i}^{1/2} \tilde{\beta}_{i,t}$ and $\alpha_{i,t} = \alpha_i + \Sigma_{\alpha_i}^{1/2} \tilde{\alpha}_{i,t}$.
Second, since $\tilde{\beta}_{i,t}$ and the indicator $\gamma_{i,1}$ enter the likelihood in (6) multiplicatively, it is vital to sample them jointly (similarly for $\tilde{\alpha}_{i,t}$ and $\gamma_{i,2}$); otherwise the Markov chain might get stuck. To see this, consider a simpler sampling scheme in which we simulate $\tilde{\beta}_{i,t}$ given $\gamma_{i,1}$, followed by sampling $\gamma_{i,1}$ given $\tilde{\beta}_{i,t}$. Suppose $\gamma_{i,1} = 0$ in the last iteration. Given $\gamma_{i,1} = 0$, $\tilde{\beta}_{i,t}$ does not enter the likelihood and we simply sample it from its state equation. Since the sampled $\tilde{\beta}_{i,t}$ has no relation to the data, the implied time variation in the VAR coefficients would not match the data. Consequently, it is highly likely that the model would prefer no time variation, i.e., $\gamma_{i,1} = 0$. Hence, it is unlikely for the Markov chain to move away from $\gamma_{i,1} = 0$ once it is there. It is therefore necessary to sample both $\gamma_{i,1}$ and $\tilde{\beta}_{i,t}$ in the same step. In addition, since the pair $\beta_i$ and $\tilde{\beta}_{i,t}$ enters the likelihood additively, we sample them jointly to further improve efficiency.
Next, define $\mathbf{y}_i = (y_{i,1}, \ldots, y_{i,T})'$ for $i = 1, \ldots, n$. Then, one can simulate from the joint posterior distribution using a posterior sampler that, for each equation $i$, sequentially samples from the full conditional distributions of:

Step 1: the indicators $(\gamma_{i,1}, \gamma_{i,2})$, the time-invariant coefficients $\theta_i = (\alpha_i', \beta_i')'$ and the states $(\tilde{\beta}_i, \tilde{\alpha}_i)$, jointly;

Step 2: the log-volatilities $h_i$;

Step 3: the log-volatility variance $\sigma_{h,i}^2$;

Step 4: the elements of $\Sigma_{\beta_i}^{1/2}$ and $\Sigma_{\alpha_i}^{1/2}$;

Step 5: the initial condition $h_{i,0}$;

Step 6: the success probabilities $p_{i,1}$ and $p_{i,2}$;

Step 7: the shrinkage hyperparameters $\kappa_1$ and $\kappa_2$.
Step 2 to Step 7 mainly involve standard sampling techniques and we leave the details to Appendix A. Here we focus on the first step.
Step 1. We sample the four blocks of parameters $(\gamma_{i,1}, \gamma_{i,2})$, $\theta_i$, $\tilde{\beta}_i$ and $\tilde{\alpha}_i$ jointly to improve efficiency. This is done by first drawing the indicators marginally of $\theta_i$, $\tilde{\beta}_i$ and $\tilde{\alpha}_i$—but conditional on other parameters—and then sampling $\theta_i$, $\tilde{\beta}_i$ and $\tilde{\alpha}_i$ from their joint conditional distribution. The latter of these two steps is straightforward because given $\gamma_{i,1}$ and $\gamma_{i,2}$, we have a linear Gaussian state space model in $\boldsymbol{\psi}_i = (\theta_i', \tilde{\beta}_i', \tilde{\alpha}_i')'$. Specifically, we stack the observation equation (6) over $t = 1, \ldots, T$:

$$\mathbf{y}_i = \mathbf{X}_{i,\gamma} \boldsymbol{\psi}_i + \boldsymbol{\epsilon}_i^{y}, \quad \boldsymbol{\epsilon}_i^{y} \sim N(0, \boldsymbol{\Omega}_i),$$

where $\boldsymbol{\Omega}_i = \mathrm{diag}(\mathrm{e}^{h_{i,1}}, \ldots, \mathrm{e}^{h_{i,T}})$. Here note that the design matrix $\mathbf{X}_{i,\gamma}$ depends on the indicators $\gamma_{i,1}$ and $\gamma_{i,2}$. Next, stack the state equations (7)–(8) over $t = 1, \ldots, T$:

$$\mathbf{H} (\tilde{\beta}_i', \tilde{\alpha}_i')' = \mathbf{u}_i, \quad \mathbf{u}_i \sim N(0, \mathbf{I}),$$

where $\mathbf{H}$ is the first difference matrix of appropriate dimension. Since $\mathbf{H}$ is a square matrix with unit determinant, it is invertible. It then follows that the prior on the stacked states is Gaussian with zero mean and precision matrix $\mathbf{H}' \mathbf{H}$.
Finally, using standard linear regression results, we have

$$(\boldsymbol{\psi}_i \,|\, \mathbf{y}_i, \gamma_{i,1}, \gamma_{i,2}, \cdot) \sim N(\hat{\boldsymbol{\psi}}_i, \mathbf{K}_{\boldsymbol{\psi}_i}^{-1}), \qquad (11)$$

where

$$\mathbf{K}_{\boldsymbol{\psi}_i} = \mathbf{V}_{\boldsymbol{\psi}_i}^{-1} + \mathbf{X}_{i,\gamma}' \boldsymbol{\Omega}_i^{-1} \mathbf{X}_{i,\gamma}, \quad \hat{\boldsymbol{\psi}}_i = \mathbf{K}_{\boldsymbol{\psi}_i}^{-1} \left( \mathbf{V}_{\boldsymbol{\psi}_i}^{-1} \mathbf{m}_{\boldsymbol{\psi}_i} + \mathbf{X}_{i,\gamma}' \boldsymbol{\Omega}_i^{-1} \mathbf{y}_i \right), \qquad (12)$$

and $\mathbf{m}_{\boldsymbol{\psi}_i}$ and $\mathbf{V}_{\boldsymbol{\psi}_i}$ denote the prior mean and covariance of $\boldsymbol{\psi}_i$. Since the precision matrix $\mathbf{K}_{\boldsymbol{\psi}_i}$ is a band matrix, one can sample $\boldsymbol{\psi}_i$ efficiently using the algorithm in Chan and Jeliazkov (2009).
To sample $(\gamma_{i,1}, \gamma_{i,2})$ marginally of $\boldsymbol{\psi}_i$, it suffices to compute the four probabilities that $(\gamma_{i,1}, \gamma_{i,2}) = (j, k)$ for $j, k \in \{0, 1\}$. To that end, note that

$$p(\mathbf{y}_i \,|\, \gamma_{i,1}, \gamma_{i,2}, \cdot) = \int p(\mathbf{y}_i \,|\, \boldsymbol{\psi}_i, \gamma_{i,1}, \gamma_{i,2}, \cdot)\, p(\boldsymbol{\psi}_i)\, \mathrm{d}\boldsymbol{\psi}_i,$$

where both the conditional likelihood and the prior density are Gaussian. It turns out that the above integral admits an analytical expression. In fact, using a derivation similar to that in Chan and Grant (2016), one can show that

$$p(\mathbf{y}_i \,|\, \gamma_{i,1}, \gamma_{i,2}, \cdot) = (2\pi)^{-\frac{T}{2}} \left( \frac{|\mathbf{K}_{\boldsymbol{\psi}_i}^{-1}|}{|\boldsymbol{\Omega}_i|\, |\mathbf{V}_{\boldsymbol{\psi}_i}|} \right)^{\frac{1}{2}} \mathrm{e}^{-\frac{1}{2} \left( \mathbf{y}_i' \boldsymbol{\Omega}_i^{-1} \mathbf{y}_i + \mathbf{m}_{\boldsymbol{\psi}_i}' \mathbf{V}_{\boldsymbol{\psi}_i}^{-1} \mathbf{m}_{\boldsymbol{\psi}_i} - \hat{\boldsymbol{\psi}}_i' \mathbf{K}_{\boldsymbol{\psi}_i} \hat{\boldsymbol{\psi}}_i \right)}, \qquad (13)$$

where $\mathbf{K}_{\boldsymbol{\psi}_i}$ and $\hat{\boldsymbol{\psi}}_i$ are defined in (12). Then, one can compute the relevant probabilities using the expression in (13). For example, it follows that

$$\Pr(\gamma_{i,1} = 1, \gamma_{i,2} = 1 \,|\, \mathbf{y}_i, \cdot) \propto p_{i,1}\, p_{i,2}\, p(\mathbf{y}_i \,|\, \gamma_{i,1} = 1, \gamma_{i,2} = 1, \cdot).$$

Similarly, we have

$$\Pr(\gamma_{i,1} = 0, \gamma_{i,2} = 0 \,|\, \mathbf{y}_i, \cdot) \propto (1 - p_{i,1})(1 - p_{i,2})\, p(\mathbf{y}_i \,|\, \gamma_{i,1} = 0, \gamma_{i,2} = 0, \cdot),$$

where $\mathbf{K}_{\boldsymbol{\psi}_i}$ and $\hat{\boldsymbol{\psi}}_i$ in (13) are evaluated at $\gamma_{i,1} = \gamma_{i,2} = 0$. The probabilities that $(\gamma_{i,1}, \gamma_{i,2}) = (1, 0)$ and $(0, 1)$ can be computed similarly. A draw from this 4-point distribution is standard once we normalize the probabilities. The details of the remaining steps are provided in Appendix A.
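The expression in (13) is the standard closed-form marginal likelihood of a Gaussian linear regression with a Gaussian prior, and it can be verified numerically against the equivalent representation $\mathbf{y} \sim N(\mathbf{X}\mathbf{m}, \boldsymbol{\Omega} + \mathbf{X}\mathbf{V}\mathbf{X}')$. A dense Python/NumPy sketch (the banded structure exploited in the paper is ignored here, and the generic names are illustrative):

```python
import numpy as np

def log_marginal_gaussian(y, X, m, V, Omega):
    """log p(y) for y = X psi + eps, eps ~ N(0, Omega), psi ~ N(m, V),
    with psi integrated out analytically, mirroring the structure of (13)."""
    T = len(y)
    K = np.linalg.inv(V) + X.T @ np.linalg.solve(Omega, X)
    psi_hat = np.linalg.solve(K, np.linalg.solve(V, m)
                              + X.T @ np.linalg.solve(Omega, y))
    quad = (y @ np.linalg.solve(Omega, y) + m @ np.linalg.solve(V, m)
            - psi_hat @ K @ psi_hat)
    logdet = (np.linalg.slogdet(Omega)[1] + np.linalg.slogdet(V)[1]
              + np.linalg.slogdet(K)[1])
    return -0.5 * (T * np.log(2.0 * np.pi) + logdet + quad)
```

Evaluating this quantity at each of the four indicator configurations, multiplying by the corresponding prior masses and normalizing yields the 4-point distribution described above.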
4 A Monte Carlo Study
In this section we first conduct a series of simulated experiments to assess how well the posterior sampler works in recovering the time-varying structure in the data generating process. We then document the runtimes of estimating hybrid TVP-VARs of different dimensions to assess how well the posterior sampler scales to larger systems.
First, we generate 300 datasets from the hybrid VAR in (6)–(9) with $n = 12$ variables and two sample sizes. We set the vector of indicators by repeating the four combinations $(\gamma_{i,1}, \gamma_{i,2}) \in \{(0,0), (0,1), (1,0), (1,1)\}$ three times — that allows us to study the effect of different combinations of time-varying patterns as well as their positions in the system. We generate the initial conditions of the VAR coefficients stochastically as follows. The intercepts are drawn independently from a uniform distribution. For the VAR coefficients, the diagonal and off-diagonal elements of the first VAR coefficient matrix are drawn independently from uniform distributions, as are all elements of the $j$th ($j > 1$) VAR coefficient matrices. Finally, the free elements of the impact matrix are drawn independently from a uniform distribution.

If a coefficient is time-varying (i.e., the associated indicator $\gamma_{i,1}$ or $\gamma_{i,2}$ is 1), it is generated from the state equation (3) or (4), with an innovation variance that depends on whether it is a VAR coefficient or an intercept. Finally, for the log-volatility processes, we draw the initial conditions $h_{i,0}$ and fix the innovation variances $\sigma_{h,i}^2$.
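The indicator design just described can be written down directly; the sketch below builds the pattern (the system size of twelve equations is inferred from Table 1):

```python
import numpy as np

# the four (gamma_{i,1}, gamma_{i,2}) combinations, repeated three times
combos = [(0, 0), (0, 1), (1, 0), (1, 1)]
gamma = np.array(combos * 3)                # one row per equation

# equations with time-varying VAR coefficients / impact-matrix elements
tv_coef = np.flatnonzero(gamma[:, 0])
tv_impact = np.flatnonzero(gamma[:, 1])
```

Repeating each combination at different positions in the recursive ordering lets the study separate the effect of the time-varying pattern itself from the effect of where it sits in the system.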
In the Monte Carlo study we use the priors described in Section 3.1 with the following hyperparameters. The prior means of the initial conditions $\beta_i$, $\alpha_i$ and $h_{i,0}$ are set to be zero, and their prior covariance matrices are diagonal. The prior on the square roots of the diagonal elements of $\Sigma_{\beta_i}$ is set so that the implied prior mean of each diagonal element depends on whether it is associated with a VAR coefficient or an intercept. Finally, we set the hyperparameters of the beta priors on $p_{i,1}$ and $p_{i,2}$ so that the prior modes are at 0 and 1, whereas the prior mean is 0.5.
Given a dataset and the priors described above, we estimate the hybrid VAR using the posterior sampler in Section 3.2 and obtain the posterior mode of each indicator. We repeat this procedure for all the datasets and compute the frequencies of the posterior modes of $\gamma_{i,1}$ and $\gamma_{i,2}$ being one, $i = 1, \ldots, n$. The results are reported in Table 1.
Overall, the posterior sampler works well and is able to recover the true time-varying structure in the simulated data on average. While it is harder to pin down the correct value of $\gamma_{i,2}$ than that of $\gamma_{i,1}$, the frequencies of identifying the true value of $\gamma_{i,2}$ are still reasonably good. In addition, these results improve substantially when the sample size increases. All in all, these Monte Carlo results confirm that the proposed hybrid model can recover salient patterns—such as time-varying conditional means and covariances—in the data.
To further investigate the effect of the beta prior on $p_{i,1}$ and $p_{i,2}$, we repeat the Monte Carlo experiments but assume a uniform prior on the unit interval, i.e., $p_{i,1}, p_{i,2} \sim \mathcal{U}(0, 1)$. Hence, both the prior means and modes are 0.5. The Monte Carlo results are similar to the baseline case and are reported in Appendix D.
Table 1: Frequencies of the posterior modes of $\gamma_{i,1}$ and $\gamma_{i,2}$ being one across the Monte Carlo replications, for the two sample sizes (smaller sample in the left pair of frequency columns, larger sample in the right pair).

Equation | True $\gamma_{i,1}$ | True $\gamma_{i,2}$ | $\gamma_{i,1}$ | $\gamma_{i,2}$ | $\gamma_{i,1}$ | $\gamma_{i,2}$
1  | 0 | – | 0.06 | –    | 0.05 | –
2  | 0 | 1 | 0.04 | 0.88 | 0.05 | 0.93
3  | 1 | 0 | 0.98 | 0.25 | 1.00 | 0.12
4  | 1 | 1 | 0.98 | 0.64 | 1.00 | 0.75
5  | 0 | 0 | 0.02 | 0.02 | 0.02 | 0.00
6  | 0 | 1 | 0.03 | 0.96 | 0.04 | 0.98
7  | 1 | 0 | 0.97 | 0.13 | 1.00 | 0.03
8  | 1 | 1 | 0.95 | 0.80 | 1.00 | 0.94
9  | 0 | 0 | 0.03 | 0.00 | 0.02 | 0.01
10 | 0 | 1 | 0.04 | 0.94 | 0.10 | 0.99
11 | 1 | 0 | 0.94 | 0.11 | 1.00 | 0.02
12 | 1 | 1 | 0.93 | 0.88 | 0.99 | 0.96
Next, we document the runtimes of estimating the hybrid TVP-VARs of different sizes to assess how well the posterior sampler scales to higher dimensions. More specifically, Table 2 reports the runtimes (in minutes) to obtain 1,000 posterior draws from the hybrid models of different dimensions and numbers of time periods. The posterior sampler is run on a standard desktop with an Intel Core i7-7700 @3.60 GHz processor and 64 GB memory. As a comparison, we also include the corresponding runtimes of fitting the TVP-VAR of Primiceri (2005) using the algorithm in Del Negro and Primiceri (2015). Note that the algorithm in Del Negro and Primiceri (2015) samples all the time-varying VAR coefficients in one block and tends to be very computationally intensive for larger systems. One potential solution is to develop an equation-by-equation estimation procedure similar to that in Carriero, Chan, Clark, and Marcellino (2021). Since that algorithm is designed for models with a constant contemporaneous impact matrix, extending it to handle the TVP-VAR of Primiceri (2005)—which features a time-varying contemporaneous impact matrix—would be an interesting future research direction.
Table 2: Runtimes (in minutes) to obtain 1,000 posterior draws for systems of increasing dimension, for the two sample sizes (left and right blocks of columns).

Hybrid TVP-VAR   | 4  | 29  | 94 | 8  | 59  | 188
Primiceri (2005) | 12 | 209 | –  | 25 | 415 | –
It is evident from the table that for typical applications with 15–30 variables, the proposed model can be estimated reasonably quickly. In addition, using the recursive representation that admits straightforward equation-by-equation estimation, fitting the proposed model is much faster than estimating the TVP-VAR of Primiceri (2005), even though the former is more flexible.
5 Application: Model Comparison and Forecasting
In this section we use a large US macroeconomic dataset to demonstrate the usefulness of the proposed model. After describing the dataset in Section 5.1, we first investigate in Section 5.2 how different variable orderings affect the estimates from the proposed hybrid TVP-VAR relative to the TVP-VAR of Primiceri (2005). We then present the full-sample results in Section 5.3. In particular, we conduct a formal Bayesian model comparison exercise to shed light on the time-varying patterns of the model parameters. We then consider a pseudo out-of-sample forecasting exercise in Section 5.4. We show that the forecast performance of the proposed model compares favorably to a range of standard benchmarks.
5.1 Data and Prior Hyperparameters
The US dataset for our empirical application consists of 20 quarterly variables with a sample period from 1959Q1 to 2018Q4. It is sourced from the FRED-QD database at the Federal Reserve Bank of St. Louis, as described in McCracken and Ng (2021). Our dataset contains a variety of standard macroeconomic and financial variables, such as real GDP, industrial production, inflation rates, labor market variables, money supply and interest rates. They are transformed to stationarity, typically to annualized growth rates. The complete list of variables and how they are transformed is given in Appendix C.
We use the priors described in Section 3.1. In particular, since the data are transformed to growth rates, we set the prior mean of $\beta$ to be zero, i.e., $\mathbf{m}_{\beta} = \mathbf{0}$. For the prior hyperparameters on $\kappa_1$ and $\kappa_2$, we use the same values as before, which imply that the prior mean of $\kappa_1$ is 0.04. For the hyperparameters of the initial conditions