Tests for Forecast Instability and Forecast Failure under a Continuous Record Asymptotic Framework

03/29/2018 ∙ by Alessandro Casini, et al. ∙ Boston University

We develop a novel continuous-time asymptotic framework for inference on whether the predictive ability of a given forecast model remains stable over time. We formally define forecast instability from the economic forecaster's perspective and highlight that the duration of the instability bears no relationship to that of the stable period. Our approach is applicable in forecasting environments involving low-frequency as well as high-frequency macroeconomic and financial variables. As the sampling interval between observations shrinks to zero, the sequence of forecast losses is approximated by a continuous-time stochastic process (i.e., an Itô semimartingale) possessing certain pathwise properties. We build a hypothesis testing problem based on the local properties of the continuous-time limit counterpart of the sequence of losses. The null distribution follows an extreme value distribution. While controlling the statistical size well, our class of test statistics features uniform power over the location of the forecast failure in the sample. The test statistics are designed to have power against general forms of instability and are robust to common forms of non-stationarity such as heteroskedasticity and serial correlation. The gains in power are substantial relative to extant methods, especially when the instability is short-lasting and when it occurs toward the tail of the sample.




1 Introduction

Since the seminal contribution of Klein (1969; 1971), economic forecasts have been built upon the presumption that the relationships between economic variables remain stable over time. However, recent decades have witnessed many socio-economic episodes and technological advancements that have led economists to reconsider the assumption of model stability. The compelling empirical evidence documented in, among others, Perron (1989) and Stock and Watson (1996) [see also the recent survey by Ng and Wright (2013)] has motivated the development of econometric methods that detect such instabilities (most work being directed toward structural changes) and estimate the actual dates at which economic relationships change. Yet, the issue of parameter instability is not limited to model estimation. In the forecasting literature, there is widespread agreement that the major issue preventing good forecasts of economic variables is parameter instability, with structural changes as a special case [cf. Banerjee, Marcellino, and Masten (2008), Clements and Hendry (1998, 2006), Elliott and Timmermann (2016), Giacomini (2015), Giacomini and Rossi (2015), Inoue and Rossi (2011), Clark and McCracken (2005), Pesaran, Pettenuzzo, and Timmermann (2006) and Rossi (2013a)].

This paper develops a statistical setting under infill asymptotics to address the issue of testing whether the predictive ability of a given forecast model remains stable over time. Ng and Wright (2013) and Stock and Watson (2003) explain that there is abundant evidence that a predictor that has performed well over a certain time period may not perform as well during subsequent periods. For example, Gilchrist and Zakrajšek (2012) proposed a new credit spread index and showed that a residual component labeled the excess bond premium (the credit spread adjusted for expected default risk) has considerable predictive content for future economic activity. They documented that this forecasting ability is stronger over the subsample 1985-2010 than over the full sample starting from 1973. [Footnote 1: They reported that structural change tests provide some statistical evidence for a break in a coefficient associated with financial indicators, more specifically the coefficient on the federal funds rate. Given the latter evidence and the well-documented change in the conduct of monetary policy in the late 1970s and the early 1980s, it seems plausible to split the sample in 1985 (see p. 1709 and footnote 11 in their paper).] The latter finding can be attributed to a more developed bond market in the 1985-2010 subsample. Relatedly, Giacomini and Rossi (2010) and Ng and Wright (2013) further examined this finding and found that the predictive ability of commonly used term and credit spreads is indeed unstable and somewhat episodic. The latter authors suggested that credit spreads may be more useful predictors of economic activity in a more highly leveraged economy and that recent developments in financial markets translate into credit spreads containing more information than they had previously. We refer to such temporal instability for a given forecasting method as forecast instability or, more specifically, as forecast failure.
These terms are not new to professional forecasters: they were informally introduced by Clements and Hendry (1998) and generalized in econometric terms by Giacomini and Rossi (2009), who interpreted forecast breakdown (or forecast failure) as a situation in which the out-of-sample performance of a forecast model significantly deteriorates relative to its in-sample performance. Our approach is to formally define forecast instability from the economic forecaster’s perspective. [Footnote 2: We use the terminology “instability” because not only the deterioration but also the improvement of the performance of a given forecast model over time can provide useful information to the forecaster.] We emphasize that a forecast failure may well result from a short period of instability within the out-of-sample period and need not be systematic in the sense of persisting throughout the whole out-of-sample period. That is, consistency of a forecast model’s performance with expected performance given the past should hold not only throughout the out-of-sample period but also in any sub-sample of it. Indeed, many documented episodes of forecast failure seem to have arisen from parameter nonconstancy in the data-generating process over relatively short time periods compared to the total sample size. Hence, the desire to focus on statistical tests able to detect short-lasting instabilities is intuitive: if a test for forecast failure needs the deterioration of the forecasting ability to last for, say, at least half of the total sample in order to have sufficiently high power to reject the null hypothesis, then this test would not perform very well in practice because instability can be short-lasting. Furthermore, the occurrence of recurrent structural instabilities or multiple breaks that compensate each other in the out-of-sample period might lead a forecast model to perform, on average, in a similar fashion as in the in-sample period. However, should a forecaster know about those recurrent changes, she would conceivably revise her forecast model to adapt to the unstable environment. Hence, we introduce the following definition.

Definition 1.1.

(Forecast Instability)

Forecast Instability refers to a situation of either sustained deterioration or improvement of the predictive ability of a given forecast model relative to its historical performance that would have led a forecaster to revise or reconsider her forecast model had she known of the occurrence of such instability. The time lengths of these two distinct periods need not bear any relationship. [Footnote 3: Forecast Failure constitutes a special case of the definition, namely a sustained deterioration of predictive ability.]

The definition places the economic forecaster at the center and consequently it is not merely a statistical definition; rather, it is based on an equilibrium concept. Since forecasting constitutes a decision-theoretic problem, it should be from the forecaster’s perspective that a given forecast model is deemed to have failed. The definition implicitly distinguishes between forecasting method and model. Two forecasters may share the same forecast model (the relationship between the variable of interest and the predictor) but use different methods (e.g., recursive scheme versus rolling scheme). Thus, instability refers to a given method-model pair. The object of the definition is predictive ability. Since the latter can be measured differently by different loss functions, the definition applies to a given choice of the loss function. A notable aspect of the definition is the reference to the time span of the historical performance and of the putative period of instability. They need not be related. Consider a given forecasting strategy which has performed well during, say, the Great Moderation (i.e., from the mid-1980s up to the beginning of the Great Recession in 2007). Assume that during the years 2007-2012 this method endures a spell of poor performance and returns to performing well thereafter. According to our definition, this episode constitutes an example of forecast instability. However, if one designs the forecasting exercise in such a way that half of the sample is used for estimation and the remaining half for prediction, then this relatively short period of instability gets “averaged out” by tests which simply compare the in-sample and out-of-sample averages. Conceivably, such tests would not reject the null hypothesis of no forecast failure, while it seems that a forecaster would have revised her strategy during the crisis had she known about such under-performance in the present and immediate future.
Finally, detection of forecast instability does not necessarily mean that a forecast model should be abandoned. In fact, its performance may have improved over time. Yet, even if forecast instability is induced by performance deterioration, a forecaster might not end up switching to a new predictor. For example, entering a state of high variability might lead to poor performance even if the forecast model is still correct. Hence, our definition uses the term reconsider. Continuing with the above example, a forecaster may reconsider the choice of the forecasting window since a longer window may now produce better forecasts while keeping the same forecast model. In other words, knowledge of forecast instability is important because it indicates that care must be exercised to assess the source of the changes. [Footnote 4: Economists have documented episodes of forecast failure in many areas of macroeconomics. In the empirical literature on exchange rates a prominent forecast failure is associated with the Meese and Rogoff puzzle [cf. Meese and Rogoff (1983), Cheung, Chinn, and Garcia Pascual (2005), and Rossi (2013b) for an up-to-date account]. In the context of inflation forecasting, forecast failures have been reported by Atkeson and Ohanian (2001) and Stock and Watson (2009). For forecast instability concerning other macroeconomic variables see the surveys of Stock and Watson (2003) and Ng and Wright (2013).]
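
To make the averaging-out argument concrete, the following minimal Python sketch uses a hypothetical, noise-free loss sequence. The numbers, the block length and the function names are illustrative assumptions and are not taken from the paper. It contrasts a global comparison of in-sample and out-of-sample averages with a local, block-based comparison.

```python
import numpy as np

def global_mean_gap(in_sample, out_sample):
    """Difference between the out-of-sample and in-sample average losses."""
    return float(np.mean(out_sample) - np.mean(in_sample))

def max_block_gap(in_sample, out_sample, block):
    """Largest absolute gap between a non-overlapping block average of
    out-of-sample losses and the in-sample average (illustrative only)."""
    base = np.mean(in_sample)
    n_blocks = len(out_sample) // block
    blocks = np.asarray(out_sample[: n_blocks * block]).reshape(n_blocks, block)
    return float(np.max(np.abs(blocks.mean(axis=1) - base)))

# Hypothetical losses: a stable level of 1.0 in both windows ...
in_sample = np.full(200, 1.0)
out_sample = np.full(200, 1.0)
# ... except for a short episode of poor performance (20 of 200 periods).
out_sample[150:170] = 3.0

gap_global = global_mean_gap(in_sample, out_sample)          # diluted by averaging
gap_local = max_block_gap(in_sample, out_sample, block=10)   # picks up the episode
print(gap_global, gap_local)
```

In this stylized example the short episode shifts the out-of-sample average by only 0.2, whereas the largest block-level deviation equals 2.0, which is the sense in which a global comparison can mask a short-lasting instability.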

The theoretical implication is that our tests for forecast instability are based on the local behavior of the sequence of realized forecast losses. This is in contrast to existing tests for forecast instability, and to classical structural change tests more generally, which instead rely on a global and retrospective methodology that merely compares the average of in-sample losses with the average of out-of-sample losses. While maintaining approximately correct nominal size, our class of test statistics achieves substantial gains in statistical power relative to previous methods. Furthermore, as the initial timing of the instability moves away from the middle of the sample toward the tail of the out-of-sample period, the gains in power become considerable.

In this paper, we set out a continuous record asymptotic framework for a forecasting environment where observations at equidistant time intervals are made over a fixed time span. These observations are realizations from a continuous-time model for the variable to be forecast and for the predictor. From these discretely observed realizations we compute a sequence of forecasts using either a fixed, recursive or rolling scheme. To this sequence of forecasts there corresponds a continuous-time process which satisfies mild regularity conditions and which, under the null hypothesis, possesses a continuous sample path. We exploit this pathwise property to base a hypothesis testing problem on the relative performance of a given forecast model over time. Under the null hypothesis we expect the sequence of losses to display a smooth and stable path. Any discontinuous or jump behavior followed by a (possibly short) period of substantial discrepancy from the same path over the in-sample period provides evidence against the null hypothesis. Our asymptotic theory involves a continuous record of observations where we let the sample size grow to infinity by shrinking the sampling interval to zero with the time span kept fixed, thereby approaching the continuous-time limit.

Our underlying probabilistic model is specified in terms of continuous Itô semimartingales, which are standard building blocks for the analysis of macro and financial high-frequency data [cf. Andersen, Bollerslev, Diebold, and Labys (2001), Andersen, Fusari, and Todorov (2016), Bandi and Renò (2016) and Barndorff-Nielsen and Shephard (2004)]; the theoretical methodology is thus related to that of Casini and Perron (2017a), Li, Todorov, and Tauchen (2017), Li and Xiu (2016) and Mykland and Zhang (2009). [Footnote 5: Recent work by Li and Patton (2017) extends standard methods for testing predictive accuracy of forecasts to a high-frequency financial setting.] The framework is not only useful for high-frequency data; in particular, recent work of Casini and Perron (2017a, 2017b) has adopted this continuous-time approach for modeling time series regression models with structural changes fitted to low-frequency data (e.g., macroeconomic data that are sampled at weekly, monthly, quarterly or annual frequency). They showed that this continuous-time approach delivers a better approximation to the finite-sample distributions of estimators in structural change models and that inference is more reliable than with previous methods based on classical long-span asymptotics.

The classical approach to economic forecasting for macroeconomic variables is to formulate models in discrete time and then base inference on long-span asymptotics where the sample size increases without bound and the sampling interval remains fixed [cf. Diebold and Mariano (1995), Giacomini and White (2006) and West (1996)]. There are crucial distinctions between this classical approach and the setting introduced in this paper. Under long-span asymptotics, identification of parameters hinges on assumptions on the distributions or moments of the studied processes [cf. the specification of the null hypothesis in Giacomini and Rossi (2009)], whereas within a continuous-time framework, unknown structural parameters are identified from the sample paths of the studied processes. Hence, we only need to assume rather mild pathwise regularity conditions for the underlying continuous-time model and avoid any ergodic or weak-dependence assumption. As in Casini and Perron (2017a), our framework encompasses any time series regression model allowing for general forms of non-stationarity such as heteroskedasticity and serial correlation.

Given a null hypothesis stated in terms of the path properties of the sequence of losses, we propose a test statistic which is based on the local behavior of the sequence of surprise losses, defined as the difference between the out-of-sample and in-sample losses. More specifically, our maximum-type statistic examines the smoothness of the sequence of surprise losses as the continuous-time limit is approached. Under the null hypothesis, the continuous-time analogue of the sequence of losses follows a continuous motion and any deviation from such a smooth path is interpreted as evidence against the null. The null distribution of the test statistic is non-standard and follows an extreme value distribution. Therefore, our limit theory exploits results from extreme value theory as elaborated by Bickel and Rosenblatt (1973) and Galambos (1987). [Footnote 6: In nonparametric change-point testing, related works are Wu and Zhao (2007) and Bibinger, Jirak, and Vetter (2017).]

We propose two versions of the test statistic: one that is self-normalized and one that uses an appropriate estimator of the asymptotic variance. The test statistic is defined as the maximal deviation between the average surprise losses over asymptotically vanishing time blocks. Further, we consider extensions of each of these statistics which use overlapping rather than non-overlapping blocks. Although the two versions should be asymptotically equivalent, the statistics based on overlapping blocks are more powerful in finite samples. In a framework where one allows for model misspecification, non-stationarity in the forecast losses, such as heteroskedasticity and serial correlation, should be taken seriously. Given the block-based form of our test statistics, we derive an alternative estimator of the long-run variance of the forecast losses. This estimator differs from the popular estimators of Andrews (1991) and Newey and West (1987) [see Müller (2007) for a review] and is of independent interest. Finally, we extend the results to settings that allow for stochastic volatility, conduct a local power analysis, and highlight a few differences between our testing framework and the structural change test of Andrews (1993). Related aspects, such as estimating the timing of the instability and covering high-frequency settings with jumps, are considered in a companion paper.
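
The finite-sample advantage of overlapping blocks can be illustrated with a small Python sketch. It uses an unnormalized maximum of absolute block averages as a simplified stand-in for a maximum-type statistic; it is not the statistic defined in Section 2.4, and the surprise-loss sequence, block length and function names are hypothetical.

```python
import numpy as np

def block_means(x, block, overlapping):
    """Averages of x over blocks of a given length.

    overlapping=True slides the block one observation at a time;
    overlapping=False uses disjoint consecutive blocks.
    """
    x = np.asarray(x, dtype=float)
    if overlapping:
        # Moving average over every window of length `block`.
        return np.convolve(x, np.ones(block) / block, mode="valid")
    n_blocks = len(x) // block
    return x[: n_blocks * block].reshape(n_blocks, block).mean(axis=1)

def max_deviation(surprise, block, overlapping):
    """Largest absolute block average of the surprise losses (unnormalized)."""
    return float(np.max(np.abs(block_means(surprise, block, overlapping))))

# Hypothetical surprise losses: zero except for a 10-period episode
# that straddles the boundary between two disjoint blocks of length 10.
surprise = np.zeros(200)
surprise[155:165] = 2.0

stat_disjoint = max_deviation(surprise, block=10, overlapping=False)
stat_overlap = max_deviation(surprise, block=10, overlapping=True)
print(stat_disjoint, stat_overlap)
```

With disjoint blocks the episode is split across two blocks and each block average is only 1.0, while one of the overlapping windows covers the episode entirely and yields 2.0. This is one simple mechanism by which overlapping blocks can raise power in finite samples.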

The rest of the paper is organized as follows. Section 2 introduces the statistical setting, the hypotheses of interest and the test statistics. Section 3 derives the asymptotic null distribution under a continuous record. We discuss the estimation of the asymptotic variance in Section 4. Some extensions and a local power analysis are presented in Section 5. Additional elements that are covered in our companion paper are briefly described in Section 6. A simulation study is contained in Section 7. Section 8 concludes the paper. The supplemental material to this paper contains all mathematical proofs and additional simulation experiments.

2 The Statistical Environment

Section 2.1 introduces the statistical setting with a description of the forecasting problem and the sampling scheme considered throughout. The underlying continuous-time model and its assumptions are introduced in Section 2.2. In Section 2.3 we set out the testing problem and state the relevant null and alternative hypotheses. The test statistics are presented in Section 2.4. Throughout we adopt the following notational conventions. All limits are taken as , or equivalently as , where is the sample size and is the sampling interval. All vectors are column vectors and for two vectors and , we write if the inequality holds component-wise. For a sequence of matrices we write if each of its elements is and likewise for If is a non-stochastic vector, denotes its Euclidean norm, whereas if is a stochastic vector, the same notation is used for the norm. We use to denote the largest smaller integer function and for a set the indicator function of is denoted by . A sequence is i.i.d. if the are independent and identically distributed. We use to denote convergence in probability and weak convergence, respectively. is used for the space of positive definite real-valued matrices whose elements are càdlàg. The symbol “” is definitional equivalence.

2.1 The Forecasting Problem

The continuous-time stochastic process is defined on a filtered probability space and takes values in where is the variable to be forecast and are the predictor variables. The index is defined as the continuous-time index and we have , where is referred to as the time span. In this paper, will remain fixed. That is, the unobserved process evolves within the fixed time horizon and the econometrician records of its realizations, with a sampling interval , at discrete-time points , where accordingly A continuous record asymptotic framework involves letting the sample size grow to infinity by shrinking the time interval to zero at the same rate so that remains fixed. The index is used for the observation (or tick) times .
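
The sampling scheme just described can be sketched in a few lines of Python. The names `span`, `n_obs` and `sampling_grid` are hypothetical, and the convention that observations fall at integer multiples of the sampling interval is an assumption made for illustration.

```python
import numpy as np

def sampling_grid(span, n_obs):
    """Return the sampling interval and the observation times on (0, span].

    The span is held fixed, so increasing n_obs shrinks the interval
    at the same rate (interval * n_obs == span).
    """
    h = span / n_obs                       # sampling interval
    times = h * np.arange(1, n_obs + 1)    # tick times h, 2h, ..., span
    return h, times

span = 1.0  # fixed time span (hypothetical unit)
for n_obs in (10, 100, 1000):
    h, times = sampling_grid(span, n_obs)
    # The interval shrinks while the last observation time stays at the span.
    print(n_obs, h, times[-1])
```

The loop mimics the infill limit: more observations are packed into the same horizon rather than the horizon being extended, which is the feature that distinguishes this setting from long-span asymptotics.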

The objective is to generate a series of -step ahead forecasts. We adopt an out-of-sample procedure that splits the time span into an in-sample and an out-of-sample window, and , respectively. [Footnote 7: Indeed, corresponds to the in-sample window only for the fixed forecasting scheme to be introduced later; e.g., the rolling scheme only uses the most recent span of data of length . A minor and straightforward modification to this notation should be applied when the recursive and rolling schemes are considered. However, for all methods indicates the artificial separation such that is the beginning of the out-of-sample period.] The latter two time horizons are supposed to be fixed and therefore within the in-sample (or prediction) window a sample of size is observed whereas within the out-of-sample (or estimation) window the sample is of size . We consider a general framework that allows for the three traditional forecasting schemes: (1) a fixed forecasting scheme with discrete-time observations ; (2) a recursive forecasting scheme where at time the prediction sample includes observations ; (3) a rolling forecasting scheme where the time span of the rolling window is fixed and of the same length as (i.e., at time the in-sample window includes observations ). [Footnote 8: Equivalently, the observation times within the rolling window at the th observation are .]
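
The three schemes differ only in which observations enter the in-sample window at each forecast origin. The Python sketch below makes this explicit; the function name, the 1-based index convention, and the omission of the forecast-step offset are simplifying assumptions rather than the paper's exact notation.

```python
def in_sample_window(scheme, k, m):
    """Inclusive 1-based (start, end) indices of the observations used
    to build the forecast made at observation k.

    scheme : "fixed", "recursive" or "rolling"
    k      : index of the forecast origin (k >= m)
    m      : number of observations in the initial in-sample window
    """
    if k < m:
        raise ValueError("forecast origin must not precede the in-sample window")
    if scheme == "fixed":
        return (1, m)            # estimated once, never updated
    if scheme == "recursive":
        return (1, k)            # expanding window
    if scheme == "rolling":
        return (k - m + 1, k)    # most recent m observations
    raise ValueError(f"unknown scheme: {scheme}")

# Example: initial window of 100 observations, forecast made at observation 150.
for scheme in ("fixed", "recursive", "rolling"):
    print(scheme, in_sample_window(scheme, k=150, m=100))
```

For instance, at observation 150 with an initial window of 100 observations, the fixed scheme still uses observations 1 to 100, the recursive scheme uses 1 to 150, and the rolling scheme uses 51 to 150.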

The forecasts may be based on a parametric model whose time- parameter estimates are then collected into the random vector . If no parametric assumption is made, then represents whatever semiparametric or nonparametric estimator is used for generating the forecasts. The time- forecast is denoted by , where is some measurable function. The notation indicates that the -time forecast is generated from information contained in a sample of size . [Footnote 9: varies with the forecasting scheme; e.g., for the rolling scheme we have while for the recursive scheme we have .]

Next, we introduce a loss function which serves to evaluate the performance of a given forecast model. More specifically, each out-of-sample loss constitutes a statistical measure of accuracy of the -step forecast made at time . However, given the objective of detecting potential instability of a certain forecasting method over time, we additionally need to introduce the in-sample losses where is an in-sample fitted value with varying over the specific in-sample window. That is, to each time- forecast there corresponds a sequence (indexed by ) of in-sample fitted values . [Footnote 10: We have for the fixed scheme, for the recursive scheme and for the rolling scheme.] Then, the testing problem turns into the detection of any “systematic difference” between the sequences of out-of-sample and in-sample losses; the formal measure of such a difference in our context is provided below.

2.2 The Underlying Continuous-Time Model

The process is a -valued semimartingale on and we further assume that all processes considered in this paper are càdlàg, adapted and possess a -a.s. continuous path on . [Footnote 11: For accessible treatments of the probabilistic elements used in this section we refer to Aït-Sahalia and Jacod (2014), Jacod and Shiryaev (2003), Jacod and Protter (2012), Karatzas and Shreve (1996) and Protter (2005).] The continuity property represents a key assumption in our setting and implies that is a continuous Itô semimartingale. The integral form for is given by,


where is a Wiener process, and are the drift and spot covariance process, respectively, and is -measurable. We incorporate model misspecification into our framework by allowing for a large non-zero drift which adds to the residual process:


where , is a standard Wiener process, is its associated volatility, and is -measurable. In (2.2), the last two terms on the right-hand side account for the residual part of which is not explained by where . We assume so that the factor inflates the infinitesimal mean of the residual component thereby approximating a setting with arbitrary misspecification.

Remark 2.1.

In (2.2), misspecification manifests itself in the form of a (time-varying) non-zero conditional mean of the residual process and gives rise to serial dependence in the disturbances, which in turn leads to dependence in the sequence of forecast losses. [Footnote 12: Asymptotically, these features can be dealt with using basic arguments from the high-frequency financial statistics literature; however, when is not too small one needs methods that are robust in finite samples to such misspecification-induced properties. More precisely, we will propose an appropriate estimator of the long-run variance of the sequence of forecast losses in Section 4.] Hence, this specification is similar in spirit to the near-diffusion assumption of Foster and Nelson (1996) who studied the impact of misspecification in ARCH models. On the other hand, Casini and Perron (2017a) introduced a “large-drift” asymptotics with to deal with non-identification of the drift in their context. Technically, the latter specification implies that as becomes small the drift features larger oscillations that add to the local Gaussianity of the stochastic part. Casini and Perron (2017a) referred to this specification as the small-dispersion assumption. Finally, note that the presence of can also be related to the signal plus small Gaussian noise model of Ibragimov and Has’minskiǐ (1981) if one sets in their model in Section VII.2.

Assumption 2.1.

We have the following assumptions: (i) The processes and have -a.s. continuous sample paths; (ii) The processes and are locally bounded; (iii) There exists such that -a.s. and with ; (iv) and and the conditional variance (or spot covariance) is defined as , which for all satisfies where denotes the -th element of . Furthermore, for every and , the quantity is bounded away from zero and infinity, uniformly in and ; (v) The disturbance process is orthogonal (in the martingale sense) to identically for all . [Footnote 13: The angle brackets notation is used for the predictable quadratic variation process.]

Part (i) rules out jump processes from our setting. We relax this restriction in our companion paper; see Section 6. Part (ii) restricts those processes to be locally bounded. These should be viewed as regularity conditions rather than assumptions and are standard in the financial econometrics literature [see Barndorff-Nielsen and Shephard (2004), Li and Xiu (2016) and Li, Todorov, and Tauchen (2017)]; recently, they have been used by Casini and Perron (2017a) in the context of structural change models.

The continuous-time model in (2.1)-(2.2) is not observable. The econometrician only has access to realizations of and with a sampling interval over the horizon . For each is a random vector step function that jumps only at time , and so on. The discretized processes and are assumed to be adapted to the increasing and right-continuous filtration . The increments of a process are denoted by . A seminal result known as the Doob-Meyer decomposition [the original sources are Doob (1953) and Meyer (1967); see also Section III.3 in Protter (2005)] allows us to decompose the semimartingale process into a predictable part and a local martingale part. Hence, it follows that we can write for , where the drift is measurable, and is a continuous local martingale with finite conditional covariance matrix -a.s. . Turning to equation (2.2), the error process , with , is then a continuous local martingale difference sequence taking its values in with finite conditional variance -a.s. Therefore, we express the discretized analogue of (2.2) as

Remark 2.2.

As explained above, we accommodate possible model misspecification by adding the component . In the forecasting literature, one often directly imposes restrictions on the sequence of losses, say, where is a forecast error. There are two main differences from our approach. First, in order to facilitate the exposition of our novel framework, we have chosen, without loss of generality, to express directly the relationship between and while, at the same time, allowing for misspecification by including . A second distinction from the classical approach is that the latter imposes restrictions on the sequence of losses such as mixing and ergodicity conditions, covariance stationarity and so on. In contrast, our infill asymptotics does not require us to impose any ergodic or mixing condition [cf. Casini and Perron (2017a)].

Finally, we have an additional assumption on the path of the volatility process . This turns out to be important because it partly affects the local behavior of the forecast losses.

Assumption 2.2.

For small , define the modulus of continuity of on the time horizon by We assume that for some sequence of stopping times and some -a.s. finite random variable


The assumption essentially states that is locally bounded and is Lipschitz continuous. Lipschitz volatility is a more than reasonable specification for the macroeconomic and financial data to which our analysis is primarily directed. Indeed, the basic case of constant variance is easily accommodated by the assumption. Time-varying volatility is also covered provided is sufficiently smooth. However, the assumption rules out some standard stochastic volatility models often used in finance. We relax that assumption in Section 5, so that we can extend our results to, for example, stochastic volatility models driven by a Wiener process.

2.3 The Hypotheses of Interest

As time evolves, a forecast model can suffer instability for multiple reasons. However, incorporating model misspecification into our framework necessarily implies that the exact form of the instability is unknown and thus one has to leave it unspecified. This differs from the classical setting for estimation of structural change models [cf. Bai and Perron (1998) and Casini and Perron (2017a)] where (i) the break date is well-defined as it is part of the definition of the econometric problem, and (ii) the form of the instability is explicitly specified through a discrete shift in a regression parameter. In contrast, in our context we remain agnostic regarding both (i) and (ii). There may be multiple dates at which the forecast model suffers instability and they might be interrelated in a complicated way. Forecast instability may manifest itself in several forms, including gradual, smooth or recurrent changes in the predictive relationship between and ; certainly, there could also be discrete shifts in (arguably the most common case in practice) but this is only a possibility in our setting and not an assumption as in structural change models. A forecast failure then reflects the forecaster’s failure to recognize the shift in the predictive power of on . On the other hand, even if one can rule out shifts in , a forecast instability may be induced by an increase or decrease in the uncertainty in the data which might result, for example, from changes in the unconditional variance of the target variable. In this case, the predictive ability of on , as described for instance by a parameter remains stable, while due to an increase in the unconditional variance of it might become weak and in turn the forecasting power might break down.
Tests for forecast failure such as those proposed in this paper and those proposed in Giacomini and Rossi (2009) are designed to have power against both of the above forms of instability. [Footnote 14: Recently, Perron and Yamamoto (2018) proposed to apply modified versions of classical structural break tests to the forecast failure setting. However, their testing framework and hence their null hypotheses are different from ours because they do not fix a model-method pair but only fix the forecast model under the null.]

2.3.1 The Null and Alternative Hypotheses on Forecast Instability

Define at time a surprise loss given by the deviation between the time- out-of-sample loss and the average in-sample loss: for , where is the average in-sample loss computed according to the specific forecasting scheme. One can then define the average of the out-of-sample surprise losses


where denotes the time span of the out-of-sample window. [Footnote 15: By definition is fixed and should not be confused with which indicates the number of observations in the out-of-sample window. Indeed, ] In the classical discrete-time setting, under the hypothesis of no forecast instability one would naturally test whether has zero mean, where is the pseudo-true value of . If the forecasting performance remains stable throughout the whole sample then there should be no systematic surprise losses in the out-of-sample window and thus . This reasoning motivated the forecast breakdown test of Giacomini and Rossi (2009). Therefore, under the classical asymptotic setting one exploits time series properties of the process such as ergodicity and mixing together with the representation of the hypotheses by a global moment restriction. [Footnote 16: Global refers to the property that the zero-mean restriction involves the entire sequence of forecast losses.] By letting the span , this method underlies the classical approach to statistical inference but does not directly extend to an infill asymptotic setting. Under continuous-time asymptotics, identification of parameters is achieved by properties of the paths of the involved processes and not by moment conditions. This constitutes the key difference and requires one to recast the above hypotheses into an infill setting, thereby making use of assumptions on an underlying continuous-time data-generating mechanism which is assumed to govern the observed data.
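The construction of the surprise losses described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function names are hypothetical, a fixed forecasting scheme is assumed (a single in-sample average is subtracted from every out-of-sample loss), and the average is normalized by the number of out-of-sample observations rather than by the time span, since the exact normalization is not reproduced in this section.

```python
import numpy as np


def surprise_losses(in_sample_losses, out_sample_losses):
    """Out-of-sample loss minus the average in-sample loss.

    Assumption: fixed scheme, so one in-sample average is used throughout.
    """
    in_sample_losses = np.asarray(in_sample_losses, dtype=float)
    out_sample_losses = np.asarray(out_sample_losses, dtype=float)
    return out_sample_losses - in_sample_losses.mean()


def average_surprise_loss(in_sample_losses, out_sample_losses):
    """Sample average of the surprise losses (normalization is an assumption)."""
    return surprise_losses(in_sample_losses, out_sample_losses).mean()
```

Under stable forecasting performance, the value returned by `average_surprise_loss` would be expected to be close to zero, which is the global moment restriction exploited in the classical discrete-time setting.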

We begin by observing that the sequence of losses can be viewed as realizations from an underlying continuous-time process , with . That is, consists of temporally integrated forecast losses where is the loss at time and is defined by some transformation of the target variable and of the predictor . [Footnote 17: The definition of uses that so long as the forecast step is small and finite one can approximate by for sufficiently small .] In order to provide a general theory, we focus on families of loss functions that depend only on the forecast error. [Footnote 18: The most popular loss functions used in economic forecasting are within this category [see Elliott and Timmermann (2016) for a recent incisive account of the literature]. Extension to ad hoc loss functions requires specific treatment that might vary from case to case.] We denote this class by and we say that the loss function if for all , where . The class comprises the vast majority of loss functions employed in empirical work, including among others the popular Quadratic loss, Absolute error loss and Linex loss. The following examples illustrate how these loss functions are constructed under our setting. For the rest of this section, assume for simplicity and that is one-dimensional in (2.2).


(QL: Quadratic Loss)
The Mean Squared Error or Quadratic loss function is symmetric and is by far the most commonly used by practitioners. Given (2.2), we have . Then or with .


(: Linex Loss)
The Linear-exponential or Linex loss was introduced by Varian (1975) and is an example of an asymmetric loss function. By the same reasoning as in the Quadratic loss case, we have or with .
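To make the two examples concrete, the following Python sketch evaluates both losses as functions of the forecast error only. The Linex parameterization used here, exp(a e) - a e - 1 with asymmetry parameter `a`, is the common textbook form and is an assumption; the exact expression used in this paper is not reproduced in this section.

```python
import numpy as np


def quadratic_loss(e):
    """Quadratic loss: symmetric in the forecast error e."""
    return np.square(np.asarray(e, dtype=float))


def linex_loss(e, a=1.0):
    """Linex loss exp(a*e) - a*e - 1 (assumed textbook parameterization).

    For a > 0, positive errors are penalized more heavily than negative
    errors of the same magnitude; the loss is zero at e = 0.
    """
    e = np.asarray(e, dtype=float)
    return np.exp(a * e) - a * e - 1.0
```

Both functions depend on the data only through the forecast error, which is the defining property of the class of loss functions considered here.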

Below we make very mild pathwise assumptions on the process which imply restrictions on . We derive asymptotic results under Lipschitz continuity (in ) of the coefficients of the system of stochastic differential equations driving the data . We apply the techniques of stochastic calculus to formulate our testing problem. To avoid clutter, we introduce the notation and its shorthand . [Footnote 19: The notation implicitly assumes that the same loss function is used for estimation and prediction, which in turn implies that the subscript in can be omitted since it can be understood from that of the argument .] By Itô’s Lemma [cf. Section II.7 in Protter (2005)], under smoothness of ,

Let denote the expectation conditional on the path . The instantaneous mean of is . Note that the latter is a symbolic abbreviation for

Since the coefficients of the original system of stochastic equations are Lipschitz continuous in , one can verify that is also Lipschitz upon regularity conditions on and time- information.

We denote by the class of Lipschitz continuous functions on . Let denote a continuous-time stochastic process that is -a.s. locally bounded and adapted.

Definition 2.1.

The process belongs to if for some sequence of stopping times and some -a.s. finite random variable .

We are in a position to formulate the testing problem in terms of the pathwise property of . This implies that the hypotheses are specified in terms of random events, which differs from classical hypothesis testing but is typical under continuous-time asymptotics; see Aït-Sahalia and Jacod (2012) (for many references), Li, Todorov, and Tauchen (2016) and Reiß, Todorov, and Tauchen (2015). We consider the following hypotheses: for any , [Footnote 20: Precise assumptions will be stated below.]


which means that we wish to discriminate between the following two events that divide

The dependence of the hypotheses on is appropriate because each event generates a certain path of on , where . The hypotheses require a Lipschitz condition to hold on , where is the usual artificial separation date after which the first forecast is made. is taken as given here because the testing problem applies to a specific method-model pair and is part of the chosen forecasting method. From a practical standpoint, it would be helpful if this separation date were such that the forecast model is stable on [see Casini and Perron (2017c) for more details]. The latter property is, however, unknown a priori to the practitioner. We cover this case in Section 6.


(QL; cont’d)
For the Quadratic loss, Itô’s Lemma yields . If is Lipschitz continuous, then the hypothesis holds.


(; cont’d)
From Itô’s Lemma, . Consequently, by the Itô Isometry [cf. Section 3.3.2 in Karatzas and Shreve (1996) or Lemma 3.1.5 in Øksendal (2000)] and , the hypothesis is seen to hold under Lipschitz continuity of . [Footnote 21: Recall that a composition of Lipschitz functions is Lipschitz and that in our context is Lipschitz because (i) is locally bounded and Lipschitz, and (ii) and remains fixed.]

We have reduced the forecast instability problem to an examination of the local properties of the path of . However, we still have to face the question of how to use the data to test in practice. Even if we could observe , it would not be clear how to formulate a testing problem on the stability of by using path properties of . The reason is that is always absolutely continuous by definition, and thus it would provide little information on the large deviations of the forecast error . In order to study the local behavior of one needs to consider the small increments of close to time . Leaving the definition of aside for a moment, observe that -a.s. continuity of is equivalent to having the relationship between and holding for any infinitesimal interval of time. For the basic parametric linear model: . Then, the forecast loss is , which is difficult to interpret in rigorous probabilistic terms. However, we can consider its discrete-time analogue. We normalize the forecast error by the factor and redefine . [Footnote 22: Alternatively, .] Then, for all , the mean of , conditional on , depends on the parameters of the model and its local behavior can be used as a proxy for the local behavior of the infinitesimal mean of . If the corresponding structural parameters of the continuous-time data-generating process satisfy Lipschitz continuity in , then, knowing , also should be Lipschitz in the continuous-time limit. Under the hypothesis there should be no break in and an appropriately defined right local average of

should not differ too much from its left local average. That is, one can test for forecast instability by using a two-sample t-test over asymptotically vanishing time blocks.
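The idea of comparing right and left local averages can be illustrated with a generic two-sample (Welch-type) t-statistic computed over adjacent, non-overlapping blocks. This is a sketch of the general principle only, not the statistic proposed in this paper; the function name and the `block_size` argument are hypothetical, and `block_size` must be at least 2 so that within-block variances are defined.

```python
import numpy as np


def adjacent_block_tstats(x, block_size):
    """Two-sample t-statistics comparing each block mean with the previous one.

    Generic Welch-type comparison; entry j compares block j+1 (right)
    with block j (left). Trailing observations that do not fill a block
    are dropped.
    """
    x = np.asarray(x, dtype=float)
    n_blocks = x.size // block_size
    if block_size < 2 or n_blocks < 2:
        raise ValueError("need block_size >= 2 and at least two full blocks")
    blocks = x[: n_blocks * block_size].reshape(n_blocks, block_size)
    means = blocks.mean(axis=1)
    variances = blocks.var(axis=1, ddof=1)  # unbiased within-block variance
    std_err = np.sqrt((variances[1:] + variances[:-1]) / block_size)
    return (means[1:] - means[:-1]) / std_err


# Usage: a level shift at observation 200 shows up in the statistic that
# compares the blocks on either side of the shift.
rng = np.random.default_rng(0)
losses = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(5.0, 1.0, 200)])
tstats = adjacent_block_tstats(losses, block_size=50)
```

In the asymptotic experiment described in the text the block length shrinks in calendar time while the number of observations per block grows, which this finite-sample sketch does not attempt to reproduce.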


(QL; cont’d)
Conditional on , . Thus, . If is Lipschitz continuous, then the hypothesis holds.


(; cont’d)
Similar to the Quadratic loss case, we have . Again, the hypothesis is satisfied if is Lipschitz.

Both examples demonstrate that pathwise assumptions on the data-generating process imply restrictions on the properties of the sequence of loss functions. For the QL example, if there is a structural break at the observation then this would result in the mean of shifting to a new level after time . Given that the same reasoning extends to the sequence of surprise losses, one may consider constructing a test statistic on the basis of the local behavior of the surprise losses over time. If there is no instability in the predictive ability of a certain model, then the sequence of out-of-sample surprise losses should display a certain stability. Under the framework of Giacomini and Rossi (2009), this stability is interpreted in a retrospective and global sense as a zero-mean restriction on the sequence over the entire out-of-sample window. In contrast, under our continuous-time setting, this stability manifests itself as a continuity property of the path of the continuous-time counterpart of the sequence.

2.4 The Test Statistics

By inspection of the null hypotheses in (2.5), it is evident that a considerable number of forms of instabilities are allowed. These may result from discrete shifts in a model’s structural parameter and/or in structural properties of the processes considered such as conditional and unconditional moments and so on. This first set of non-stationarities relates to the popular case of structural changes which are designed to be detected with high probability by the structural break tests of, among others, Andrews (1993) and Andrews and Ploberger (1994), Bai and Perron (1998) and Elliott and Müller (2006) in univariate settings and of Qu and Perron (2007) in multivariate settings. However, a forecast instability may be generated by many other forms of non-stationarities for which such classical tests for structural breaks are not designed and against which they might consequently have little power. For example, consider the case of smooth changes in model parameters, or in the unconditional variance of . Even more serious would be the presence of recurrent smooth changes in the marginal distribution of the predictor since in this case the above-mentioned tests are likely to falsely reject too often [cf. Hansen (2000)]. Thus, the null hypothesis of no forecast instability calls for a new statistical hypothesis testing framework. Ideally, in this context one needs a test statistic that retains power against any discontinuity, jump and recurrent switch at any point in the out-of-sample window and for any magnitude of the shift. We propose a test statistic which aims asymptotically at distinguishing any discontinuity from a regular Lipschitz continuous motion. We introduce a sequence of two-sample t-tests over asymptotically vanishing adjacent time blocks. This should lead to significant gains in power whenever on fixed time intervals the out-of-sample losses exhibit instabilities of any form such as breaks, jumps and relatively large deviations. 
Such gains are likely to occur especially when instabilities take place within a small portion of the sample relative to the whole time span—a common case in practice that has characterized many episodes of forecast failure in economics.

Interestingly, for the Quadratic loss function we can exploit properties of the local quadratic variation and propose a self-normalized test statistic. Thus, we separate the discussion on the Quadratic loss from that on general loss functions. Let , . Next, we partition the out-of-sample into blocks each containing observations. Let and for .

2.4.1 Test Statistics under Quadratic Loss

We propose the following statistic

The quantity is a local average of the surprise losses within the block . We have partitioned the out-of-sample window into blocks of asymptotically vanishing length . We consider an asymptotic experiment in which the number of blocks increases at a controlled rate to infinity while the per-block sample size grows without bound at a slower rate than the out-of-sample size . The appeal of the statistic is that a large deviation suggests the existence of either a discontinuity or a non-smooth shift in the surprise losses close to time and thus it provides evidence against . We comment on the nature of the normalization in the denominator of below, after we introduce a version of the statistic that uses all admissible overlapping blocks of length :

where . Since under the alternative hypotheses the exact location of the change-point (or possibly the locations of the multiple change-points) within the block might actually affect the power of the -based test in small samples, we indeed find in our simulation study that the test statistic which uses overlapping blocks is more powerful, especially when the instability arises in forms other than the simple one-time structural change. Thus, the power of the test is slightly sensitive to the actual location of the change-point within the block, with higher power achieved when the change-point is close to either the beginning or the end of the block. In contrast, the statistical power of is uniform over the location of the change-point in the sample. The latter property is not shared by the existing test of Giacomini and Rossi (2009), given that its power tends to be substantially lower if the instability is not located near the middle of the sample.
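The difference between non-overlapping and overlapping blocks can be illustrated with the following Python sketch. It computes only the maximal absolute difference between adjacent block means and deliberately omits any normalization, because the denominators of the statistics are not reproduced in this section; the function names and the `block_size` argument are hypothetical, and the sketch should not be read as the paper's statistics.

```python
import numpy as np


def block_means(x, block_size, overlapping=False):
    """Means of non-overlapping blocks, or of all windows of length block_size."""
    x = np.asarray(x, dtype=float)
    if overlapping:
        csum = np.concatenate(([0.0], np.cumsum(x)))
        return (csum[block_size:] - csum[:-block_size]) / block_size
    n_blocks = x.size // block_size
    return x[: n_blocks * block_size].reshape(n_blocks, block_size).mean(axis=1)


def max_adjacent_deviation(x, block_size, overlapping=False):
    """Largest absolute gap between a block mean and the adjacent block mean.

    Unnormalized illustration: with overlapping=True every admissible pair
    of adjacent windows is compared, not only those on a fixed grid.
    """
    means = block_means(x, block_size, overlapping)
    step = block_size if overlapping else 1  # adjacent windows are block_size apart
    if means.size <= step:
        raise ValueError("need at least two adjacent blocks")
    return np.abs(means[step:] - means[:-step]).max()


# Usage: a unit level shift at observation 125 falls in the middle of the
# third non-overlapping block of length 50.
x = np.zeros(200)
x[125:] = 1.0
grid_value = max_adjacent_deviation(x, 50)                       # 0.5
sliding_value = max_adjacent_deviation(x, 50, overlapping=True)  # 1.0
```

In this example the fixed grid recovers only half of the shift because the change-point lies mid-block, whereas the overlapping version finds a pair of adjacent windows aligned with the change-point, in line with the sensitivity to the within-block location discussed above.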

An important characteristic of both and is that they are self-normalized; no asymptotic variance appears in their definition. The reason why appears in the denominator of, for example, is that even though constitutes a more logical self-normalizing term, it might be close to zero in some cases. This would occur under Quadratic loss if, for example, for all . This is not true for the factor .

In addition, observe that allowing for misspecification naturally leads one to deal carefully with artificial serial dependence in the forecast losses in small samples. Thus, we consider a version of the statistics and that are normalized by their asymptotic variance: and similarly,

The quantity standardizes the test statistic so that under the null hypotheses we obtain a distribution-free limit. This can be useful because given the fully non-stationary setting together with the possible consequences of misspecification in finite-samples, standardization by the square-root of the asymptotic variance might lead to a more precise empirical size in small samples. We relegate theoretical details on as well as on its estimation to Section 4 where we also present a discussion about its relation with the choice of the number of blocks.

2.4.2 Test Statistics under General Loss Function

For general loss , we propose the following statistic,

where are defined as in the quadratic case and

with . The interpretation of is essentially the same as that of , the only difference arising from the denominator, which estimates the within-block variance. A version that uses all overlapping blocks is

where , with