1 Mastering Panel ’Metrics
1.1 The Setting
We consider the dynamic linear panel data model

$$Y_{it} = D_{it}'\alpha + X_{it}'\beta + a_i + b_t + \varepsilon_{it}, \tag{1}$$

where $i = 1, \dots, N$ and $t = 1, \dots, T$. Here $Y_{it}$ is the outcome for an observational unit $i$ at time $t$, $D_{it}$ is a vector of variables of interest or treatments, whose predictive effect $\alpha$ we would like to estimate, $X_{it}$ is a vector of covariates or controls including a constant and lags of $Y_{it}$, $a_i$ and $b_t$ are unobserved unit and time effects that can be correlated with $D_{it}$, and $\varepsilon_{it}$ is an error term normalized to have zero mean for each unit and time that satisfies the weak exogeneity condition

$$\varepsilon_{it} \perp I_{it}, \qquad I_{it} := \{(D_{is}, X_{is}, b_s)_{s=1}^{t}, a_i\}. \tag{2}$$

We assume that the vectors $Z_i := \{(Y_{it}, D_{it}', X_{it}')'\}_{t=1}^{T}$, which collect these variables for the observational unit $i$, are i.i.d. across $i$, and make other conventional regularity assumptions. The main challenge in the estimation of panel data models is how to deal with the unobserved effects. We review two approaches.
1.2 The Fixed Effects Approach
This approach treats the unit and time effects as parameters to be estimated by applying OLS in the model:

$$Y_{it} = D_{it}'\alpha + \tilde X_{it}'\gamma + \varepsilon_{it},$$

where $\tilde X_{it} := (X_{it}', Q_i', Q_t')'$, $Q_i$ is an $N$-dimensional vector of indicators for observational units with a 1 in the $i$-th position and 0's otherwise, and $Q_t$ is a $T$-dimensional vector of indicators for time periods with a 1 in the $t$-th position and 0's otherwise. The elements of $\gamma$ appearing in front of $Q_i$ and $Q_t$ are called unit fixed effects and time fixed effects, respectively. The resulting estimator is the fixed effect (FE) estimator. For our purposes, it can be seen as an exactly identified GMM estimator with the score function

$$g(Z_i, \alpha, \gamma) = \{(Y_{it} - D_{it}'\alpha - \tilde X_{it}'\gamma) M_{it}\}_{t=1}^{T},$$

where $M_{it} := (D_{it}', \tilde X_{it}')'$.
The FE estimator is biased with bias of order $1/T$, due to estimation of many ($N$) nuisance parameters with $NT$ observations, and the bias decreases as $T$ becomes large. The estimator approaches the true value as both $N$ and $T$ become large, but unfortunately the bias of the estimator is too big relative to the order $1/\sqrt{NT}$ of the stochastic error, resulting in an invalid assessment of the statistical significance of the estimates. This necessitates the use of bias correction to restore the validity of the statistical inference.
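The order-$1/T$ bias can be seen in a small simulation. The sketch below is only an illustration under simplifying assumptions that are not part of the text: a pure AR(1) panel with unit effects, no treatment, no other covariates and no time effects. The function names `simulate_ar1_panel` and `fe_ar1` are hypothetical helpers written for this example.

```python
import random


def simulate_ar1_panel(n_units, n_periods, rho, seed=0, burn_in=50):
    """Simulate y_it = a_i + rho * y_{i,t-1} + eps_it with standard normal shocks.

    Returns one list per unit holding n_periods + 1 levels (initial value first).
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        a_i = rng.gauss(0.0, 1.0)          # unit effect
        y = a_i / (1.0 - rho)              # start at the unit-specific mean
        for _ in range(burn_in):           # burn-in towards the stationary distribution
            y = a_i + rho * y + rng.gauss(0.0, 1.0)
        series = [y]
        for _ in range(n_periods):
            y = a_i + rho * y + rng.gauss(0.0, 1.0)
            series.append(y)
        panel.append(series)
    return panel


def fe_ar1(panel):
    """Within (fixed effects) estimator of rho: OLS after unit-level demeaning."""
    num = den = 0.0
    for series in panel:
        lagged, current = series[:-1], series[1:]
        mean_lag = sum(lagged) / len(lagged)
        mean_cur = sum(current) / len(current)
        for y_lag, y_cur in zip(lagged, current):
            num += (y_lag - mean_lag) * (y_cur - mean_cur)
            den += (y_lag - mean_lag) ** 2
    return num / den


if __name__ == "__main__":
    rho = 0.5
    for n_periods in (5, 10, 20, 40):
        estimate = fe_ar1(simulate_ar1_panel(500, n_periods, rho, seed=1))
        print(f"T={n_periods:2d}  FE estimate={estimate:.3f}  bias={estimate - rho:+.3f}")
```

In runs of this kind the FE estimate sits below the true coefficient and the gap shrinks as the number of periods grows, which is the qualitative pattern described above.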
1.3 The AB Approach
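This approach eliminates the unit effects $a_i$ by taking differences across time and uses moment conditions for the variables in differences. Specifically, define the differencing operator $\Delta$ acting on doubly indexed random variables $V_{it}$ by creating the difference $\Delta V_{it} := V_{it} - V_{i,t-1}$. Apply this operator to both sides of (1) to obtain:

$$\Delta Y_{it} = \Delta D_{it}'\alpha + \Delta X_{it}'\beta + \Delta b_t + \Delta\varepsilon_{it}, \tag{3}$$

where $\Delta\varepsilon_{it} = \varepsilon_{it} - \varepsilon_{i,t-1}$. Note that by (2),

$$\Delta\varepsilon_{it} \perp I_{i,t-1}.$$

This means that estimation and inference can be done using an overidentified GMM with score function

$$g(Z_i, \alpha, \beta, b) = \{(\Delta Y_{it} - \Delta D_{it}'\alpha - \Delta X_{it}'\beta - \Delta b_t) M_{it}\}_{t=2}^{T},$$

where $b := (b_1, \dots, b_T)'$ and $M_{it}$ collects the instruments available at time $t$, namely the observed variables in $I_{i,t-1}$. This is the Arellano and Bond (1991) estimator.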
The AB estimator enjoys good properties when $T$ is very small, but when $T$ is even modestly large, it uses many ($m = O(T^2)$) moment conditions, which results in a bias of order $T/N$, which can be too large relative to the size $1/\sqrt{NT}$ of the stochastic error of the estimator. In the latter case statistical inference becomes invalid, and we need to employ bias correction methods to restore its validity.
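To make the idea of moment conditions in differences concrete, the sketch below uses a deliberately simplified setting that is an assumption of this example, not the estimator used in the text: a pure AR(1) panel without treatment or time effects, estimated with a single instrument (the level lagged twice) instead of the full AB instrument set. The helper `ab_moment_count` counts how many moment conditions arise in that AR(1) case when every available lagged level is used, which shows the quadratic growth in the number of periods. All function names are hypothetical.

```python
import random


def simulate_ar1_panel(n_units, n_periods, rho, seed=0, burn_in=50):
    """Simulate y_it = a_i + rho * y_{i,t-1} + eps_it; n_periods + 1 levels per unit."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        a_i = rng.gauss(0.0, 1.0)
        y = a_i / (1.0 - rho)
        for _ in range(burn_in):
            y = a_i + rho * y + rng.gauss(0.0, 1.0)
        series = [y]
        for _ in range(n_periods):
            y = a_i + rho * y + rng.gauss(0.0, 1.0)
            series.append(y)
        panel.append(series)
    return panel


def iv_diff_ar1(panel):
    """Just-identified IV estimator in first differences.

    Regresses Delta y_it on Delta y_{i,t-1}, using the level y_{i,t-2} as instrument.
    Differencing removes the unit effect a_i.
    """
    num = den = 0.0
    for series in panel:
        for t in range(2, len(series)):
            d_cur = series[t] - series[t - 1]
            d_lag = series[t - 1] - series[t - 2]
            instrument = series[t - 2]
            num += instrument * d_cur
            den += instrument * d_lag
    return num / den


def ab_moment_count(n_levels):
    """Moment conditions in a pure AR(1) model observed for n_levels periods when
    the equation for period t = 3, ..., n_levels uses levels 1, ..., t - 2."""
    return sum(t - 2 for t in range(3, n_levels + 1))


if __name__ == "__main__":
    panel = simulate_ar1_panel(2000, 8, rho=0.5, seed=2)
    print("IV-in-differences estimate:", round(iv_diff_ar1(panel), 3))
    for n_levels in (5, 10, 20):
        print(n_levels, "periods ->", ab_moment_count(n_levels), "moment conditions")
```

The single-instrument version trades efficiency for simplicity; using all lags, as in the AB approach, is what makes the number of moment conditions grow quickly with the time dimension.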
1.4 GMM under High Dimensionality and Need for Bias Corrections
Both FE and AB are GMM estimators in a high-dimensional regime, with either the number of nuisance parameters or the number of moment equations being large.
In the FE approach, the dimension of $\alpha$ is low, but the dimension of the nuisance parameter $\gamma$ is high. We can approximate this situation as $p := \dim(\gamma) \to \infty$ when $n \to \infty$, while $d := \dim(\alpha)$ is held fixed, where $n = NT$ denotes the number of observations. In the AB approach, the number of moment conditions, $m$, could be high, so we can approximate this situation as $m \to \infty$ when $n \to \infty$.
In either regime, there exist regularity conditions such that if $(p \vee m)^2$ is small compared to $n$:^{1}

$$(p \vee m)^2 / n \to 0 \quad \text{as } n \to \infty, \tag{4}$$

then the standard approximate normality and consistency results of the GMM estimator continue to hold, namely

$$\sqrt{n}(\hat\alpha - \alpha) \overset{a}{\sim} N(0, V_{11}), \tag{5}$$

where $V_{11}$ is the $d \times d$ upper-left block of the asymptotic variance of the GMM estimator corresponding to $\hat\alpha$.^{2} The key rate condition (4) can be interpreted as the small bias condition. This condition fails to hold in the FE approach where $p = O(N)$ and $n = O(NT)$, and in the AB approach when $T$ is large because $m = O(T^2)$ and $n = O(NT)$. Both of these failures apply to our empirical setting.

^{1} For $a, b \in \mathbb{R}$, $a \vee b := \max(a, b)$.

^{2} Sufficient conditions are given, for example, by Newey and Windmeijer (2009) for GMM problems with $m \to \infty$ and $p$ fixed; and by Hahn and Newey (2004), Hahn and Kuersteiner (2011) and Fernández-Val and Weidner (2018) for nonlinear panel data models where $p \to \infty$.
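To understand where (4) comes from, let us focus on the exactly identified case where $m = d + p$. An asymptotic second order expansion of $\hat\alpha$ around $\alpha$ gives

$$\hat\alpha - \alpha = Z_n/\sqrt{n} + b\,p/n + r_n,$$

where $Z_n \overset{a}{\sim} N(0, V_{11})$, $b\,p/n$ is a first order bias term coming from the quadratic term of the expansion, and $r_n$ is a higher order remainder such that $r_n = O_P((p/n)^{3/2} + p^{1/2}/n)$. Then, (5) holds if both $\sqrt{n}\, b\, p/n \to 0$ and $\sqrt{n}\, r_n \to_P 0$, i.e.

$$p^2/n \to 0.$$

The sketch above illustrates that the bias is the bottleneck. If we remove the bias somehow, then we can improve the rate requirement (4) to a weaker condition listed below.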
There are several ways of removing the bias:

- Analytical bias correction, where we estimate $b\,p/n$ using analytical expressions for the bias and set $\check\alpha = \hat\alpha - \hat b\,p/n$.

- Split-sample bias correction, where we split the sample into two parts, compute the estimator on the two parts, $\hat\alpha_{(1)}$ and $\hat\alpha_{(2)}$, to obtain $\bar\alpha = (\hat\alpha_{(1)} + \hat\alpha_{(2)})/2$, and then set $\check\alpha = \hat\alpha - (\bar\alpha - \hat\alpha) = 2\hat\alpha - \bar\alpha$. In some cases we can average over many splits to reduce variability.^{3}

^{3} In some cases it is also possible to use the bootstrap and leave-one-out methods for bias correction.

Why does the sample-splitting method work? Assuming that we estimate the same number of nuisance parameters and use the same number of moment conditions in all the parts of the sample, and that these parts are homogeneous, then the first order biases of $\hat\alpha$, $\hat\alpha_{(1)}$, and $\hat\alpha_{(2)}$ are

$$b\,\frac{p}{n}, \qquad b\,\frac{p}{n/2}, \qquad b\,\frac{p}{n/2},$$

so that the first order bias of $\check\alpha$ is

$$2\,b\,\frac{p}{n} - \frac{1}{2}\left(b\,\frac{p}{n/2} + b\,\frac{p}{n/2}\right) = 0.$$

After debiasing, the resulting rate conditions are weaker. In particular, there exist regularity conditions such that if the dimensionality is not overly high:

$$(p \vee m)^{3/2} / n \to 0 \quad \text{as } n \to \infty,$$

then the approximate normality and consistency results for the bias-corrected GMM estimator continue to hold:^{4}

$$\sqrt{n}(\check\alpha - \alpha) \overset{a}{\sim} N(0, V_{11}).$$

^{4} Sufficient conditions are given in Kiviet (1995), Hahn and Kuersteiner (2002) and Chudik, Pesaran and Yang (2018) for dynamic linear panel data models and Fernández-Val and Weidner (2016) and Fernández-Val and Weidner (2018) for nonlinear panel data models.
1.5 The Debiased FE and AB Estimators
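To construct the analytical debiased FE estimator (DFEA), we need to characterize the first order bias. An analysis similar to Nickell (1981) yields that the first order bias $b/T$ obeys:

$$H b = -\frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\sum_{s=t+1}^{T} E[\tilde D_{is}\varepsilon_{it}],$$

for

$$H = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T} E[\tilde D_{it}\tilde D_{it}'],$$

where $\tilde D_{it}$ is the residual of the sample linear projection of $D_{it}$ on $\tilde X_{it}$. Note that $p = N$ because the source of the bias is the estimation of the $N$ unit fixed effects and the order of the bias is $p/n = 1/T$ because there are only $T$ observations that are informative about each unit fixed effect.^{5} An estimator of the bias can be formed as

$$\hat H \hat b = -\frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\sum_{s=t+1}^{\min(t+M,\,T)} \tilde D_{is}\hat\varepsilon_{it}, \qquad \hat H = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\tilde D_{it}\tilde D_{it}',$$

where $\hat\varepsilon_{it}$ is the fixed effect residual, and $M$ is a trimming parameter such that $M \to \infty$ and $M/T \to 0$ as $T \to \infty$ (Hahn and Kuersteiner, 2011).

^{5} There is no bias coming from the estimation of the time fixed effects because the model is linear and we assume independence across $i$.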
To implement debiasing by sample splitting, we need to determine the partition of the data. For the debiased FE estimator via sample splitting (DFESS), we split the panel along the time series dimension because the source of the bias is the estimation of the unit fixed effects. Thus, following Dhaene and Jochmans (2015), the parts contain the observations $\{i = 1, \dots, N;\ t = 1, \dots, \lceil T/2 \rceil\}$ and $\{i = 1, \dots, N;\ t = \lfloor T/2 \rfloor + 1, \dots, T\}$, where $\lceil \cdot \rceil$ and $\lfloor \cdot \rfloor$ are the ceiling and floor functions. This partition preserves the time series structure and delivers two panels with the same number of unit fixed effects, where there are $\lceil T/2 \rceil$ observations that are informative about each unit fixed effect.

For the debiased AB estimator via sample splitting (DABSS), we split the panel along the cross section dimension because the source of the bias is the number of moment conditions relative to the sample size. Thus, the parts contain the observations $\{i = 1, \dots, \lceil N/2 \rceil;\ t = 1, \dots, T\}$ and $\{i = \lfloor N/2 \rfloor + 1, \dots, N;\ t = 1, \dots, T\}$. This partition delivers two panels where the number of observations relative to the number of moment conditions is half of the original panel. Note that there are multiple possible partitions because the ordering of the observations along the cross section dimension is arbitrary. We can therefore average across multiple splits to reduce variability.
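The two partitions can be sketched in code as follows. This is a minimal illustration, assuming a pure AR(1) panel with unit effects only (no treatment, covariates or time effects) and the simple within estimator; the names `half_panel_jackknife` and `cross_section_jackknife` are hypothetical. Each unit's series stores the initial level first, so each half keeps its own lagged value.

```python
import math
import random


def simulate_ar1_panel(n_units, n_periods, rho, seed=0, burn_in=50):
    """Simulate y_it = a_i + rho * y_{i,t-1} + eps_it; n_periods + 1 levels per unit."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        a_i = rng.gauss(0.0, 1.0)
        y = a_i / (1.0 - rho)
        for _ in range(burn_in):
            y = a_i + rho * y + rng.gauss(0.0, 1.0)
        series = [y]
        for _ in range(n_periods):
            y = a_i + rho * y + rng.gauss(0.0, 1.0)
            series.append(y)
        panel.append(series)
    return panel


def fe_ar1(panel):
    """Within (fixed effects) estimator of rho."""
    num = den = 0.0
    for series in panel:
        lagged, current = series[:-1], series[1:]
        mean_lag = sum(lagged) / len(lagged)
        mean_cur = sum(current) / len(current)
        for y_lag, y_cur in zip(lagged, current):
            num += (y_lag - mean_lag) * (y_cur - mean_cur)
            den += (y_lag - mean_lag) ** 2
    return num / den


def half_panel_jackknife(panel, estimator):
    """Split along time: t = 1..ceil(T/2) and t = floor(T/2)+1..T, then 2*full - mean(halves)."""
    n_periods = len(panel[0]) - 1
    first = [s[: math.ceil(n_periods / 2) + 1] for s in panel]   # levels 0..ceil(T/2)
    second = [s[n_periods // 2:] for s in panel]                 # levels floor(T/2)..T
    return 2.0 * estimator(panel) - 0.5 * (estimator(first) + estimator(second))


def cross_section_jackknife(panel, estimator, n_splits=5, seed=0):
    """Split along units after a random reordering; average over n_splits splits."""
    rng = random.Random(seed)
    full = estimator(panel)
    n_units = len(panel)
    corrected = []
    for _ in range(n_splits):
        order = list(range(n_units))
        rng.shuffle(order)
        first = [panel[i] for i in order[: math.ceil(n_units / 2)]]
        second = [panel[i] for i in order[n_units // 2:]]
        corrected.append(2.0 * full - 0.5 * (estimator(first) + estimator(second)))
    return sum(corrected) / n_splits


if __name__ == "__main__":
    panel = simulate_ar1_panel(500, 20, rho=0.5, seed=3)
    print("FE:", round(fe_ar1(panel), 3))
    print("FE, split along time:", round(half_panel_jackknife(panel, fe_ar1), 3))
```

The split dimension has to match the source of the bias. Applying `cross_section_jackknife` to the within estimator leaves its estimate essentially unchanged, because both halves keep the same number of periods per unit; that function is meant for estimators whose bias comes from the number of moment conditions relative to the sample size.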
2 Democracy and Growth
We revisit the application to the causal effect of democracy on economic growth of Acemoglu et al. (forthcoming) using the econometric methods described in Section 1. To keep the analysis simple, we use a balanced subpanel of 147 countries over the period from 1987 through 2009 extracted from the data set used in Acemoglu et al. (forthcoming). The outcome variable $Y_{it}$ is the logarithm of GDP per capita in 2000 USD as measured by the World Bank for country $i$ at year $t$. The treatment variable of interest $D_{it}$ is a democracy indicator constructed in Acemoglu et al. (forthcoming), which combines information from several sources including Freedom House and Polity IV. It characterizes whether countries have free and competitive elections, checks on executive power, and an inclusive political process. We report some descriptive statistics of the variables used in the analysis in the online supplemental Appendix.
We control for unobserved country effects, time effects and rich dynamics of GDP using the linear panel model (1), where $X_{it}$ includes four lags of $Y_{it}$. The weak exogeneity condition (2) implies that democracy and past GDP are orthogonal to contemporaneous and future GDP shocks and that these shocks are serially uncorrelated (since $X_{it}$ includes the lagged values of $Y_{it}$).
In addition to the instantaneous or short-run effect of a transition to democracy on economic growth measured by the coefficient $\alpha$, we are interested in a permanent or long-run dynamic effect. This effect in the dynamic linear panel model (1) is

$$\frac{\alpha}{1 - \sum_{j=1}^{4}\beta_j}, \tag{6}$$

where $\beta_1, \dots, \beta_4$ are the coefficients corresponding to the lags of $Y_{it}$.
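As a worked illustration of the long-run formula and of a delta-method standard error for it, the sketch below uses hypothetical coefficient values and a hypothetical covariance matrix, chosen only so that the arithmetic is easy to follow; they are not estimates from this paper. The function names are likewise assumptions of the example.

```python
import math


def long_run_effect(alpha, betas):
    """Long-run effect alpha / (1 - sum of the lag coefficients)."""
    return alpha / (1.0 - sum(betas))


def long_run_se(alpha, betas, cov):
    """Delta-method standard error of the long-run effect.

    cov is the covariance matrix of (alpha, beta_1, ..., beta_J) as a list of lists.
    The gradient is 1 / (1 - sum(betas)) with respect to alpha and
    alpha / (1 - sum(betas))**2 with respect to each beta_j.
    """
    denom = 1.0 - sum(betas)
    grad = [1.0 / denom] + [alpha / denom ** 2] * len(betas)
    k = len(grad)
    variance = sum(grad[i] * cov[i][j] * grad[j] for i in range(k) for j in range(k))
    return math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    alpha = 0.02
    betas = [1.2, -0.2, -0.05, -0.05]
    sds = [0.01, 0.05, 0.06, 0.04, 0.02]
    cov = [[sds[i] ** 2 if i == j else 0.0 for j in range(5)] for i in range(5)]
    print("long-run effect:", long_run_effect(alpha, betas))
    print("delta-method SE:", long_run_se(alpha, betas, cov))
```

Because the denominator is one minus the sum of the lag coefficients, the long-run effect and its standard error become very sensitive to the estimated persistence when that sum is close to one, which is why small bias corrections to the lag coefficients can move the long-run estimate substantially.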
We consider the FE and one-step AB estimators as well as their debiased versions (DFE and DAB). Indeed, the raw AB and FE fail to satisfy the small bias condition (4): the AB approach relies on $m$ moment conditions to estimate the parameters with $n$ observations, after using the first five periods as initial conditions, so that $m^2/n$ is not close to zero; the FE approach estimates $p$ parameters with $n$ observations, after using the first four periods as initial conditions, yielding a ratio $p^2/n$ that is not close to zero.
To debias the estimators, we consider both analytical and split-sample bias corrections. For the fixed effect approach, DFEA implements the analytical debiasing with a fixed value of the trimming parameter $M$, whereas DFESS implements debiasing by sample splitting. We consider two versions of the debiasing via sample splitting for AB, where DABSS1 uses one random split and DABSS5 uses the average of five random splits.
For each estimator, we report analytical standard errors clustered at the country level and bootstrap standard errors based on resampling countries with replacement. The estimates of the long-run effect are obtained by plugging in estimates of the coefficients in the expression (6). For the long-run effect, we use the delta method to construct analytical standard errors clustered at the country level, and resample countries with replacement to construct bootstrap standard errors. There is no need to recompute the analytical standard errors for debiased estimators, because the ones obtained for the uncorrected estimators remain valid for the bias corrections. We also report bootstrap standard errors for the debiased estimators.

Table 1 presents the empirical results.^{6} FE finds that a transition to democracy increases economic growth by almost 1.9% in the first year and 16% in the long run, while AB finds larger impacts of 4% and 21% but less precisely estimated. We find that the debiasing changes the estimates by a significant amount in both a statistical sense (relative to the standard error) and an economic sense (relative to the uncorrected estimates). The debiased estimators, DFE and DAB, find that a transition to democracy increases economic growth by about 2.3 to 5.2% in the first year, and about 25 to 26% in the long run. Interestingly, the two debiased approaches produce very similar estimates. Moreover, the results coincide with the results obtained using the method of Hahn, Hausman and Kuersteiner (2005), as reported in Acemoglu et al. (forthcoming). We believe that the estimates reported here as well as the later estimates reported in Acemoglu et al. (forthcoming) represent an adequate, state of the art analysis. Of course, it would be interesting to continue to explore other modern, perhaps even more refined, econometric approaches to thoroughly examine the empirical question.

^{6} We obtained the estimates with the commands plm and pgmm of the package plm in R.
We conclude with comments on the standard errors. The analytical standard errors are smaller than the bootstrap standard errors for the split-sample bias corrections. These differences might indicate that the analytical standard errors miss the additional sampling error introduced by the estimation in smaller panels. The analytical correction produces more precise estimates than the split-sample correction.
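The bootstrap standard errors are based on resampling whole countries with replacement, which keeps each country's time series intact. A minimal sketch of that resampling scheme is given below; the estimator used here (a simple grand mean) and the simulated data are placeholders chosen for this example, and `cluster_bootstrap_se` is a hypothetical helper rather than a function of any package mentioned in the text.

```python
import random
import statistics


def cluster_bootstrap_se(panel, estimator, n_boot=500, seed=0):
    """Bootstrap standard error that resamples units (clusters) with replacement.

    panel is a list with one time series per unit; estimator maps a panel to a number.
    """
    rng = random.Random(seed)
    n_units = len(panel)
    draws = []
    for _ in range(n_boot):
        # Draw n_units whole units with replacement; each keeps its full time series.
        resampled = [panel[rng.randrange(n_units)] for _ in range(n_units)]
        draws.append(estimator(resampled))
    return statistics.stdev(draws)


def grand_mean(panel):
    """Placeholder estimator: the mean of all observations."""
    values = [y for series in panel for y in series]
    return sum(values) / len(values)


def simulate_clustered_panel(n_units, n_periods, seed=0):
    """Placeholder data: unit effect plus independent noise, both standard normal."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        effect = rng.gauss(0.0, 1.0)
        panel.append([effect + rng.gauss(0.0, 1.0) for _ in range(n_periods)])
    return panel


if __name__ == "__main__":
    panel = simulate_clustered_panel(200, 5, seed=4)
    print("cluster bootstrap SE:", round(cluster_bootstrap_se(panel, grand_mean), 4))
```

For a debiased estimator, the function passed as `estimator` would include the bias correction itself, so that each bootstrap draw repeats the splitting and re-estimation; this is one way the extra sampling error from estimating on smaller panels can show up in the bootstrap standard errors.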
Table 1

Columns FE, DFEA and DFESS: initial and debiased FE. Columns AB, DABSS1 and DABSS5: initial and debiased AB.

| | FE | DFEA | DFESS | AB | DABSS1 | DABSS5 |
|---|---|---|---|---|---|---|
| Short-run effect of democracy (×100) | 1.89 | 2.27 | 2.44 | 3.94 | 5.22 | 4.53 |
| | (0.65) | | | (1.50) | | |
| | [0.64] | [0.64] | [0.96] | [1.52] | [1.83] | [1.91] |
| 1st lag of log GDP | 1.15 | 1.23 | 1.30 | 1.00 | 0.98 | 1.03 |
| | (0.05) | | | (0.06) | | |
| | [0.05] | [0.05] | [0.08] | [0.06] | [0.07] | [0.08] |
| 2nd lag of log GDP | -0.12 | -0.14 | -0.13 | -0.06 | -0.05 | -0.07 |
| | (0.06) | | | (0.06) | | |
| | [0.05] | [0.05] | [0.08] | [0.06] | [0.07] | [0.07] |
| 3rd lag of log GDP | -0.07 | -0.09 | -0.13 | -0.04 | -0.04 | -0.06 |
| | (0.04) | | | (0.04) | | |
| | [0.04] | [0.04] | [0.06] | [0.04] | [0.04] | [0.04] |
| 4th lag of log GDP | -0.08 | -0.08 | -0.08 | -0.08 | -0.08 | -0.08 |
| | (0.02) | | | (0.03) | | |
| | [0.02] | [0.03] | [0.04] | [0.03] | [0.03] | [0.03] |
| Long-run effect of democracy (×100) | 16.05 | 25.91 | 25.69 | 20.97 | 26.46 | 25.24 |
| | (6.67) | | | (9.51) | | |
| | [6.63] | [9.31] | [12.12] | [9.38] | [10.72] | [11.29] |

Notes: All the specifications include country and year effects. Analytical clustered standard errors at the country level are shown in parentheses. Bootstrap standard errors based on 500 replications are shown in brackets.
References
Acemoglu, Daron, Suresh Naidu, Pascual Restrepo, and James A. Robinson. forthcoming. “Democracy Does Cause Growth.” Journal of Political Economy.
Arellano, Manuel, and Stephen Bond. 1991. “Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations.” The Review of Economic Studies, 58(2): 277–297.
Chudik, Alexander, M. Hashem Pesaran, and Jui-Chung Yang. 2018. “Half-panel jackknife fixed-effects estimation of linear panels with weakly exogenous regressors.” Journal of Applied Econometrics, 33(6): 816–836.
Dhaene, Geert, and Koen Jochmans. 2015. “Split-panel Jackknife Estimation of Fixed-effect Models.” The Review of Economic Studies, 82(3): 991–1030.
Fernández-Val, Iván, and Martin Weidner. 2016. “Individual and time effects in nonlinear panel models with large N, T.” Journal of Econometrics, 192(1): 291–312.
Fernández-Val, Iván, and Martin Weidner. 2018. “Fixed Effects Estimation of Large-T Panel Data Models.” Annual Review of Economics, 10(1): 109–138.
Hahn, Jinyong, and Guido Kuersteiner. 2011. “Bias reduction for dynamic nonlinear panel models with fixed effects.” Econometric Theory, 27(6): 1152–1191.
Hahn, Jinyong, and Guido M. Kuersteiner. 2002. “Asymptotically Unbiased Inference for a Dynamic Panel Model with Fixed Effects When Both N and T Are Large.” Econometrica, 70(4): 1639–1657.
Hahn, Jinyong, and Whitney Newey. 2004. “Jackknife and Analytical Bias Reduction for Nonlinear Panel Models.” Econometrica, 72(4): 1295–1319.
Hahn, Jinyong, Jerry Hausman, and Guido Kuersteiner. 2005. “Bias Corrected Instrumental Variables Estimation for Dynamic Panel Models with Fixed Effects.” Boston University, Department of Economics, Working Papers Series WP2005-024.
Kiviet, Jan F. 1995. “On bias, inconsistency, and efficiency of various estimators in dynamic panel data models.” Journal of Econometrics, 68(1): 53–78.
Newey, Whitney K., and Frank Windmeijer. 2009. “Generalized Method of Moments With Many Weak Moment Conditions.” Econometrica, 77(3): 687–719.
Nickell, Stephen. 1981. “Biases in Dynamic Models with Fixed Effects.” Econometrica, 49(6): 1417–1426.
Appendix
The online supplemental Appendix contains the data, descriptive statistics, and code in R and Stata for the empirical application.
Appendix A Supplemental Appendix
| | Mean | SD | Dem = 1 | Dem = 0 |
|---|---|---|---|---|
| Democracy | 0.62 | 0.49 | 1.00 | 0.00 |
| Log(GDP) | 7.58 | 1.61 | 8.09 | 6.75 |
| Number Obs. | 3,381 | 3,381 | 2,099 | 1,282 |