1 Introduction
The Generalized Method of Moments (GMM) of Hansen & Singleton (1982)
is a powerful estimation framework that does not require the model to be fully specified parametrically. Under regularity conditions, the estimates are consistent and asymptotically normal. In particular, the moment conditions should uniquely identify the finite-dimensional parameters. This is very difficult to verify in practice and, as noted in
Newey & McFadden (1994), is often assumed. Yet, when identification fails or nearly fails, the Central Limit Theorem provides a poor finite-sample approximation to the distribution of the estimates. This has motivated a vast amount of research on tests that are robust to identification failure. As discussed in the literature review, much of this work has focused on tests for the full parameter vector. Potentially conservative confidence intervals for scalar parameters can then be built by projecting confidence sets for the full parameter vector
(Dufour & Taamouti, 2005) or by using a Bonferroni approach (McCloskey, 2017). The contribution of this paper is twofold. First, it introduces a quasi-Jacobian matrix which is singular under both local (first-order) and global identification failure and is informative about the coefficients involved in the failure. This is the main contribution of the paper and provides an approach similar to Cragg & Donald (1993) and Stock & Yogo (2005), but in a non-linear setting. Second, the information from the first step allows for two-step identification-robust subvector inference, akin to type I inference in Andrews & Cheng (2012) but without a priori knowledge of the identification structure.
To detect identification failures, this paper constructs a quasi-Jacobian
matrix, defined as the best linear approximation of the sample moment function over a region of the parameter space where these moments are close to zero, as defined by a bandwidth. To find the best linear approximation, two loss functions are considered: the supremum norm measures the largest difference between the moments and their approximation, while the least-squares criterion focuses on the average difference. The sup-norm approximation yields strong and intuitive results, while the least-squares approximation can be computed easily by OLS, using the moments as the dependent variable.
The asymptotic behaviour of the quasi-Jacobian matrix, computed under these two loss functions, is studied under four identification regimes: strong, semi-strong (also known as nearly-weak identification, Antoine & Renault, 2009), higher-order local, and weak (or set) identification. The GMM estimator is consistent and asymptotically normal in the first two regimes, consistent but not asymptotically normal in the third, and inconsistent in the fourth. Hence, the last two regimes correspond to settings where the finite-sample distribution of the estimator is poorly approximated by standard asymptotics. Under (semi-)strong identification, the quasi-Jacobian matrix is asymptotically equivalent to the usual Jacobian matrix; after rescaling, it is asymptotically non-singular. Under higher-order, weak or set identification, the quasi-Jacobian matrix is asymptotically singular, with eigenvalues vanishing at a rate determined by the bandwidth used in the approximation and the nature of the identification failure. Furthermore, the quasi-Jacobian matrix vanishes on the span of the identification failure, i.e. the directions in which identification fails.
Building on these results, this paper constructs a two-step procedure for testing linear hypotheses on the parameters, of the form:
(1) 
for a given restriction matrix with and . Assuming there is evidence of identification failure, as indicated by a small value of the smallest eigenvalue of the quasi-Jacobian matrix, the two steps used to conduct inference can be summarized as follows (under strong and semi-strong identification, standard inference using the Wald, QLR or LM test will be valid; a lack of evidence for weak and higher-order identification would indicate that these tests can be used):

The first step splits the parameter vector into two sets of parameters: one set of parameters needs to be fixed, given evidence that these might be weakly, set or higher-order identified;
is also fixed to match the null hypothesis (1). The other set of parameters, for which there is no evidence of identification failure, is assumed to follow (semi-)strong asymptotics.
The second step relies on projection inference (see e.g. Scheffé, 1953; Dufour, 1990; Dufour & Taamouti, 2005, 2007) for
and the parameters that need to be fixed, while concentrating out the remaining parameters. The test statistic needs to be robust to identification failure; one can use, for instance, the S, K or CQLR statistics of
Stock & Wright (2000), Kleibergen (2005) and Andrews & Mikusheva (2016b).
Step 2 has previously been discussed in the literature (see e.g. Kleibergen, 2005; Andrews & Mikusheva, 2016b, among others). The main challenge in implementing this step in practice has been determining which nuisance parameters are (semi-)strongly identified when the others are fixed. When such a decomposition is known ex ante and identification strength depends on the value of the (semi-)strongly identified parameters (the term (semi-)strong refers to cases where identification can be either strong or semi-strong), Andrews & Cheng (2012) show how to conduct uniformly valid inference. In this paper, this ex-ante knowledge is not required, since the quasi-Jacobian vanishes on the span of the identification failure. In practice, a cutoff is required to distinguish matrices that are vanishing from those that are not. A rule of thumb, similar to that of Stock & Yogo (2005), is provided to construct this cutoff when detecting weak/set as well as higher-order identification. It relies on a Nagar approximation of the size distortion under semi-strong asymptotics.
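The two-step logic can be summarized in code (a hedged sketch, not the paper's implementation: the function names are illustrative, and `robust_pvalue` stands in for any identification-robust projection test such as S, K or CQLR):

```python
import numpy as np

def two_step_subvector_test(quasi_jacobian, robust_pvalue, order, cutoff):
    """Sketch of the two-step procedure described above (illustrative names).
    Step 1: following a predetermined, nested search order, fix parameters
    until the smallest singular value of the quasi-Jacobian over the free
    block exceeds the cutoff. Step 2: run the identification-robust test,
    projecting over the fixed parameters."""
    free, fixed = list(order), []
    while free and np.linalg.svd(quasi_jacobian[:, free], compute_uv=False)[-1] < cutoff:
        fixed.append(free.pop())  # nested search: fix from the end of `order`
    return fixed, robust_pvalue(fixed)

# toy quasi-Jacobian: parameter 1 is nearly unidentified (tiny column),
# so it gets fixed; the stub p-value stands in for the robust test
J = np.array([[1.0, 0.0], [0.0, 1e-6]])
fixed, pval = two_step_subvector_test(J, lambda f: 0.32, order=[0, 1], cutoff=0.1)
```

Here the near-zero second column drives the smallest singular value below the cutoff, so parameter 1 is fixed and the robust test is applied to the remaining block.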
The two-step approach described above is shown to yield asymptotically valid tests under certain conditions. In particular, it is assumed that the search for the restrictions in the first step is sequential, nested and predetermined. In practice, the researcher fixes an increasing number of coefficients until identification is restored, according to the quasi-Jacobian. This more disciplined approach avoids the difficulties of studying data-driven search procedures, which would complicate the analysis. Sequential procedures fit naturally in settings where some parameters are more credibly identified than others. The search procedure is shown to restore point identification with probability approaching one. If the remaining parameters are (semi-)strongly identified (weak or higher-order identification of these parameters can be detected as above, so this is not particularly restrictive, assuming these are the only other possible identification regimes), then the second step yields valid inference procedures, as discussed in the previous literature. Also, under strong and semi-strong identification, the linear approximation can be used to construct estimates that are asymptotically equivalent to the GMM estimator. This approach effectively replaces the non-smooth/discontinuous moments with smoothed linear moments, making global optimization simple, which may be of practical interest. Finally, the quasi-Jacobian
can be used in the usual sandwich formula when the moments are non-smooth, as in quantile IV and SMM estimation of discrete choice models.
Monte Carlo simulations illustrate the large-sample behaviour of the quasi-Jacobian matrix and the two-step inference procedure in several designs. These include a non-linear least-squares model where the nuisance parameter is not identified. This is similar to simulations in Andrews & Cheng (2012) and Cheng (2015), but without assuming the identification structure is known.
The approach is then applied to two empirical settings. The first application considers the Euler equation in U.S. data, a well-known example where identification is suspected to fail. The methods developed in this paper suggest that the discount rate is (semi-)strongly identified while the risk-aversion parameter is poorly identified, as suggested in Stock & Wright (2000). Some investigation into the source of the identification failure reveals that the moments are highly redundant and amount to a single moment condition; this implies that one should use one of the singularity- and identification-robust tests developed in Andrews & Guggenberger (2019). The second application considers quantile IV estimation of the demand for fish (Chernozhukov et al., 2007). The results suggest weak identification of the price elasticity of demand.
Structure of the Paper
After a review of the literature and an overview of the notation used in the paper, Section 2 introduces the setting, the linear approximations, precise definitions of the identification regimes considered, and the main assumptions used in the paper. Section 3 derives the asymptotic behaviour of the quasi-Jacobian matrix. Section 4 describes the two-step inference procedures in more detail, including the algorithms used to determine which parameters to fix, the rules of thumb for choosing the cutoffs, and the asymptotic results for the inference procedures. Section 5 provides a Monte Carlo example to illustrate some of the results from the previous sections. An empirical example is provided in Section 6. Section 7 concludes. Appendices A and B provide the proofs for the main results of Sections 3 and 4, respectively. The Supplement consists of Appendices C, D, E, F, G and H, which provide additional and preliminary results for the main text and their proofs, as well as additional Monte Carlo and empirical results.
Related Literature
The literature on the identification of economic models is quite vast; an extensive review is given in Lewbel (2018). Within this literature, this paper mainly relates to three topics: local and global identification of finite-dimensional parameters in the population, detection of identification failure in finite samples, and identification-robust inference.
Koopmans & Reiersol (1950) provide one of the earliest general formulations of the identification problem at the population level. To paraphrase the authors, the main problem is to determine whether the distribution of the data, assumed to be generated from a given class of models, is consistent with one, and only one, set of structural parameters. In the likelihood setting, Fisher (1967) and Rothenberg (1971) give sufficient conditions for local and global identification of the structural parameters as the unique solution to a non-linear system of equations; these include the well-known rank condition and strict convexity. For GMM, Komunjer (2012) introduced weaker conditions for global identification. In the present paper, singularity of the quasi-Jacobian will appear when either global or local identification fails, for a large class of moment conditions.
In linear models, global identification amounts to a rank condition on the slope of the moments. This insight was used to construct several pretesting procedures for identification failure in linear IV models (Cragg & Donald, 1993; Stock & Yogo, 2005). Pretests based on the null of strong identification were given by Hahn & Hausman (2002) in linear IV and by Inoue & Rossi (2011) and Bravo et al. (2012) in non-linear models. Note that pretesting for strong identification in the first step can be problematic for two-step inference procedures when power is low in the first step. For non-linear models, Wright (2003) tests the local identification condition with a rank test at every point of a robust confidence set. Antoine & Renault (2017) rely on a distorted J-statistic to detect local identification failure. Arellano et al. (2012) develop a test for underidentification when a single coefficient is unidentified. In this paper, identification strength is summarized by the smallest eigenvalue of the quasi-Jacobian matrix under weak and set identification, which is both convenient and easy to communicate. Residual curvature also matters when pretesting for higher-order identification, as discussed in Section 4.2.2.
Given the impact of (near) identification failure on standard inference (see e.g. Choi & Phillips, 1992; Dufour, 1997; Staiger & Stock, 1997 in the case of IV regression), a large body of literature has developed identification-robust tests. Most consider inference on the full parameter vector (see e.g. Anderson & Rubin, 1949; Stock & Wright, 2000; Moreira, 2003; Kleibergen, 2005; Andrews & Mikusheva, 2016b; Chen et al., 2018). Few consider the topological features of the identified set to conduct inference, with the notable exception of Andrews & Mikusheva (2016a). For subvector inference, a common approach is to construct a confidence set for the full vector and project it on the dimension of interest (Dufour & Taamouti, 2005, 2007) or to use a Bonferroni correction (McCloskey, 2017). These methods might be conservative; however, as discussed in Section 4, Remark 2, when the nuisance parameters are completely unidentified, projection inference may actually have exact asymptotic coverage. To increase power, one can concentrate out nuisance parameters that are known to be strongly identified. A series of papers starting with Andrews & Cheng (2012), including Andrews & Cheng (2013, 2014), Cheng (2015), Han & McCloskey (2019) and Cox (2017), considers uniformly valid subvector inference in a class of models where the identification structure is known and identification strength is driven by some (semi-)strongly identified coefficients. As discussed in Andrews & Mikusheva (2016b), computing the least favorable distribution required for their uniform (type II) inference may be numerically challenging or infeasible in some settings. Under higher-order local identification, the estimates are consistent but have a non-standard limiting distribution (Rotnitzky et al., 2000; Dovonon & Hall, 2018).
This issue is known but much less studied than weak and set identification. (For instance, van der Vaart (1998), when discussing higher-order Taylor expansions in Chapter 3.3, argues that "it is necessary to determine carefully the rate of all terms in the expansion [...] before neglecting the 'remainder'.") Dovonon et al. (2019) study the properties of identification-robust tests in second-order identified models. Lee & Liao (2018) show how to conduct standard inference in second-order identified models with a known identification structure.
Notation
For any matrix (or vector) , is the Frobenius (Euclidean) norm of . For any rectangular matrix
, the singular value
refers to the eigenvalue of . refer to the largest and smallest value of , respectively. With some abuse of notation, these singular values will be referred to as eigenvalues. For a weighting matrix , the norm is computed as . For any two positive sequences , ; ; . For a sequence of random variables and
positive sequence, ; .

2 Setting and Assumptions
Following Hansen & Singleton (1982), the econometrician wants to estimate the solution vector to the system of unconditional moment equations:
(2) 
where , a compact subset of , . , is a sample of iid or stationary random variables. Throughout, it is assumed that at least one such exists (this can be achieved in misspecified models by recentering the moments: where ). The population moments are allowed to depend on , as in Stock & Wright (2000). is assumed to be continuously differentiable on .
Given the sample moments and a sequence of positive definite weighting matrices , the GMM estimator solves the minimization problem:
where .
2.1 Linear Approximations and the quasi-Jacobian Matrix
The quasiJacobian matrix is defined below as the slope of a local linear approximation under a given norm.
Definition 1.
(Sup-Norm and Least-Squares Approximations) Let be a kernel function and a bandwidth. The sup-norm approximation solves:
(3) 
where . The least-squares approximation solves:
(4) 
where . The quasi-Jacobian refers to the computed using either the least-squares (LS) or sup-norm () approximation.
The sup-norm approximation solves a non-smooth optimization problem and is thus more computationally demanding. However, the theory for is very intuitive, and it will be quite useful for understanding the relation between the quasi-Jacobian and identification failure. In practice, it will be more convenient to compute the least-squares approximation:
The two integrals can be approximated using Monte Carlo methods such as importance sampling, Markov chain Monte Carlo and sequential Monte Carlo methods
(Robert & Casella, 2004). In this paper, quasi-Monte Carlo integration with the low-discrepancy Sobol sequence was used and provided satisfactory results; see e.g. Owen (2003); Lemieux (2009) for an overview of quasi-Monte Carlo integration. Implementation is straightforward: the Sobol sequence provides a grid for over which and are evaluated. One then simply regresses the evaluated moments on the grid points and an intercept using weighted least squares with as weights. If has compact support, one can omit all grid points with from the regression. The quasi-Jacobian
collects the slope coefficients in this weighted linear regression.
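The weighted regression just described can be sketched in a few lines (a minimal sketch, not the paper's code: the Gaussian kernel, the plain uniform grid standing in for the Sobol sequence, and the toy linear moment are all illustrative choices):

```python
import numpy as np

def quasi_jacobian_ls(gbar, theta_grid, bandwidth):
    """Least-squares quasi-Jacobian: weighted OLS of the sample moments on the
    grid points and an intercept, with kernel weights K(||gbar(theta)||/h).
    A Gaussian kernel is used here for illustration."""
    G = np.array([gbar(t) for t in theta_grid])            # n_grid x k moments
    w = np.exp(-(np.linalg.norm(G, axis=1) / bandwidth) ** 2 / 2)
    X = np.column_stack([np.ones(len(theta_grid)), theta_grid])
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(sw * X, sw * G, rcond=None)
    return coef[1:].T                                      # k x d slope matrix

# toy check: for exactly linear moments gbar(theta) = A @ theta - b the
# approximation is exact, so the quasi-Jacobian recovers A for any bandwidth
rng = np.random.default_rng(0)
A, b = np.array([[1.0, 0.5], [0.0, 0.2]]), np.array([0.3, 0.1])
grid = rng.uniform(-1.0, 1.0, size=(512, 2))
B = quasi_jacobian_ls(lambda t: A @ t - b, grid, bandwidth=0.5)
```

In the linear toy case the weighted regression fits the moments exactly, so the recovered slope matrix coincides with the true Jacobian.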
The theory for , while similar to that for , involves additional topological arguments and the convergence of a quasi-posterior under higher-order, weak and set identification, making the intuition somewhat more difficult to convey.
For linear models such as OLS or linear IV, the approximation is exact, and one would find and . The quasi-Jacobian is close to singular when the regressors are nearly multicollinear in OLS or when the instruments are not sufficiently relevant in IV. The rank of is thus informative about identification failure in these models. This intuition extends to non-linear models.
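The linear IV case can be illustrated with a small simulation (a hedged sketch with made-up first-stage strengths; since the moments are exactly linear in the parameter, the quasi-Jacobian coincides with the slope matrix):

```python
import numpy as np

# In linear IV the moments gbar(theta) = Z'(y - X theta)/n are exactly linear
# in theta, so the quasi-Jacobian equals the slope -Z'X/n for any bandwidth;
# its smallest singular value is small exactly when the instruments are weak.
rng = np.random.default_rng(0)
n = 5000

def min_singular_value(pi):
    z = rng.normal(size=(n, 2))              # two instruments
    x = z @ pi + rng.normal(size=(n, 1))     # first stage with strength pi
    return np.linalg.svd(z.T @ x / n, compute_uv=False)[-1]

strong = min_singular_value(np.array([[1.0], [0.5]]))
weak = min_singular_value(np.array([[0.01], [0.0]]))
```

With a strong first stage the smallest singular value is bounded away from zero, while with a nearly irrelevant instrument it is close to zero, mirroring the near-singularity of the quasi-Jacobian.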
The following gives a heuristic description of the behaviour of the
quasi-Jacobian when identification holds or fails. Formal results will be provided in the next section. First, note that the kernel and bandwidth play a very important role here, as they select all potential solutions to the moment condition (2). When the moment equations have a unique solution , then holds only in small neighborhoods of with high probability. If, in addition, is smooth, then the discrepancy becomes:
so that is a smoothed approximation of the usual Jacobian matrix.
In locally point identified models, the Jacobian and quasi-Jacobian will have full rank. Local, or first-order, identification failure appears when is singular. implies that the eigenvalues of the quasi-Jacobian are informative about local identification failure.
When the model is set identified there are, by definition, at least two solutions to the moment equations (2). The linear approximation implies that:
For small this implies:
Given that , this implies that the quasi-Jacobian must be close to singular in large samples. Both and will determine how close to singular it is. Overall, both local and global identification failures imply near-singularity of the quasi-Jacobian in large samples.
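The near-singularity under set identification can be checked numerically with a toy scalar moment that is not from the paper: it vanishes at two points with slopes of opposite sign, so the best single line through both near-zero regions is nearly flat, and its slope shrinks with the bandwidth:

```python
import numpy as np

# Set-identified toy: the scalar "moment" g(t) = t (t - 1) (t + 2) vanishes at
# both t = 0 and t = 1 inside the search region, with local slopes -2 and 3.
# The least-squares quasi-Jacobian (a scalar slope here) must compromise
# between the two near-zero regions and shrinks toward zero with the bandwidth.
def ls_slope(h):
    t = np.linspace(-0.5, 1.5, 4001)
    g = t * (t - 1.0) * (t + 2.0)
    w = np.exp(-((g / h) ** 2) / 2)              # Gaussian kernel weights
    X = np.column_stack([np.ones_like(t), t])
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(sw * X, sw[:, 0] * g, rcond=None)
    return abs(coef[1])

slopes = [ls_slope(h) for h in (0.5, 0.1, 0.02)]  # shrinks toward zero
```

The true local Jacobian at either solution has magnitude 2 or 3, yet the fitted slope is an order of magnitude smaller and vanishes as the bandwidth shrinks, which is the near-singularity described above.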
2.2 Identification Regimes
The following describes the four identification regimes considered in this paper. Their implications for the GMM estimator are summarized in Table 1. Examples 1 and 2 illustrate the definitions.
Identification Regime  consistent?  Rate of convergence  Limiting distribution 

Strong  Yes    Gaussian  
Semi-Strong  Yes  slower than   Gaussian 
Higher-Order  Yes   or slower  non-Gaussian 
Weak or Set  No    non-Gaussian 
Example 1 (Non-Linear Least-Squares).
Example 2 (Possibly Non-invertible MA(1) Model).
This example is adapted from Gospodinov & Ng (2015). Consider the MA(1) model:
where is iid with mean , variance
, and skewness
known. Using the moments and only identifies when . Assuming invertibility () restores point identification. Gospodinov & Ng (2015) show that when , the additional information provided by allows one to identify in the population without imposing invertibility.
Definition 2.
(Point Identification) The model is point identified if such that , :
(5) 
where , is a non-stochastic positive semi-definite weighting matrix.
Definition 2 corresponds to the case where is unique and thus globally identified. Additional regularity conditions combined with this assumption imply that is consistent for (see e.g. Newey & McFadden, 1994, Theorem 2.6).
Definition 3.
(Strong Identification) The model is strongly identified if it is point identified and and such that implies:
(6) 
Definition 3 is satisfied when the Jacobian has full rank, its smallest eigenvalue is bounded below and, around . With additional regularity conditions, it implies that is asymptotically Gaussian (see e.g. Newey & McFadden, 1994, Theorem 3.2). Standard inference using the Wald, QLR and LM tests is asymptotically valid.
Example 1 (Continued).
The Jacobian of the moments evaluated at implies the following:
Note that is the only eigenvalue of the matrix on the left-hand side, which implies that . bounded away from zero implies that the eigenvalues of are bounded below as well.
Example 2 (Continued).
The estimating moments are given by:
Suppose that is bounded away from , and is bounded away from . Point identification holds since: unless , or . It can also be shown that the eigenvalues of the Jacobian are bounded below when is bounded away from zero.
Definition 4.
(SemiStrong Identification) The model is semistrongly identified if it is point identified and

, such that implies:
(7) 
,

for any such that :

for any :
Definition 4 ii. implies that the Jacobian can be vanishing in one or several directions, but not too fast. When , conditions iii. and iv. also imply that the second-order term is vanishing. As a result, the moments remain approximately linear around , as in Definition 3. Together with additional regularity conditions, this implies that, after rescaling, will be asymptotically Gaussian. However, the convergence is slower than the usual rate (Antoine & Renault, 2009; Andrews & Cheng, 2012). Standard inference using the Wald, QLR and LM tests is asymptotically valid.
Example 1 (Continued).
Consider the drifting sequence with and : if and .
Definition 5.
(Higher-Order Local Identification) The model is locally identified at a higher order if it is point identified and , for together with projection matrices satisfying , when such that implies:
(8) 
Definition 5 corresponds to cases where the moments are not approximately linear around . As a result, the first higher-order terms affect the limiting distribution of the GMM estimator . Together with additional regularity conditions, this assumption implies that some components of converge at a rate to a non-Gaussian limiting distribution. The Wald, QLR and LM statistics have non-standard limiting distributions (see e.g. Rotnitzky et al., 2000; Dovonon & Hall, 2018); standard inference is not asymptotically valid.
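A minimal simulated illustration of the slower rate, using a toy second-order identified moment rather than any model from the paper:

```python
import numpy as np

# Toy second-order identified moment (illustrative, not from the paper):
# E[x] = theta^2 with theta0 = 0. Since the moment's first derivative vanishes
# at theta0, the GMM estimator theta_hat = sqrt(max(xbar, 0)) converges at
# rate n^{-1/4} with a non-Gaussian limit.
rng = np.random.default_rng(1)

def rmse(n, reps=20000):
    # the sample mean of n standard normals is exactly N(0, 1/n)
    xbar = rng.normal(0.0, 1.0 / np.sqrt(n), size=reps)
    theta_hat = np.sqrt(np.clip(xbar, 0.0, None))
    return np.sqrt(np.mean(theta_hat ** 2))

ratio = rmse(100) / rmse(10000)  # near (10000/100)**0.25, i.e. about 3.16
```

A hundredfold increase in the sample size shrinks the RMSE by roughly a factor of 100^{1/4}, not 100^{1/2}, consistent with the n^{-1/4} rate discussed above.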
Example 2 (Continued).
Suppose that and . Condition iii.a. holds since there is a unique solution and the moments are continuous. Omitting the third moment condition, the Jacobian becomes:
which is singular and implies first-order identification failure of the model. Taking the derivative again:
Note that
is the eigenvector which spans the null space of the Jacobian, and both second-order derivatives are non-singular on the span of
, which implies second-order identification (see Dovonon & Hall, 2018). Indeed, consider the parametrization ; then for :
The conditions are then satisfied by taking small enough. More generally, Gospodinov & Ng (2015) show that first-order identification generally fails when and .
Definition 6.
(Weak and Set Identification) The model is said to be weakly or set identified if there exist at least two in the weakly identified set:
(9) 
Definition 6 covers cases where global identification fails or nearly fails. Under strong, semi-strong and higher-order identification, a robust and conservative confidence set would concentrate around a single point . Definition 6 collects all models where this phenomenon does not occur. The GMM estimator is typically not consistent (Staiger & Stock, 1997; Stock & Wright, 2000; Andrews & Cheng, 2012) and has a non-Gaussian limiting distribution. Standard inference using the Wald, QLR and LM tests is not asymptotically valid.
Definition 6 nests the definition of Stock & Wright (2000) who consider a drifting sequence of moments:
where , are two functions satisfying Definition 3, for instance. They show that both and are inconsistent even though is consistent for a fixed . Definition 6 also nests the setting of Andrews & Cheng (2012), where the identification strength of is determined by a drifting sequence of a (semi-)strongly identified scalar coefficient .
Example 1 (Continued).
Consider the drifting sequence . Take for any , then . As a result .
Example 2 (Continued).
Consider the drifting sequence of moments . The moment conditions become:
This implies that . As a result, is not a singleton when and .
2.3 Main Assumptions
The following provides the main assumptions on the moments , weighting matrix , kernel and bandwidth to derive the results in Section 3 for .
Assumption 1.
(Bandwidth, Kernel)

(Bandwidth) . , and as ,

(Compact Kernel) is Lipschitz-continuous on with for , for ,

(Exponential Kernel) is exponential in , i.e. such that , . Define and assume , as .
In line with the heuristic discussion above, the bandwidth is assumed to be small. Condition i. ensures that it converges to at a slower than rate, but faster than a rate. When , would also capture second-order non-linearities under (semi-)strong identification. When , a Law of the Iterated Logarithm can be invoked to set:
(10) 
so that (see also Andrews & Soares, 2010; Andrews & Cheng, 2012 for choices of such sequences). In smaller samples, one can also set where is a (e.g. ) quantile of a distribution (recall that ). Two types of kernels are considered. Compact kernels (condition ii.) are used in both the sup-norm and least-squares approximations. The Lipschitz-continuity condition simplifies some of the proofs in Section 3, but numerical experiments showed almost no numerical difference relative to the uniform kernel . Exponential kernels are considered only for the least-squares approximation. A simple example is , the Gaussian density, which provides a quasi-Bayesian interpretation of , as discussed in Section 3.2. Again, there were only negligible numerical differences between the quasi-Jacobian computed with the compact and the exponential kernels in the examples considered in this paper.
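The two bandwidth choices mentioned above can be sketched as follows. The exact constants of rule (10) are not reproduced here, so both forms below are assumptions for illustration only: a law-of-the-iterated-logarithm rate and a small-sample choice based on a chi-square quantile:

```python
import math
from scipy.stats import chi2

def bandwidth_lil(n):
    """Assumed LIL-type form: slower than root-n but still shrinking to zero.
    The paper's rule (10) may differ by constants; this is only illustrative."""
    return math.sqrt(2.0 * math.log(math.log(n)) / n)

def bandwidth_chi2(n, k, q=0.95):
    """Assumed small-sample form using a chi-square quantile with k degrees of
    freedom (k = number of moments); again illustrative only."""
    return math.sqrt(chi2.ppf(q, df=k) / n)
```

Both sequences vanish as the sample size grows, but more slowly than the root-n rate, in line with condition i. of Assumption 1.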
Assumption 2.
(Sample Moments, Weighting Matrix)

(Uniform CLT, Tightness) the empirical process converges weakly to a Gaussian process, as ; ,

(Discoverability of ) the weakly identified set satisfies:

(Stochastic Equicontinuity) uniformly in ,

(Smoothness) is continuously differentiable on ; uniformly in ,

(Weighting Matrix) , is Lipschitz continuous in , such that , .
The high-level conditions in Assumption 2 are quite common in GMM estimation. Condition i. allows for non-smooth and possibly discontinuous sample moments, as in quantile IV (Chernozhukov & Hansen, 2005) or SMM estimation (Pakes & Pollard, 1989). For primitive conditions, see van der Vaart & Wellner (1996) for iid data and Dedecker & Louhichi (2002) for strictly stationary time-series data. Condition ii. ensures that the weakly identified set can be conservatively estimated using , so that all directions of the identification failure can be detected (for just-identified models and the exponential kernel, one can use instead). For the exponential kernel, one can see that . Since is exponential, only appears as a multiplicative constant in , which cancels out in
. A similar argument appears in the proof of the Bernstein-von Mises Theorem in Bayesian statistics
(van der Vaart, 1998; Chernozhukov & Hong, 2003). Condition iii. is the usual stochastic equicontinuity condition (Andrews, 1994). Condition iv. is only required under strong identification; Definition 4 provides stronger conditions to control the higher-order terms, and it is not required under higher-order and weak identification. Condition v. is automatically satisfied for , the identity matrix, but also for the optimal weighting matrix
under uniform consistency for , for iid data or the HAC estimator for time-series data, with additional conditions on its eigenvalues as well as Lipschitz continuity. Given the generality of the high-level assumptions, the results accommodate models where a (semi-)strongly identified nuisance parameter is concentrated out:
The results could be further extended to well-identified infinite-dimensional nuisance parameters ; this is left to future research.
3 Asymptotic Behaviour of the Linear Approximations
This section derives the asymptotic behaviour of the pair under strong and semi-strong identification and characterizes the behaviour of the quasi-Jacobian under higher-order and weak/set identification. The sup-norm and least-squares approximations are treated separately. Table 2 summarizes the results. At the population level, the results imply (by taking and ) that the quasi-Jacobian is the usual Jacobian for first-order globally identified models and is singular under either local or global identification failure. This provides a simple characterization of first-order and global identification failure for GMM in the population.
Identification Regime  Asymptotics for  Asymptotics for 

Strong  Gaussian  Jacobian 
Semi-Strong  Gaussian  Jacobian 
Higher-Order  Non-Gaussian  bandwidth 
Weak or Set  Non-Gaussian  bandwidth 
Note: The results in the two bottom rows hold over all directions in which the model is weakly identified, i.e. in Definition 6, and all directions in which the moments are locally polynomial of order in Definition 5.
3.1 Sup-norm approximation
Theorem 1.
Theorem 1 shows that the sup-norm approximation is asymptotically equivalent to the usual expansion . This implies that
is a consistent estimator for the sandwich formula when computing standard errors. This can be particularly useful when
is non-smooth or discontinuous.
Theorem 2.
Under semi-strong identification, is asymptotically Gaussian. The rate of convergence of each coefficient depends on the eigenvalues of , i.e. the singular values of (consider the singular value decomposition
where is the diagonal matrix of singular values; then , which implies that is a singular value with multiplicity ), and on its eigenvectors. In practice, the standard errors adjust for the rate of convergence automatically, similarly to series and sieve inference (Pagan & Ullah, 1999; Chen & Pouzo, 2015), so that the usual statistic is asymptotically Gaussian. And again, can be used in the sandwich formula to compute standard errors. The scaled convergence of in Theorem 2 has implications for the convergence of the spectral decomposition of . Indeed, let be the th right singular vector of , which is also an orthogonal eigenvector of