 # Supplementary Material for "Estimation of a Multiplicative Correlation Structure in the Large Dimensional Case"


## 8 Supplementary Material

This section contains supplementary materials to the main article. SM 8.1 contains additional materials related to the Kronecker product (models). SM 8.2 gives a lemma characterising a rate for $\|\hat{V}_T - V\|_\infty$, which is used in the proofs of the limiting distributions of our estimators. SM 8.3, SM 8.4, and SM 8.5 provide proofs of Theorem LABEL:thm_asymptotic_normality_MD_when_D_is_unknown, Theorem LABEL:prop_Haihan_score_functions_and_second_derivatives, and Theorem LABEL:thm_one_step_estimator_asymptotic_normality, respectively. SM 8.6 gives proofs of Theorem LABEL:thm_overidentification_test_fixed_dim and Corollary LABEL:coro_diagonal_asymptotics. SM 8.7 contains miscellaneous results.

### 8.1 Additional Materials Related to the Kronecker Product

The following lemma proves a property of Kronecker products.

###### Lemma 8.1.

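Suppose $v = 2, 3, \ldots$ and that $A_1, A_2, \ldots, A_v$ are real symmetric and positive definite matrices of sizes $a_1 \times a_1, \ldots, a_v \times a_v$, respectively. Then

$$
\log(A_1 \otimes A_2 \otimes \cdots \otimes A_v) = \log A_1 \otimes I_{a_2} \otimes \cdots \otimes I_{a_v} + I_{a_1} \otimes \log A_2 \otimes I_{a_3} \otimes \cdots \otimes I_{a_v} + \cdots + I_{a_1} \otimes I_{a_2} \otimes \cdots \otimes \log A_v.
$$
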
###### Proof.

We prove the lemma by mathematical induction. We first give a proof for $v = 2$; that is,

$$
\log(A_1 \otimes A_2) = \log A_1 \otimes I_{a_2} + I_{a_1} \otimes \log A_2.
$$

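Since $A_1, A_2$ are real symmetric, they can be orthogonally diagonalized: for $i = 1, 2$, $A_i = U_i^{\intercal} \Lambda_i U_i$, where $U_i$ is orthogonal and $\Lambda_i = \operatorname{diag}(\lambda_{i,1}, \ldots, \lambda_{i,a_i})$ is a diagonal matrix containing the eigenvalues of $A_i$. Positive definiteness of $A_1, A_2$ ensures that their Kronecker product is positive definite. Then the logarithm of $A_1 \otimes A_2$ is:

$$
\log(A_1 \otimes A_2) = \log\left[(U_1 \otimes U_2)^{\intercal} (\Lambda_1 \otimes \Lambda_2) (U_1 \otimes U_2)\right] = (U_1 \otimes U_2)^{\intercal} \log(\Lambda_1 \otimes \Lambda_2) (U_1 \otimes U_2), \tag{8.1}
$$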

where the first equality is due to the mixed product property of the Kronecker product, and the second equality is due to a property of matrix functions. Next,

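$$
\begin{aligned}
\log(\Lambda_1 \otimes \Lambda_2)
&= \operatorname{diag}\big(\log(\lambda_{1,1} \Lambda_2), \ldots, \log(\lambda_{1,a_1} \Lambda_2)\big) \\
&= \operatorname{diag}\big(\log(\lambda_{1,1} I_{a_2} \Lambda_2), \ldots, \log(\lambda_{1,a_1} I_{a_2} \Lambda_2)\big) \\
&= \operatorname{diag}\big(\log(\lambda_{1,1} I_{a_2}) + \log(\Lambda_2), \ldots, \log(\lambda_{1,a_1} I_{a_2}) + \log(\Lambda_2)\big) \\
&= \operatorname{diag}\big(\log(\lambda_{1,1} I_{a_2}), \ldots, \log(\lambda_{1,a_1} I_{a_2})\big) + \operatorname{diag}\big(\log(\Lambda_2), \ldots, \log(\Lambda_2)\big) \\
&= \log(\Lambda_1) \otimes I_{a_2} + I_{a_1} \otimes \log(\Lambda_2),
\end{aligned} \tag{8.2}
$$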

where the third equality holds only because $\lambda_{1,j} I_{a_2}$ and $\Lambda_2$ have real positive eigenvalues only and commute for all $j = 1, \ldots, a_1$ (Higham (2008) p270 Theorem 11.3). Substitute (8.2) into (8.1):

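$$
\begin{aligned}
\log(A_1 \otimes A_2)
&= (U_1 \otimes U_2)^{\intercal} \log(\Lambda_1 \otimes \Lambda_2) (U_1 \otimes U_2) \\
&= (U_1 \otimes U_2)^{\intercal} (\log \Lambda_1 \otimes I_{a_2} + I_{a_1} \otimes \log \Lambda_2) (U_1 \otimes U_2) \\
&= (U_1 \otimes U_2)^{\intercal} (\log \Lambda_1 \otimes I_{a_2}) (U_1 \otimes U_2) + (U_1 \otimes U_2)^{\intercal} (I_{a_1} \otimes \log \Lambda_2) (U_1 \otimes U_2) \\
&= \log A_1 \otimes I_{a_2} + I_{a_1} \otimes \log A_2.
\end{aligned}
$$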

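We now assume that the lemma is true for $v = k$. That is,

$$
\log(A_1 \otimes A_2 \otimes \cdots \otimes A_k) = \log A_1 \otimes I_{a_2} \otimes \cdots \otimes I_{a_k} + I_{a_1} \otimes \log A_2 \otimes I_{a_3} \otimes \cdots \otimes I_{a_k} + \cdots + I_{a_1} \otimes I_{a_2} \otimes \cdots \otimes \log A_k. \tag{8.3}
$$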

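We prove that the lemma holds for $v = k + 1$. Let $A_{1-k} := A_1 \otimes \cdots \otimes A_k$ and $I_{a_1 \cdots a_k} := I_{a_1} \otimes \cdots \otimes I_{a_k}$.

$$
\begin{aligned}
\log(A_1 \otimes A_2 \otimes \cdots \otimes A_k \otimes A_{k+1})
&= \log(A_{1-k} \otimes A_{k+1}) \\
&= \log A_{1-k} \otimes I_{a_{k+1}} + I_{a_1 \cdots a_k} \otimes \log A_{k+1} \\
&= \log A_1 \otimes I_{a_2} \otimes \cdots \otimes I_{a_k} \otimes I_{a_{k+1}} + I_{a_1} \otimes \log A_2 \otimes I_{a_3} \otimes \cdots \otimes I_{a_k} \otimes I_{a_{k+1}} + \cdots \\
&\quad + I_{a_1} \otimes I_{a_2} \otimes \cdots \otimes \log A_k \otimes I_{a_{k+1}} + I_{a_1} \otimes \cdots \otimes I_{a_k} \otimes \log A_{k+1},
\end{aligned}
$$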

where the third equality is due to (8.3). Thus the lemma holds for $v = k + 1$. By induction, the lemma is true for $v = 2, 3, \ldots$. ∎
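As a numerical sanity check of Lemma 8.1, the following sketch compares both sides of the identity for three randomly generated symmetric positive definite matrices. The helper names (`sym_logm`, `random_spd`, `kron_log_sum`) and the matrix sizes are illustrative assumptions, not part of the original material; the matrix logarithm is computed through the eigendecomposition, which is valid for symmetric positive definite inputs.

```python
import numpy as np
from functools import reduce


def sym_logm(a):
    """Matrix logarithm of a symmetric positive definite matrix via eigh."""
    eigvals, eigvecs = np.linalg.eigh(a)
    # V diag(log lambda) V^T; broadcasting scales the columns of V.
    return (eigvecs * np.log(eigvals)) @ eigvecs.T


def random_spd(size, rng):
    """Random symmetric positive definite matrix (hypothetical test input)."""
    m = rng.standard_normal((size, size))
    return m @ m.T + size * np.eye(size)


def kron_all(mats):
    """Kronecker product A_1 (x) A_2 (x) ... (x) A_v."""
    return reduce(np.kron, mats)


def kron_log_sum(mats):
    """Right-hand side of Lemma 8.1: sum_j I (x) ... (x) log A_j (x) ... (x) I."""
    total = 0.0
    for j, a in enumerate(mats):
        factors = [np.eye(m.shape[0]) for m in mats]
        factors[j] = sym_logm(a)
        total = total + kron_all(factors)
    return total


rng = np.random.default_rng(0)
mats = [random_spd(size, rng) for size in (2, 3, 2)]  # sizes a_1, a_2, a_3

lhs = sym_logm(kron_all(mats))   # log(A_1 (x) A_2 (x) A_3)
rhs = kron_log_sum(mats)         # sum of Kronecker terms
max_abs_diff = np.max(np.abs(lhs - rhs))
print("max |lhs - rhs| =", max_abs_diff)
```

The two sides should agree up to floating-point error; the check does not replace the proof, it only illustrates the statement for one draw.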

Next we provide two examples to illustrate the necessity of an identification restriction in order to separately identify log parameters.

###### Example 8.1.

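Suppose that $\Theta^* = \Theta_1^* \otimes \Theta_2^*$ with $2 \times 2$ factors. We have

$$
\log \Theta_1^* = \begin{pmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{pmatrix}, \qquad
\log \Theta_2^* = \begin{pmatrix} b_{11} & b_{12} \\ b_{12} & b_{22} \end{pmatrix}.
$$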

Then we can calculate

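$$
\log \Theta^* = \log \Theta_1^* \otimes I_2 + I_2 \otimes \log \Theta_2^* =
\begin{pmatrix}
a_{11} + b_{11} & b_{12} & a_{12} & 0 \\
b_{12} & a_{11} + b_{22} & 0 & a_{12} \\
a_{12} & 0 & a_{22} + b_{11} & b_{12} \\
0 & a_{12} & b_{12} & a_{22} + b_{22}
\end{pmatrix}.
$$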

Log parameters $a_{12}, b_{12}$ can be separately identified from the off-diagonal entries of $\log \Theta^*$ because they appear separately. We now examine whether log parameters $a_{11}, a_{22}, b_{11}, b_{22}$ can be separately identified from the diagonal entries of $\log \Theta^*$. The answer is no. We have the following linear system

$$
Ax := \begin{pmatrix}
1 & 0 & 1 & 0 \\
1 & 0 & 0 & 1 \\
0 & 1 & 1 & 0 \\
0 & 1 & 0 & 1
\end{pmatrix}
\begin{pmatrix} a_{11} \\ a_{22} \\ b_{11} \\ b_{22} \end{pmatrix}
=
\begin{pmatrix} [\log \Theta^*]_{11} \\ [\log \Theta^*]_{22} \\ [\log \Theta^*]_{33} \\ [\log \Theta^*]_{44} \end{pmatrix}
=: d.
$$

Note that the rank of $A$ is 3. There are three effective equations and four unknowns; the linear system has infinitely many solutions for $x$. Hence one identification restriction is needed to separately identify log parameters $a_{11}, a_{22}, b_{11}, b_{22}$. We choose to set .
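The rank deficiency in Example 8.1 can be confirmed numerically. The sketch below builds the $4 \times 4$ coefficient matrix from the example, computes its rank, and shows that shifting both diagonal parameters of the first factor up by a constant while shifting those of the second factor down by the same constant leaves the right-hand side unchanged. The numerical values of `x` and the shift size are arbitrary illustrative choices.

```python
import numpy as np

# Rows: diagonal entries of log Theta*; columns: (a11, a22, b11, b22).
A = np.array([
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 0, 1],
], dtype=float)

rank_A = np.linalg.matrix_rank(A)

# Direction in the null space of A: raise the a's, lower the b's.
shift = np.array([1.0, 1.0, -1.0, -1.0])

x = np.array([0.3, -0.2, 0.5, 0.1])   # arbitrary illustrative parameter values
d_original = A @ x
d_shifted = A @ (x + 0.7 * shift)     # a different x giving the same d

print("rank of A:", rank_A)
print("same diagonal:", np.allclose(d_original, d_shifted))
```

Because the null space is one-dimensional, fixing a single diagonal parameter is enough to pin down the remaining three from $d$.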

###### Example 8.2.

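Suppose that $\Theta^* = \Theta_1^* \otimes \Theta_2^* \otimes \Theta_3^*$ with $2 \times 2$ factors. We have

$$
\log \Theta_1^* = \begin{pmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{pmatrix}, \qquad
\log \Theta_2^* = \begin{pmatrix} b_{11} & b_{12} \\ b_{12} & b_{22} \end{pmatrix}, \qquad
\log \Theta_3^* = \begin{pmatrix} c_{11} & c_{12} \\ c_{12} & c_{22} \end{pmatrix}.
$$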

Then we can calculate

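$$
\log \Theta^* = \log \Theta_1^* \otimes I_2 \otimes I_2 + I_2 \otimes \log \Theta_2^* \otimes I_2 + I_2 \otimes I_2 \otimes \log \Theta_3^* =
$$

$$
\begin{pmatrix}
a_{11}+b_{11}+c_{11} & c_{12} & b_{12} & 0 & a_{12} & 0 & 0 & 0 \\
c_{12} & a_{11}+b_{11}+c_{22} & 0 & b_{12} & 0 & a_{12} & 0 & 0 \\
b_{12} & 0 & a_{11}+b_{22}+c_{11} & c_{12} & 0 & 0 & a_{12} & 0 \\
0 & b_{12} & c_{12} & a_{11}+b_{22}+c_{22} & 0 & 0 & 0 & a_{12} \\
a_{12} & 0 & 0 & 0 & a_{22}+b_{11}+c_{11} & c_{12} & b_{12} & 0 \\
0 & a_{12} & 0 & 0 & c_{12} & a_{22}+b_{11}+c_{22} & 0 & b_{12} \\
0 & 0 & a_{12} & 0 & b_{12} & 0 & a_{22}+b_{22}+c_{11} & c_{12} \\
0 & 0 & 0 & a_{12} & 0 & b_{12} & c_{12} & a_{22}+b_{22}+c_{22}
\end{pmatrix}.
$$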

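Log parameters $a_{12}, b_{12}, c_{12}$ can be separately identified from the off-diagonal entries of $\log \Theta^*$ because they appear separately. We now examine whether log parameters $a_{11}, a_{22}, b_{11}, b_{22}, c_{11}, c_{22}$ can be separately identified from the diagonal entries of $\log \Theta^*$. The answer is no. We have the following linear system

$$
Ax := \begin{pmatrix}
1 & 0 & 1 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 0 & 1 & 1 & 0 \\
1 & 0 & 0 & 1 & 0 & 1 \\
0 & 1 & 1 & 0 & 1 & 0 \\
0 & 1 & 1 & 0 & 0 & 1 \\
0 & 1 & 0 & 1 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 1
\end{pmatrix}
\begin{pmatrix} a_{11} \\ a_{22} \\ b_{11} \\ b_{22} \\ c_{11} \\ c_{22} \end{pmatrix}
=
\begin{pmatrix} [\log \Theta^*]_{11} \\ [\log \Theta^*]_{22} \\ [\log \Theta^*]_{33} \\ [\log \Theta^*]_{44} \\ [\log \Theta^*]_{55} \\ [\log \Theta^*]_{66} \\ [\log \Theta^*]_{77} \\ [\log \Theta^*]_{88} \end{pmatrix}
=: d.
$$

Note that the rank of $A$ is 4. There are four effective equations and six unknowns; the linear system has infinitely many solutions for $x$. Hence two identification restrictions are needed to separately identify log parameters $a_{11}, a_{22}, b_{11}, b_{22}, c_{11}, c_{22}$. We choose to set .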
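The coefficient matrices in Examples 8.1 and 8.2 share a Kronecker structure, so the counting argument can be reproduced for any number of $2 \times 2$ factors. The sketch below is an illustration under that assumption; the function name `diagonal_design` is hypothetical. It builds the matrix mapping the diagonal log parameters to the diagonal of $\log \Theta^*$ and reports the rank and the implied number of identification restrictions.

```python
import numpy as np
from functools import reduce


def diagonal_design(v, size=2):
    """Matrix A such that A @ x stacks the diagonal of log Theta*,
    where x stacks the diagonals of the v factor logarithms (each size x size)."""
    ones = np.ones((size, 1))
    blocks = []
    for j in range(v):
        # Factor j contributes its own diagonal; other factors contribute ones.
        factors = [ones] * v
        factors[j] = np.eye(size)
        blocks.append(reduce(np.kron, factors))
    return np.hstack(blocks)


summary = {}
for v in (2, 3, 4):
    A = diagonal_design(v)
    rank = np.linalg.matrix_rank(A)
    unknowns = A.shape[1]
    summary[v] = (rank, unknowns, unknowns - rank)
    print(f"v={v}: rank={rank}, unknowns={unknowns}, restrictions={unknowns - rank}")
```

For $v = 2$ and $v = 3$ this reproduces the ranks 3 and 4 stated in the examples, and hence one and two restrictions respectively.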

### 8.2 A Rate for $\|\hat{V}_T - V\|_\infty$

The following lemma characterises a rate for $\|\hat{V}_T - V\|_\infty$, which is used in the proofs of the limiting distributions of our estimators.

###### Lemma 8.2.

Let Assumptions LABEL:assu_subgaussian_vector(i) and LABEL:assu_mixing be satisfied with . Suppose if . Then

$$
\|\hat{V}_T - V\|_\infty = O_p\left(\sqrt{\frac{\log n}{T}}\right).
$$

###### Proof.

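Let $\tilde{y}_{t,i}$ denote $y_{t,i} - \bar{y}_i$, similarly for $\tilde{y}_{t,j}, \tilde{y}_{t,k}, \tilde{y}_{t,\ell}$, where $\bar{y}_i := \frac{1}{T}\sum_{t=1}^T y_{t,i}$. Let $\dot{y}_{t,i}$ denote $y_{t,i} - \mu_i$, similarly for $\dot{y}_{t,j}, \dot{y}_{t,k}, \dot{y}_{t,\ell}$, where $\mu_i := E y_{t,i}$.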

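\begin{align}
\|\hat{V}_T - V\|_\infty
&:= \max_{1 \le a, b \le n^2} |\hat{V}_{T,a,b} - V_{a,b}| = \max_{1 \le i,j,k,\ell \le n} |\hat{V}_{T,i,j,k,\ell} - V_{i,j,k,\ell}| \notag \\
&\le \max_{1 \le i,j,k,\ell \le n} \left| \frac{1}{T}\sum_{t=1}^T \tilde{y}_{t,i}\tilde{y}_{t,j}\tilde{y}_{t,k}\tilde{y}_{t,\ell} - \frac{1}{T}\sum_{t=1}^T \dot{y}_{t,i}\dot{y}_{t,j}\dot{y}_{t,k}\dot{y}_{t,\ell} \right| \tag{8.4} \\
&\quad + \max_{1 \le i,j,k,\ell \le n} \left| \frac{1}{T}\sum_{t=1}^T \dot{y}_{t,i}\dot{y}_{t,j}\dot{y}_{t,k}\dot{y}_{t,\ell} - E[\dot{y}_{t,i}\dot{y}_{t,j}\dot{y}_{t,k}\dot{y}_{t,\ell}] \right| \tag{8.5} \\
&\quad + \max_{1 \le i,j,k,\ell \le n} \left| \left(\frac{1}{T}\sum_{t=1}^T \tilde{y}_{t,i}\tilde{y}_{t,j}\right)\left(\frac{1}{T}\sum_{t=1}^T \tilde{y}_{t,k}\tilde{y}_{t,\ell}\right) - \left(\frac{1}{T}\sum_{t=1}^T \dot{y}_{t,i}\dot{y}_{t,j}\right)\left(\frac{1}{T}\sum_{t=1}^T \dot{y}_{t,k}\dot{y}_{t,\ell}\right) \right| \tag{8.6} \\
&\quad + \max_{1 \le i,j,k,\ell \le n} \left| \left(\frac{1}{T}\sum_{t=1}^T \dot{y}_{t,i}\dot{y}_{t,j}\right)\left(\frac{1}{T}\sum_{t=1}^T \dot{y}_{t,k}\dot{y}_{t,\ell}\right) - E[\dot{y}_{t,i}\dot{y}_{t,j}]\,E[\dot{y}_{t,k}\dot{y}_{t,\ell}] \right| \tag{8.7}
\end{align}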
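To give a feel for the quantity being bounded, the following simulation sketch computes the sample analogue of $V$ (fourth sample moment of the demeaned data minus the product of second sample moments, indexed by $(i,j,k,\ell)$) and its sup-norm distance from the population value. It assumes i.i.d. standard Gaussian data purely for illustration, a much simpler setting than the mixing, sub-Gaussian-type conditions of the lemma; in that special case Isserlis' theorem gives $V_{i,j,k,\ell} = \delta_{ik}\delta_{j\ell} + \delta_{i\ell}\delta_{jk}$. The function names are hypothetical.

```python
import numpy as np


def v_hat(y):
    """Sample analogue of V, indexed by (i, j, k, l), from a T x n data matrix."""
    T = y.shape[0]
    yc = y - y.mean(axis=0)                       # demean with sample means
    fourth = np.einsum("ti,tj,tk,tl->ijkl", yc, yc, yc, yc) / T
    second = yc.T @ yc / T
    return fourth - np.einsum("ij,kl->ijkl", second, second)


def v_gaussian_identity(n):
    """Population V for i.i.d. N(0, I_n) data (assumed setting, via Isserlis)."""
    eye = np.eye(n)
    return np.einsum("ik,jl->ijkl", eye, eye) + np.einsum("il,jk->ijkl", eye, eye)


def sup_norm_error(n, T, rng):
    """Max absolute entrywise difference between the sample and population V."""
    y = rng.standard_normal((T, n))
    return np.max(np.abs(v_hat(y) - v_gaussian_identity(n)))


rng = np.random.default_rng(0)
n = 4
errors = {}
for T in (100, 40000):
    errors[T] = sup_norm_error(n, T, rng)
    rate = np.sqrt(np.log(n) / T)
    print(f"T={T}: sup-norm error={errors[T]:.4f}, sqrt(log n / T)={rate:.4f}")
```

A single simulated draw cannot verify an $O_p$ rate; the sketch only shows the error shrinking as $T$ grows, in line with the lemma.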

#### Display (8.5)

Assumption LABEL:assu_subgaussian_vector(i) says that for all $t$, there exist absolute constants $K_1, K_2, r_1$ such that

$$
E\left[\exp\left(K_2 |y_{t,i}|^{r_1}\right)\right] \le K_1 \quad \text{for all } i = 1, \ldots, n.
$$

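By repeatedly using Lemma LABEL:lemmaexponentialtail in Appendix LABEL:secArateofconvergence, there exist absolute constants $b_1, b_2, b_3, c_1, c_2, c_3 > 0$ such that, for all $i, j, k, \ell = 1, \ldots, n$ and every $\epsilon \ge 0$,

$$
\begin{aligned}
P(|y_{t,i}| \ge \epsilon) &\le \exp\left[1 - (\epsilon / b_1)^{r_1}\right] \\
P(|\dot{y}_{t,i}| \ge \epsilon) &\le \exp\left[1 - (\epsilon / c_1)^{r_1}\right] \\
P(|\dot{y}_{t,i}\dot{y}_{t,j}| \ge \epsilon) &\le \exp\left[1 - (\epsilon / b_2)^{r_3}\right] \\
P(|\dot{y}_{t,i}\dot{y}_{t,j} - E[\dot{y}_{t,i}\dot{y}_{t,j}]| \ge \epsilon) &\le \exp\left[1 - (\epsilon / c_2)^{r_3}\right] \\
P(|\dot{y}_{t,i}\dot{y}_{t,j}\dot{y}_{t,k}\dot{y}_{t,\ell}| \ge \epsilon) &\le \exp\left[1 - (\epsilon / b_3)^{r_4}\right] \\
P(|\dot{y}_{t,i}\dot{y}_{t,j}\dot{y}_{t,k}\dot{y}_{t,\ell} - E[\dot{y}_{t,i}\dot{y}_{t,j}\dot{y}_{t,k}\dot{y}_{t,\ell}]| \ge \epsilon) &\le \exp\left[1 - (\epsilon / c_3)^{r_4}\right]
\end{aligned}
$$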

where and . Use the assumption to invoke Theorem LABEL:thmbernsteininequality followed by Lemma LABEL:lemmabernsteinrate in Appendix LABEL:sec_oldappendixB to get

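$$
\max_{1 \le i,j,k,\ell \le n} \left| \frac{1}{T}\sum_{t=1}^T \dot{y}_{t,i}\dot{y}_{t,j}\dot{y}_{t,k}\dot{y}_{t,\ell} - E[\dot{y}_{t,i}\dot{y}_{t,j}\dot{y}_{t,k}\dot{y}_{t,\ell}] \right| = O_p\left(\sqrt{\frac{\log n}{T}}\right). \tag{8.8}
$$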

#### Display (8.7)

We now consider (8.7).

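\begin{align}
&\max_{1 \le i,j,k,\ell \le n} \left| \left(\frac{1}{T}\sum_{t=1}^T \dot{y}_{t,i}\dot{y}_{t,j}\right)\left(\frac{1}{T}\sum_{t=1}^T \dot{y}_{t,k}\dot{y}_{t,\ell}\right) - E[\dot{y}_{t,i}\dot{y}_{t,j}]\,E[\dot{y}_{t,k}\dot{y}_{t,\ell}] \right| \notag \\
&\quad \le \max_{1 \le i,j,k,\ell \le n} \left| \left(\frac{1}{T}\sum_{t=1}^T \dot{y}_{t,i}\dot{y}_{t,j}\right)\left(\frac{1}{T}\sum_{t=1}^T \dot{y}_{t,k}\dot{y}_{t,\ell} - E[\dot{y}_{t,k}\dot{y}_{t,\ell}]\right) \right| \tag{8.9} \\
&\qquad + \max_{1 \le i,j,k,\ell \le n} \left| E[\dot{y}_{t,k}\dot{y}_{t,\ell}] \left(\frac{1}{T}\sum_{t=1}^T \dot{y}_{t,i}\dot{y}_{t,j} - E[\dot{y}_{t,i}\dot{y}_{t,j}]\right) \right|. \tag{8.10}
\end{align}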

Consider (8.9).

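$$
\begin{aligned}
&\max_{1 \le i,j,k,\ell \le n} \left| \left(\frac{1}{T}\sum_{t=1}^T \dot{y}_{t,i}\dot{y}_{t,j}\right)\left(\frac{1}{T}\sum_{t=1}^T \dot{y}_{t,k}\dot{y}_{t,\ell} - E[\dot{y}_{t,k}\dot{y}_{t,\ell}]\right) \right| \\
&\quad \le \max_{1 \le i,j \le n} \left( \left| \frac{1}{T}\sum_{t=1}^T \dot{y}_{t,i}\dot{y}_{t,j} - E[\dot{y}_{t,i}\dot{y}_{t,j}] \right| + \left| E[\dot{y}_{t,i}\dot{y}_{t,j}] \right| \right) \max_{1 \le k,\ell \le n} \left| \frac{1}{T}\sum_{t=1}^T \dot{y}_{t,k}\dot{y}_{t,\ell} - E[\dot{y}_{t,k}\dot{y}_{t,\ell}] \right| \\
&\quad = \left( O_p\left(\sqrt{\frac{\log n}{T}}\right) + O(1) \right) O_p\left(\sqrt{\frac{\log n}{T}}\right) = O_p\left(\sqrt{\frac{\log n}{T}}\right)
\end{aligned}
$$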

where the first equality is due to Lemma LABEL:lemmaexponentialtail(ii) in Appendix LABEL:secArateofconvergence, Theorem LABEL:thmbernsteininequality and Lemma LABEL:lemmabernsteinrate in Appendix LABEL:sec_oldappendixB. Now consider (8.10).

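$$
\begin{aligned}
&\max_{1 \le i,j,k,\ell \le n} \left| E[\dot{y}_{t,k}\dot{y}_{t,\ell}] \left(\frac{1}{T}\sum_{t=1}^T \dot{y}_{t,i}\dot{y}_{t,j} - E[\dot{y}_{t,i}\dot{y}_{t,j}]\right) \right| \\
&\quad \le \max_{1 \le k,\ell \le n} \left| E[\dot{y}_{t,k}\dot{y}_{t,\ell}] \right| \max_{1 \le i,j \le n} \left| \frac{1}{T}\sum_{t=1}^T \dot{y}_{t,i}\dot{y}_{t,j} - E[\dot{y}_{t,i}\dot{y}_{t,j}] \right| = O_p\left(\sqrt{\frac{\log n}{T}}\right)
\end{aligned}
$$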

where the equality is due to Lemma LABEL:lemmaexponentialtail(ii) in Appendix LABEL:secArateofconvergence, Theorem LABEL:thmbernsteininequality and Lemma LABEL:lemmabernsteinrate in Appendix LABEL:sec_oldappendixB. Thus

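$$
\max_{1 \le i,j,k,\ell \le n} \left| \left(\frac{1}{T}\sum_{t=1}^T \dot{y}_{t,i}\dot{y}_{t,j}\right)\left(\frac{1}{T}\sum_{t=1}^T \dot{y}_{t,k}\dot{y}_{t,\ell}\right) - E[\dot{y}_{t,i}\dot{y}_{t,j}]\,E[\dot{y}_{t,k}\dot{y}_{t,\ell}] \right| = O_p\left(\sqrt{\frac{\log n}{T}}\right). \tag{8.11}
$$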

#### Display (8.4)

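We first give a rate for $\max_{1 \le i \le n} |\bar{y}_i - \mu_i|$. The index $i$ is arbitrary and could be replaced with $j$, $k$, or $\ell$. Invoking Lemma LABEL:lemmabernsteinrate in Appendix LABEL:sec_oldappendixB, we have

$$
\max_{1 \le i \le n} |\bar{y}_i - \mu_i| = \max_{1 \le i \le n} \left| \frac{1}{T}\sum_{t=1}^T (y_{t,i} - \mu_i) \right| = O_p\left(\sqrt{\frac{\log n}{T}}\right). \tag{8.12}
$$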

Then we also have

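$$
\max_{1 \le i \le n} |\bar{y}_i| = \max_{1 \le i \le n} |\bar{y}_i - \mu_i + \mu_i| \le \max_{1 \le i \le n} |\bar{y}_i - \mu_i| + \max_{1 \le i \le n} |\mu_i| = O_p\left(\sqrt{\frac{\log n}{T}}\right) + O(1) = O_p(1). \tag{8.13}
$$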

We now consider (8.4):

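$$
\max_{1 \le i,j,k,\ell \le n} \left| \frac{1}{T}\sum_{t=1}^T \tilde{y}_{t,i}\tilde{y}_{t,j}\tilde{y}_{t,k}\tilde{y}_{t,\ell} - \frac{1}{T}\sum_{t=1}^T \dot{y}_{t,i}\dot{y}_{t,j}\dot{y}_{t,k}\dot{y}_{t,\ell} \right|.
$$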

With expansion, simplification and recognition that the indices are completely symmetric, we can bound (8.4) by

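\begin{align}
&\max_{1 \le i,j,k,\ell \le n} \left| \bar{y}_i \bar{y}_j \bar{y}_k \bar{y}_\ell - \mu_i \mu_j \mu_k \mu_\ell \right| \tag{8.14} \\
&\quad + 4 \max_{1 \le i,j,k,\ell \le n} \left| \bar{y}_i \left( \bar{y}_j \bar{y}_k \bar{y}_\ell - \mu_j \mu_k \mu_\ell \right) \right| \tag{8.15} \\
&\quad + 6 \max_{1 \le i,j,k,\ell \le n} \left| \left(\frac{1}{T}\sum_{t=1}^T y_{t,i} y_{t,j}\right) \left( \bar{y}_k \bar{y}_\ell - \mu_k \mu_\ell \right) \right| \tag{8.16} \\
&\quad + 4 \max_{1 \le i,j,k,\ell \le n} \left| \left(\frac{1}{T}\sum_{t=1}^T y_{t,i} y_{t,j} y_{t,k}\right) \left( \bar{y}_\ell - \mu_\ell \right) \right|. \tag{8.17}
\end{align}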

We consider (8.14) first. It can be bounded by repeatedly invoking the triangle inequality (e.g., inserting terms like $\mu_i \bar{y}_j \bar{y}_k \bar{y}_\ell$) and using Lemma LABEL:lemmaexponentialtail(ii) in Appendix LABEL:secArateofconvergence, (8.13) and (8.12); (8.14) is of order $O_p(\sqrt{\log n / T})$. (8.15) is of order $O_p(\sqrt{\log n / T})$ by a similar argument. (8.16) and (8.17) are of the same order using a similar argument provided that both $\max_{1 \le i,j \le n} \left| \frac{1}{T}\sum_{t=1}^T y_{t,i} y_{t,j} \right|$ and $\max_{1 \le i,j,k \le n} \left| \frac{1}{T}\sum_{t=1}^T y_{t,i} y_{t,j} y_{t,k} \right|$ are $O_p(1)$; these follow from Lemma LABEL:lemmaexponentialtail(ii) in Appendix LABEL:secArateofconvergence, Theorem LABEL:thmbernsteininequality and Lemma LABEL:lemmabernsteinrate in Appendix LABEL:sec_oldappendixB. Thus

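$$
\max_{1 \le i,j,k,\ell \le n} \left| \frac{1}{T}\sum_{t=1}^T \tilde{y}_{t,i}\tilde{y}_{t,j}\tilde{y}_{t,k}\tilde{y}_{t,\ell} - \frac{1}{T}\sum_{t=1}^T \dot{y}_{t,i}\dot{y}_{t,j}\dot{y}_{t,k}\dot{y}_{t,\ell} \right| = O_p\left(\sqrt{\frac{\log n}{T}}\right). \tag{8.18}
$$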
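To make the triangle-inequality step for (8.14) concrete, one possible telescoping decomposition is written out below. This is a sketch of the standard argument and not necessarily the exact sequence of inserted terms the authors had in mind; here $\bar{y}_i$ denotes the sample mean of $y_{t,i}$ and $\mu_i$ its expectation.

```latex
\begin{align*}
\bar{y}_i \bar{y}_j \bar{y}_k \bar{y}_\ell - \mu_i \mu_j \mu_k \mu_\ell
&= (\bar{y}_i - \mu_i)\, \bar{y}_j \bar{y}_k \bar{y}_\ell
 + \mu_i (\bar{y}_j - \mu_j)\, \bar{y}_k \bar{y}_\ell \\
&\quad + \mu_i \mu_j (\bar{y}_k - \mu_k)\, \bar{y}_\ell
 + \mu_i \mu_j \mu_k (\bar{y}_\ell - \mu_\ell).
\end{align*}
```

After taking absolute values and maxima, each of the four terms contains exactly one centred factor, which is $O_p(\sqrt{\log n / T})$ by (8.12), multiplied by factors that are $O_p(1)$ by (8.13) or $O(1)$ because the means are bounded. The sum is therefore $O_p(\sqrt{\log n / T})$.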

#### Display (8.6)

We now consider (8.6).

 max1≤i,j,k,ℓ≤n\envert\del1TT∑t=1~yt,i~yt,j\del1TT∑t=1~yt,k~yt,ℓ−\del1TT∑t=1˙yt,i˙yt,j\del1TT∑t=1˙yt,k˙yt,ℓ ≤max1≤i,j,k,ℓ≤n\envert\del1TT∑t=1~yt,i~yt,j\del1TT∑t=1\del~yt,k~yt,ℓ−˙yt,k˙yt,ℓ (8.19) +max1≤i,j,k,ℓ≤n\envert\del1TT∑t=1˙yt,k