Optimal Subspace Estimation Using Overidentifying Vectors via Generalized Method of Moments

by Jianqing Fan, et al.

Many statistical models seek relationships between variables via subspaces of reduced dimension. For instance, in factor models, variables are roughly distributed around a low-dimensional subspace determined by the loading matrix; in mixed linear regression models, the coefficient vectors for different mixtures form a subspace that captures all regression functions; in multiple index models, the effect of covariates is summarized by the effective dimension reduction space. Such subspaces are typically unknown, and good estimates are crucial for data visualization, dimension reduction, diagnostics, and estimation of unknown parameters. Usually, we can estimate these subspaces by computing moments from data. Often there are many ways to estimate a subspace, for example by using moments of different orders or transformed moments. A natural question is: how can we combine all these moment conditions and achieve optimality for subspace estimation? In this paper, we formulate our problem as estimation of an unknown subspace $S$ of dimension $r$, given a set of overidentifying vectors $\{v_\ell\}_{\ell=1}^m$ (namely $m > r$) that satisfy $\mathbb{E} v_\ell \in S$ and have the form $v_\ell = \frac{1}{n}\sum_{i=1}^n f_\ell(x_i, y_i)$, where the data are i.i.d. and each function $f_\ell$ is known. By exploiting certain covariance information related to $v_\ell$, our estimator of $S$ uses an optimal weighting matrix and achieves the smallest asymptotic error in terms of canonical angles. The analysis is based on the generalized method of moments, tailored to our problem. Our method is applied to the aforementioned models and to distributed estimation of heterogeneous datasets, and may potentially be extended to analyze matrix completion and neural nets, among others.
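To make the setup concrete, the following is a minimal sketch of the problem formulation only, not the paper's estimator. It forms each $v_\ell$ as a sample average, estimates $S$ by the top-$r$ left singular vectors of the stacked matrix $[v_1, \dots, v_m]$ (an unweighted baseline, without the optimal weighting matrix described above), and measures the error in canonical angles. The simulated data, the dimensions, and the function names `estimate_subspace` and `canonical_angles` are assumptions made for illustration.

```python
import numpy as np

def estimate_subspace(V, r):
    """Orthonormal basis (d x r) of the top-r left singular subspace of V (d x m).

    Unweighted baseline: every vector v_l is treated equally. This is NOT the
    optimally weighted GMM estimator from the paper.
    """
    U, _, _ = np.linalg.svd(V, full_matrices=False)
    return U[:, :r]

def canonical_angles(A, B):
    """Canonical (principal) angles between the column spaces of A and B.

    A and B must have orthonormal columns; the angles are the arccosines of
    the singular values of A^T B.
    """
    s = np.linalg.svd(A.T @ B, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

# Hypothetical simulation: d-dimensional vectors, true subspace of dimension r,
# m > r overidentifying vectors, n i.i.d. observations.
rng = np.random.default_rng(0)
d, r, m, n = 10, 2, 6, 5000

basis, _ = np.linalg.qr(rng.standard_normal((d, r)))  # orthonormal basis of S
means = basis @ rng.standard_normal((r, m))           # column l is E v_l, which lies in S

# Stand-in for f_l(x_i, y_i): the mean plus standard normal noise (an assumption).
samples = means[None, :, :] + rng.standard_normal((n, d, m))
V = samples.mean(axis=0)                              # column l is v_l = (1/n) sum_i f_l(x_i, y_i)

S_hat = estimate_subspace(V, r)
angles = canonical_angles(basis, S_hat)
print("largest canonical angle (radians):", angles.max())
```

A weighted variant would rescale or combine the columns of `V` using covariance information about the $v_\ell$ before extracting the subspace; the specific optimal weighting is developed in the paper and is not reproduced here.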




