On the computational and statistical complexity of over-parameterized matrix sensing
We consider solving the low rank matrix sensing problem with Factorized Gradient Descend (FGD) method when the true rank is unknown and over-specified, which we refer to as over-parameterized matrix sensing. If the ground truth signal 𝐗^* ∈ℝ^d*d is of rank r, but we try to recover it using 𝐅𝐅^⊤ where 𝐅∈ℝ^d*k and k>r, the existing statistical analysis falls short, due to a flat local curvature of the loss function around the global maxima. By decomposing the factorized matrix 𝐅 into separate column spaces to capture the effect of extra ranks, we show that 𝐅_t 𝐅_t - 𝐗^*_F^2 converges to a statistical error of 𝒪̃ (k d σ^2/n) after 𝒪̃(σ_r/σ√(n/d)) number of iterations where 𝐅_t is the output of FGD after t iterations, σ^2 is the variance of the observation noise, σ_r is the r-th largest eigenvalue of 𝐗^*, and n is the number of sample. Our results, therefore, offer a comprehensive picture of the statistical and computational complexity of FGD for the over-parameterized matrix sensing problem.
READ FULL TEXT