Optimal variable selection in multi-group sparse discriminant analysis
This article considers the problem of multi-group classification in the setting where the number of variables p is larger than the number of observations n. Several methods have been proposed in the literature to address this problem; however, their variable selection performance is either unknown or suboptimal compared with the results known for the two-group case. In this work we provide sharp conditions for the consistent recovery of relevant variables in the multi-group case using the discriminant analysis proposal of Gaynanova et al. (2014). We obtain rates of convergence that attain the optimal scaling of the sample size n, the number of variables p, and the sparsity level s. These rates are significantly faster than the best known results in the multi-group case. Moreover, they coincide with the minimax optimal rates for the two-group case. We validate our theoretical results with numerical analysis.