ENNS: Variable Selection, Regression, Classification and Deep Neural Network for High-Dimensional Data
High-dimensional, low sample-size (HDLSS) data problems have been a topic of immense importance for the last couple of decades. There is a vast literature that proposed a wide variety of approaches to deal with this situation, among which variable selection was a compelling idea. On the other hand, a deep neural network has been used to model complicated relationships and interactions among responses and features, which is hard to capture using a linear or an additive model. In this paper, we discuss the current status of variable selection techniques with the neural network models. We show that the stage-wise algorithm with neural network suffers from disadvantages such as the variables entering into the model later may not be consistent. We then propose an ensemble method to achieve better variable selection and prove that it has probability tending to zero that a false variable is selected. Then, we discuss additional regularization to deal with over-fitting and make better regression and classification. We study various statistical properties of our proposed method. Extensive simulations and real data examples are provided to support the theory and methodology.
READ FULL TEXT