Multi-Grained-Attention Gated Convolutional Neural Networks for Sentence Classification
Sentence classification is challenging because sentences contain only limited contextual information. Existing Convolutional Neural Network (CNN) methods for sentence classification lack a simple and effective mechanism for determining which features in the network are critical and for amplifying their influence on the classification result. In this paper, we propose a new CNN model, the Multi-Grained-Attention Gated CNN (MGAGCNN), which generates attention weights from context windows of different sizes via a Multi-Grained-Attention (MGA) gating mechanism. Through comparative experiments and attention visualization, we demonstrate that, with the help of the MGA gating mechanism, our model learns more meaningful and comprehensive abstract features, adequately identifies the critical features, and enhances their influence on the predicted sentence category. In addition, we propose an activation function named Natural Logarithm rescaled Rectified Linear Unit (NLReLU). Experimental results show that our model achieves accuracy improvements ranging from 0.4 to 1.7 percentage points over the strong baseline methods on four of the six tasks, and that NLReLU achieves performance on MGAGCNN comparable to other well-known activation functions.
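To make the proposed activation function concrete, here is a minimal sketch of NLReLU, assuming it rescales the ReLU output onto a natural-logarithm scale, i.e. f(x) = ln(β·max(0, x) + 1). The exact functional form and the β scale parameter are assumptions for illustration, not taken verbatim from the abstract.

```python
import math

def nlrelu(x: float, beta: float = 1.0) -> float:
    """Sketch of NLReLU (assumed form): ln(beta * max(0, x) + 1).

    Like ReLU, it outputs 0 for negative inputs, but positive
    activations are compressed logarithmically, which can keep
    feature magnitudes in a narrower range.
    """
    return math.log(beta * max(0.0, x) + 1.0)
```

Under this assumed definition, nlrelu(0.0) is 0 and the function grows monotonically but sublinearly for positive inputs, so it preserves ReLU's sparsity while damping large activations.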