Improving the Interpretability of Neural Sentiment Classifiers via Data Augmentation

09/10/2019, by Hanjie Chen, et al.

Recent progress in neural network models has achieved remarkable performance on sentiment classification, yet the lack of interpretable classifications may raise trustworthiness and other issues in practice. In this work, we study the problem of improving the interpretability of existing sentiment classifiers. We propose two data augmentation methods that create additional training examples to help improve model interpretability: one uses a predefined sentiment word list as external knowledge, and the other uses adversarial examples. We test the proposed methods on both CNN and RNN classifiers with three benchmark sentiment datasets. Model interpretability is assessed both by human evaluators and with a simple automatic evaluation metric. Experiments show that the proposed data augmentation methods significantly improve the interpretability of both neural classifiers.
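As a rough illustration of the first idea, a sentiment word list can be used to generate extra training examples that push the model to attend to sentiment-bearing tokens. The tiny lexicon and masking scheme below are illustrative assumptions for this sketch, not the paper's exact augmentation procedure:

```python
# Hedged sketch: lexicon-based augmentation for sentiment classification.
# The lexicon and the masking scheme are assumptions for illustration only.

SENTIMENT_LEXICON = {"great", "terrible", "awful", "wonderful", "bad", "good"}

def augment(sentence, mask_token="<unk>"):
    """Create an additional training example by masking non-sentiment words,
    so the augmented example keeps only sentiment-bearing tokens visible."""
    tokens = sentence.lower().split()
    masked = [t if t in SENTIMENT_LEXICON else mask_token for t in tokens]
    return " ".join(masked)

example = "the acting was wonderful but the plot was terrible"
print(augment(example))
# → <unk> <unk> <unk> wonderful <unk> <unk> <unk> <unk> terrible
```

Pairing such masked examples with the original label gives the classifier a direct signal that the sentiment words, not the surrounding context words, should drive its prediction.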


