Investigating the Working of Text Classifiers
Text classification is one of the most widely studied tasks in natural language processing. Recently, increasingly large multilayer neural network models have been employed for the task, motivated by the principle of compositionality. Almost all of the reported methods are discriminative, and discriminative approaches come with a caveat: without proper capacity control, a model may latch on to any signal, even one that does not generalize. Using various state-of-the-art text classifiers, we explore whether the models actually learn to compose the meaning of sentences or simply rely on a few key lexicons. To test our hypothesis, we construct datasets in which the train and test splits have no direct overlap of such lexicons. We study various text classifiers and observe a large performance drop on these datasets. Finally, we show that even simple regularization techniques can improve performance on these datasets.
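The lexicon-disjoint split described above can be illustrated with a minimal sketch. This is not the authors' construction procedure; it simply shows one hypothetical way to ensure that a chosen set of key lexicons appears only in the test split, with the word lists and documents invented for illustration.

```python
# Hypothetical sketch of a lexicon-disjoint train/test split:
# every document containing a held-out key lexicon goes to test,
# so the train split never sees those words.

def lexicon_disjoint_split(docs, held_out_words):
    """Route docs containing any held-out word to test; the rest to train."""
    held = set(held_out_words)
    train, test = [], []
    for doc in docs:
        tokens = set(doc.lower().split())
        (test if tokens & held else train).append(doc)
    return train, test

# Toy corpus (invented for illustration).
docs = [
    "the movie was brilliant",
    "a dull and tedious film",
    "great acting throughout",
    "the plot dragged badly",
]

train, test = lexicon_disjoint_split(docs, ["brilliant", "tedious"])
# train now contains no document mentioning "brilliant" or "tedious"
```

A classifier that has merely memorized the held-out lexicons as class cues would then be expected to degrade on the test split, which is the performance drop the abstract reports.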