Quantifying the Task-Specific Information in Text-Based Classifications

10/17/2021
by Zining Zhu, et al.

Recently, neural natural language models have attained state-of-the-art performance on a wide variety of tasks, but this high performance can result from superficial, surface-level cues (Bender and Koller, 2020; Niven and Kao, 2020). These surface cues, the “shortcuts” inherent in the datasets, do not contribute to the *task-specific information* (TSI) of the classification tasks. While it is essential to look at model performance, it is also important to understand the datasets. In this paper, we consider the following question: apart from the information introduced by the shortcut features, how much task-specific information is required to classify a dataset? We formulate this quantity in an information-theoretic framework. Although this quantity is hard to compute exactly, we approximate it with a fast and stable method. TSI quantifies the amount of linguistic knowledge, modulo a set of predefined shortcuts, that contributes to classifying a sample from each dataset. This framework allows us to compare across datasets: for example, apart from a set of “shortcut features”, classifying each sample in the Multi-NLI task involves around 0.4 nats more TSI than in the Quora Question Pairs task.
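
To make the unit concrete, here is a minimal sketch of one way an information quantity "beyond the shortcuts" could be estimated in nats. It is an illustrative assumption, not the estimator from the paper: it takes the gap between the cross-entropy of a shortcut-only classifier and that of a classifier that sees the full text, which is a common proxy for H(Y|S) - H(Y|X,S). The function names and the toy probabilities are hypothetical.

```python
import math

def cross_entropy_nats(probs, labels):
    """Mean negative log-likelihood of the true labels (natural log, so nats)."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def information_gap_nats(shortcut_probs, full_probs, labels):
    """Hypothetical proxy: cross-entropy with shortcuts only minus
    cross-entropy with the full input. Not the paper's estimator."""
    return (cross_entropy_nats(shortcut_probs, labels)
            - cross_entropy_nats(full_probs, labels))

# Toy data (made up for illustration): predicted class probabilities
# for four samples from a shortcut-only model and a full-input model.
labels = [0, 1, 1, 0]
shortcut_probs = [[0.6, 0.4], [0.5, 0.5], [0.4, 0.6], [0.5, 0.5]]
full_probs = [[0.9, 0.1], [0.2, 0.8], [0.1, 0.9], [0.8, 0.2]]

gap = information_gap_nats(shortcut_probs, full_probs, labels)
print(f"estimated gap: {gap:.3f} nats per sample")
```

Using the natural logarithm is what makes the result come out in nats; switching to `math.log2` would report the same gap in bits.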
