AES Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses

09/24/2021
by   Yaman Kumar Singla, et al.
25

Deep-learning based Automatic Essay Scoring (AES) systems are being actively used by states and language testing agencies alike to evaluate millions of candidates for life-changing decisions ranging from college applications to visa approvals. However, little research has been put to understand and interpret the black-box nature of deep-learning based scoring algorithms. Previous studies indicate that scoring models can be easily fooled. In this paper, we explore the reason behind their surprising adversarial brittleness. We utilize recent advances in interpretability to find the extent to which features such as coherence, content, vocabulary, and relevance are important for automated scoring mechanisms. We use this to investigate the oversensitivity i.e., large change in output score with a little change in input essay content) and overstability i.e., little change in output scores with large changes in input essay content) of AES. Our results indicate that autoscoring models, despite getting trained as "end-to-end" models with rich contextual embeddings such as BERT, behave like bag-of-words models. A few words determine the essay score without the requirement of any context making the model largely overstable. This is in stark contrast to recent probing studies on pre-trained representation learning models, which show that rich linguistic features such as parts-of-speech and morphology are encoded by them. Further, we also find that the models have learnt dataset biases, making them oversensitive. To deal with these issues, we propose detection-based protection models that can detect oversensitivity and overstability causing samples with high accuracies. We find that our proposed models are able to detect unusual attribution patterns and flag adversarial samples successfully.

READ FULL TEXT

page 26

page 27

page 29

research
12/27/2020

My Teacher Thinks The World Is Flat! Interpreting Automatic Essay Scoring Mechanism

Significant progress has been made in deep-learning based Automatic Essa...
research
01/13/2018

Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers

Although various techniques have been proposed to generate adversarial s...
research
11/30/2021

Automated Speech Scoring System Under The Lens: Evaluating and interpreting the linguistic cues for language proficiency

English proficiency assessments have become a necessary metric for filte...
research
05/08/2022

On the Use of BERT for Automated Essay Scoring: Joint Learning of Multi-Scale Essay Representation

In recent years, pre-trained models have become dominant in most natural...
research
01/08/2019

Explaining AlphaGo: Interpreting Contextual Effects in Neural Networks

In this paper, we propose to disentangle and interpret contextual effect...
research
01/23/2019

Automated Essay Scoring based on Two-Stage Learning

Current state-of-art feature-engineered and end-to-end Automated Essay S...
research
05/19/2023

Self-Reinforcement Attention Mechanism For Tabular Learning

Apart from the high accuracy of machine learning models, what interests ...

Please sign up or login with your details

Forgot password? Click here to reset