Corruption Is Not All Bad: Incorporating Discourse Structure into Pre-training via Corruption for Essay Scoring

10/13/2020
by   Farjana Sultana Mim, et al.
0

Existing approaches for automated essay scoring and document representation learning typically rely on discourse parsers to incorporate discourse structure into text representation. However, the performance of parsers is not always adequate, especially when they are used on noisy texts, such as student essays. In this paper, we propose an unsupervised pre-training approach to capture discourse structure of essays in terms of coherence and cohesion that does not require any discourse parser or annotation. We introduce several types of token, sentence and paragraph-level corruption techniques for our proposed pre-training approach and augment masked language modeling pre-training with our pre-training method to leverage both contextualized and discourse information. Our proposed unsupervised approach achieves new state-of-the-art result on essay Organization scoring task.

READ FULL TEXT

page 8

page 12

research
10/30/2020

SLM: Learning a Discourse Language Representation with Sentence Unshuffling

We introduce Sentence-level Language Modeling, a new pre-training object...
research
09/30/2020

Neural RST-based Evaluation of Discourse Coherence

This paper evaluates the utility of Rhetorical Structure Theory (RST) tr...
research
09/15/2023

Structural Self-Supervised Objectives for Transformers

This thesis focuses on improving the pre-training of natural language mo...
research
11/06/2020

Unleashing the Power of Neural Discourse Parsers – A Context and Structure Aware Approach Using Large Scale Pretraining

RST-based discourse parsing is an important NLP task with numerous downs...
research
05/14/2019

A Unified Linear-Time Framework for Sentence-Level Discourse Parsing

We propose an efficient neural framework for sentence-level discourse an...
research
02/27/2023

Systematic Rectification of Language Models via Dead-end Analysis

With adversarial or otherwise normal prompts, existing large language mo...
research
04/23/2017

Discourse-Based Objectives for Fast Unsupervised Sentence Representation Learning

This work presents a novel objective function for the unsupervised train...

Please sign up or login with your details

Forgot password? Click here to reset