Multi-Task Self-Supervised Learning for Disfluency Detection

08/15/2019
by Shaolei Wang, et al.

Most existing approaches to disfluency detection rely heavily on human-annotated data, which is expensive to obtain in practice. To tackle the training data bottleneck, we investigate methods for combining multiple self-supervised tasks, i.e., supervised tasks where data can be collected without manual labeling. First, we construct large-scale pseudo training data by randomly adding or deleting words from unlabeled news data, and propose two self-supervised pre-training tasks: (i) a tagging task to detect the added noisy words, and (ii) a sentence classification task to distinguish original sentences from grammatically incorrect ones. We then combine these two tasks to jointly train a network. The pre-trained network is then fine-tuned using human-annotated disfluency detection training data. Experimental results on the commonly used English Switchboard test set show that our approach achieves competitive performance compared to previous systems (trained on the full dataset) while using less than 1% of the training data, and that our method trained on the full dataset significantly outperforms previous methods, reducing the error by 21%.
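The data construction described above (random insertion for the tagging task, random deletion for the classification task) can be illustrated with a short sketch. The Python code below is a minimal, hypothetical rendition of that corruption scheme; the function names, label conventions (1 = added/corrupted), and hyperparameters are illustrative assumptions, not the authors' released code.

```python
import random

def make_tagging_example(tokens, noise_vocab, max_insertions=3):
    """Tagging-task pseudo data: insert a few random words and label
    each inserted token 1 ("added noise"), each original token 0."""
    n_insert = random.randint(1, min(max_insertions, len(tokens)))
    positions = set(random.sample(range(len(tokens)), n_insert))
    out, labels = [], []
    for i, tok in enumerate(tokens):
        if i in positions:  # insert a noise word before token i
            out.append(random.choice(noise_vocab))
            labels.append(1)
        out.append(tok)
        labels.append(0)
    return out, labels

def make_classification_example(tokens, p_corrupt=0.5, max_deletions=3):
    """Classification-task pseudo data: with probability p_corrupt, delete
    a few words so the sentence becomes ungrammatical (label 1);
    otherwise return the original sentence (label 0).
    Assumes sentences with at least two tokens."""
    if random.random() >= p_corrupt:
        return list(tokens), 0
    n_delete = random.randint(1, min(max_deletions, len(tokens) - 1))
    drop = set(random.sample(range(len(tokens)), n_delete))
    return [t for i, t in enumerate(tokens) if i not in drop], 1

# Toy usage on a single unlabeled sentence:
sentence = "the proposal was approved by the committee".split()
noise_vocab = ["uh", "well", "you", "know", "the"]  # illustrative insertion vocabulary
print(make_tagging_example(sentence, noise_vocab))
print(make_classification_example(sentence))
```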

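The joint training step can likewise be sketched: a shared encoder feeds two heads whose losses are summed. The BiLSTM encoder, head shapes, and loss weighting below are assumptions chosen for illustration, not the architecture reported in the paper.

```python
import torch
import torch.nn as nn

class MultiTaskPretrainer(nn.Module):
    """Shared encoder with two heads: token tagging and sentence
    classification. Hypothetical architecture for illustration."""
    def __init__(self, vocab_size, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True, bidirectional=True)
        self.tag_head = nn.Linear(2 * hidden, 2)  # per-token: original vs. added
        self.cls_head = nn.Linear(2 * hidden, 2)  # per-sentence: original vs. corrupted

    def forward(self, token_ids):
        h, _ = self.encoder(self.embed(token_ids))  # (B, T, 2H)
        tag_logits = self.tag_head(h)               # (B, T, 2)
        cls_logits = self.cls_head(h.mean(dim=1))   # (B, 2), mean-pooled
        return tag_logits, cls_logits

def joint_loss(tag_logits, tag_labels, cls_logits, cls_labels, alpha=1.0):
    """Sum of the two task losses; alpha weights the classification term."""
    ce = nn.CrossEntropyLoss()
    tag_loss = ce(tag_logits.reshape(-1, 2), tag_labels.reshape(-1))
    cls_loss = ce(cls_logits, cls_labels)
    return tag_loss + alpha * cls_loss
```

After pre-training on the pseudo data with this joint objective, the shared encoder would be fine-tuned on the human-annotated disfluency detection data, as the abstract describes.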

Related research

10/29/2020  Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection
Most existing approaches to disfluency detection heavily rely on human-a...

08/25/2017  Multi-task Self-Supervised Visual Learning
We investigate methods for combining multiple self-supervised tasks--i.e...

09/05/2023  Self-Supervised Pre-Training Boosts Semantic Scene Segmentation on LiDAR data
Airborne LiDAR systems have the capability to capture the Earth's surfac...

02/03/2021  Studying the Usage of Text-To-Text Transfer Transformer to Support Code-Related Tasks
Deep learning (DL) techniques are gaining more and more attention in the...

07/30/2020  Leverage Unlabeled Data for Abstractive Speech Summarization with Self-Supervised Learning and Back-Summarization
Supervised approaches for Neural Abstractive Summarization require large...

06/06/2016  Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution
Most existing approaches for zero pronoun resolution are heavily relying...

06/21/2022  HealNet – Self-Supervised Acute Wound Heal-Stage Classification
Identifying, tracking, and predicting wound heal-stage progression is a ...
