Discriminative Self-training for Punctuation Prediction

04/21/2021
by Qian Chen, et al.

Punctuation prediction for automatic speech recognition (ASR) output transcripts plays a crucial role in improving the readability of the transcripts and the performance of downstream natural language processing applications. However, achieving good performance on punctuation prediction often requires large amounts of labeled speech transcripts, which are expensive and laborious to obtain. In this paper, we propose a Discriminative Self-Training approach with weighted loss and discriminative label smoothing to exploit unlabeled speech transcripts. Experimental results on the English IWSLT2011 benchmark test set and an internal Chinese spoken language dataset demonstrate that the proposed approach achieves significant improvements in punctuation prediction accuracy over strong baselines, including BERT, RoBERTa, and ELECTRA models. The proposed Discriminative Self-Training approach also outperforms the vanilla self-training approach. We establish a new state-of-the-art (SOTA) on the IWSLT2011 test set, outperforming the current SOTA model by 1.3
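
The abstract names two ingredients, a weighted loss and discriminative label smoothing, without giving their formulas. The sketch below shows one plausible reading, not the authors' implementation: human-labeled tokens are scored with plain cross-entropy, pseudo-labeled tokens are scored against a smoothed target, and the two terms are combined with a weight. The function names, the `pseudo_weight` and `pseudo_smoothing` parameters, their default values, and the four-class label set are all assumptions made for illustration.

```python
import math


def cross_entropy(probs, label, smoothing=0.0):
    """Cross-entropy of one token against an optionally smoothed one-hot target.

    probs: predicted class probabilities (all > 0, summing to 1).
    label: index of the target class.
    smoothing: probability mass spread uniformly over all classes.
    """
    num_classes = len(probs)
    loss = 0.0
    for i, p in enumerate(probs):
        target = smoothing / num_classes
        if i == label:
            target += 1.0 - smoothing
        loss -= target * math.log(p)
    return loss


def combined_loss(gold, pseudo, pseudo_weight=0.5, pseudo_smoothing=0.1):
    """Weighted sum of gold-label loss and smoothed pseudo-label loss.

    gold, pseudo: non-empty lists of (probs, label) pairs.
    Assumption: smoothing is applied only to pseudo-labels, which are
    noisier than human annotations; gold labels stay as hard targets.
    """
    gold_loss = sum(cross_entropy(p, y) for p, y in gold) / len(gold)
    pseudo_loss = sum(
        cross_entropy(p, y, pseudo_smoothing) for p, y in pseudo
    ) / len(pseudo)
    return gold_loss + pseudo_weight * pseudo_loss


# Hypothetical label set: 0 = no punctuation, 1 = comma, 2 = period, 3 = question mark.
gold_batch = [([0.7, 0.1, 0.1, 0.1], 0), ([0.1, 0.1, 0.7, 0.1], 2)]
pseudo_batch = [([0.6, 0.2, 0.1, 0.1], 0)]
print(combined_loss(gold_batch, pseudo_batch))
```

Under this reading, smoothing keeps the model from becoming overconfident on pseudo-labels that may be wrong, while the weight controls how much the unlabeled data influences training relative to the human-labeled data.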


