Mitigating the Exposure Bias in Sentence-Level Grapheme-to-Phoneme (G2P) Transduction

08/16/2023
by Eunseop Yoon, et al.

Text-to-Text Transfer Transformer (T5) has recently been considered for Grapheme-to-Phoneme (G2P) transduction. As a follow-up, ByT5, a tokenizer-free byte-level model based on T5, recently gave promising results on word-level G2P conversion by representing each input character with its corresponding UTF-8 encoding. Although it is generally understood that sentence-level or paragraph-level G2P can improve usability in real-world applications, since it is better suited to handling heteronyms and linking sounds between words, we find that using ByT5 in these scenarios is nontrivial. Because ByT5 operates on the character level, it requires more decoding steps, which degrades performance due to the exposure bias commonly observed in auto-regressive generation models. This paper shows that the performance of sentence-level and paragraph-level G2P can be improved by mitigating this exposure bias using our proposed loss-based sampling method.
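To make the sequence-length argument concrete, the following is a minimal sketch of byte-level input encoding and of how errors may compound over more decoding steps. It is not the paper's method or code: the helper names `utf8_byte_ids` and `error_free_prob` are hypothetical, the id offset of 3 is an assumption standing in for a few reserved special-token ids, and the independence of per-step errors is a deliberate simplification of exposure bias.

```python
def utf8_byte_ids(text: str, offset: int = 3) -> list[int]:
    """Map text to one id per UTF-8 byte.

    `offset` is an assumed shift that leaves room for special tokens
    (for example padding and end-of-sequence); it is illustrative only.
    """
    return [byte + offset for byte in text.encode("utf-8")]


def error_free_prob(step_accuracy: float, steps: int) -> float:
    """Chance of decoding `steps` tokens with no mistake.

    Assumes each step is correct independently with probability
    `step_accuracy`, which is a simplification: under exposure bias an
    early mistake usually makes later mistakes more likely.
    """
    return step_accuracy ** steps


word = "read"                  # a heteronym: pronunciation depends on context
sentence = "I read the book"   # context that can disambiguate it

word_ids = utf8_byte_ids(word)
sentence_ids = utf8_byte_ids(sentence)

# The sentence has 4 words but needs one id per byte, so the sequence
# the model must process is much longer than the word count suggests.
print(len(sentence.split()), len(word_ids), len(sentence_ids))

# Non-ASCII characters occupy more than one byte in UTF-8.
print(len("café"), len(utf8_byte_ids("café")))

# Longer sequences leave more room for an error somewhere along the way.
print(error_free_prob(0.99, len(word_ids)) > error_free_prob(0.99, len(sentence_ids)))
```

Under these assumptions, moving from word-level to sentence-level input multiplies the number of steps, which is one way to see why sentence-level and paragraph-level G2P is more sensitive to exposure bias than word-level G2P.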
