Assessing Phrase Break of ESL Speech with Pre-trained Language Models

10/28/2022
by Zhiyi Wang, et al.

This work introduces an approach to assessing phrase breaks in ESL learners' speech with pre-trained language models (PLMs). Unlike traditional methods, this approach converts speech into token sequences and then leverages the power of PLMs. The assessment comprises two sub-tasks: an overall phrase-break score for a speech clip, and a fine-grained assessment of every possible phrase break position. The speech input is first force-aligned with its transcript, then pre-processed into a token sequence of words and associated phrase break information. The token sequence is then fed into a pre-training and fine-tuning pipeline. In pre-training, a replaced-break-token detection module is trained on token data in which each token has a certain probability of being randomly replaced. In fine-tuning, overall and fine-grained scoring are optimized as text classification and sequence labeling tasks, respectively. With the introduction of PLMs, the dependence on labeled training data is greatly reduced, and performance improves.
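To make the pre-processing step concrete, below is a minimal sketch of turning forced-alignment output into such a token sequence. The alignment tuple format, the pause threshold, and the `<brk>`/`<no-brk>` token names are illustrative assumptions, not the paper's exact design.

```python
# A minimal sketch of the token-sequence construction step. The alignment
# format and the break-token vocabulary below are assumptions for illustration.

PAUSE_THRESHOLD_S = 0.15  # assumed: pauses at least this long count as a phrase break

def to_token_sequence(alignment):
    """Convert forced-alignment output into an interleaved word/break sequence.

    `alignment` is a list of (word, start_s, end_s) tuples, e.g. from an
    aligner such as the Montreal Forced Aligner.
    """
    tokens = []
    for (word, _, end), (_, next_start, _) in zip(alignment, alignment[1:]):
        tokens.append(word)
        pause = next_start - end
        tokens.append("<brk>" if pause >= PAUSE_THRESHOLD_S else "<no-brk>")
    tokens.append(alignment[-1][0])  # last word has no following break slot
    return tokens

alignment = [("the", 0.00, 0.18), ("quick", 0.20, 0.55),
             ("fox", 0.95, 1.30), ("jumps", 1.32, 1.70)]
print(to_token_sequence(alignment))
# ['the', '<no-brk>', 'quick', '<brk>', 'fox', '<no-brk>', 'jumps']
```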
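The pre-training objective can be sketched as an ELECTRA-style replaced-token-detection task over the break tokens: corrupt some break tokens, then train a token-level binary classifier to spot the replacements. The model size, corruption rate, and Transformer configuration below are illustrative assumptions in PyTorch, not the authors' implementation.

```python
# A sketch of replaced-break-token-detection pre-training. All hyperparameters
# and the token-id layout are assumptions for illustration.
import random
import torch
import torch.nn as nn

VOCAB = {"<pad>": 0, "<brk>": 1, "<no-brk>": 2}  # word ids would follow in practice
BREAK_IDS = [VOCAB["<brk>"], VOCAB["<no-brk>"]]
REPLACE_PROB = 0.15  # assumed corruption rate

class BreakDetector(nn.Module):
    def __init__(self, vocab_size=1000, d_model=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 2)  # per token: replaced or original?

    def forward(self, ids):
        return self.head(self.encoder(self.embed(ids)))

def corrupt(ids):
    """Randomly flip break tokens; return corrupted ids and replacement labels."""
    out, labels = ids.clone(), torch.zeros_like(ids)
    for i, t in enumerate(ids.tolist()):
        if t in BREAK_IDS and random.random() < REPLACE_PROB:
            out[i] = BREAK_IDS[1 - BREAK_IDS.index(t)]  # swap <brk> <-> <no-brk>
            labels[i] = 1
    return out, labels

model = BreakDetector()
ids = torch.tensor([[5, 2, 7, 1, 9, 2, 11]])  # interleaved word/break token ids
corrupted, labels = corrupt(ids[0])
logits = model(corrupted.unsqueeze(0))
loss = nn.CrossEntropyLoss()(logits.view(-1, 2), labels.view(-1))
loss.backward()  # an optimizer step would complete one pre-training iteration
```

Fine-tuning would then swap this detection head for a sequence-level classifier (overall scoring) or keep a per-token head (fine-grained scoring), matching the two sub-tasks described above.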

