Assessing Phrase Break of ESL Speech with Pre-trained Language Models and Large Language Models

06/08/2023
by   Zhiyi Wang, et al.
0

This work introduces approaches to assessing phrase breaks in ESL learners' speech using pre-trained language models (PLMs) and large language models (LLMs). There are two tasks: overall assessment of phrase break for a speech clip and fine-grained assessment of every possible phrase break position. To leverage NLP models, speech input is first force-aligned with texts, and then pre-processed into a token sequence, including words and phrase break information. To utilize PLMs, we propose a pre-training and fine-tuning pipeline with the processed tokens. This process includes pre-training with a replaced break token detection module and fine-tuning with text classification and sequence labeling. To employ LLMs, we design prompts for ChatGPT. The experiments show that with the PLMs, the dependence on labeled training data has been greatly reduced, and the performance has improved. Meanwhile, we verify that ChatGPT, a renowned LLM, has potential for further advancement in this area.

READ FULL TEXT

page 2

page 3

research
10/28/2022

Assessing Phrase Break of ESL speech with Pre-trained Language Models

This work introduces an approach to assessing phrase break in ESL learne...
research
09/08/2021

On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets

Pre-training language models (LMs) on large-scale unlabeled text data ma...
research
08/29/2021

Span Fine-tuning for Pre-trained Language Models

Pre-trained language models (PrLM) have to carefully manage input units ...
research
05/11/2023

PROM: A Phrase-level Copying Mechanism with Pre-training for Abstractive Summarization

Based on the remarkable achievements of pre-trained language models in a...
research
03/10/2023

Model-Agnostic Syntactical Information for Pre-Trained Programming Language Models

Pre-trained Programming Language Models (PPLMs) achieved many recent sta...
research
02/01/2022

A Flexible Clustering Pipeline for Mining Text Intentions

Mining the latent intentions from large volumes of natural language inpu...
research
08/05/2022

Construction of English Resume Corpus and Test with Pre-trained Language Models

Information extraction(IE) has always been one of the essential tasks of...

Please sign up or login with your details

Forgot password? Click here to reset