L2 proficiency assessment using self-supervised speech representations

11/16/2022
by Stefano Bannò, et al.

There has been a growing demand for automated spoken language assessment systems in recent years. A standard pipeline starts with a speech recognition system and derives features, either hand-crafted or based on deep learning, that exploit the transcription and audio. Though these approaches can yield high-performance systems, they require speech recognition systems that work for L2 speakers and are preferably tuned to the specific form of test being deployed. Recently, a scheme based on self-supervised speech representations, requiring no speech recognition, was proposed. This work extends the initial analysis of that approach to a large-scale proficiency test, Linguaskill, which comprises multiple parts, each designed to assess different attributes of a candidate's speaking proficiency. The performance of the self-supervised wav2vec 2.0 system is compared to a high-performance hand-crafted assessment system and a BERT-based text system, both of which use speech transcriptions. Though the wav2vec 2.0-based system is found to be sensitive to the nature of the response, it can be configured to yield performance comparable to systems requiring a speech transcription, and it yields gains when appropriately combined with standard approaches.
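
As a rough illustration of the idea summarised above, the sketch below shows one way a transcription-free grader could turn frame-level speech representations into a single score and then be combined with scores from other systems. Everything in it is an assumption made for illustration and is not taken from the paper: the mean pooling, the linear regression head, the weighted-average combination, the function names `mean_pool`, `predict_score` and `combine_scores`, and the random array standing in for real wav2vec 2.0 outputs (768 is the hidden size of the base model).

```python
import numpy as np


def mean_pool(frames: np.ndarray) -> np.ndarray:
    """Average frame-level vectors of shape (T, D) into one utterance vector (D,)."""
    return frames.mean(axis=0)


def predict_score(frames: np.ndarray, weights: np.ndarray, bias: float) -> float:
    """Hypothetical linear regression head applied to the pooled representation."""
    return float(mean_pool(frames) @ weights + bias)


def combine_scores(scores, system_weights=None) -> float:
    """Weighted average of per-system scores; uniform weights by default."""
    scores = np.asarray(scores, dtype=float)
    if system_weights is None:
        system_weights = np.ones(len(scores))
    system_weights = np.asarray(system_weights, dtype=float)
    return float(scores @ system_weights / system_weights.sum())


# Random stand-in for frame-level outputs of a self-supervised speech model.
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 768))
weights = rng.normal(size=768) * 0.01  # untrained, illustrative head parameters

ssl_score = predict_score(frames, weights, bias=3.0)
# 3.5 and 4.0 are made-up scores standing in for two transcription-based graders.
combined = combine_scores([ssl_score, 3.5, 4.0])
print(f"self-supervised: {ssl_score:.2f}, combined: {combined:.2f}")
```

In practice the frame-level array would come from a pretrained model rather than a random generator, and the head and combination weights would be fitted on graded responses; the averaging here is only the simplest possible combination scheme.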

research
04/05/2022

Towards End-to-end Unsupervised Speech Recognition

Unsupervised speech recognition has shown great potential to make Automa...
research
06/09/2020

Hand-crafted Attention is All You Need? A Study of Attention on Self-supervised Audio Transformer

In this paper, we seek to reduce the computation complexity of transform...
research
10/07/2021

Mandarin-English Code-switching Speech Recognition with Self-supervised Speech Representation Models

Code-switching (CS) is common in daily conversations where more than one...
research
08/19/2022

3M: An Effective Multi-view, Multi-granularity, and Multi-aspect Modeling Approach to English Pronunciation Assessment

As an indispensable ingredient of computer-assisted pronunciation traini...
research
02/03/2022

Self-supervised Learning with Random-projection Quantizer for Speech Recognition

We present a simple and effective self-supervised learning approach for ...
research
10/24/2022

Proficiency assessment of L2 spoken English using wav2vec 2.0

The increasing demand for learning English as a second language has led ...
research
04/01/2018

Joint Learning of Interactive Spoken Content Retrieval and Trainable User Simulator

User-machine interaction is crucial for information retrieval, especiall...
