An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks

03/31/2022
by Kai-Wei Chang et al.

Speech representations learned from self-supervised learning (SSL) models have been found beneficial for various speech processing tasks. However, utilizing SSL representations usually requires fine-tuning the pre-trained models or designing task-specific downstream models and loss functions, which costs considerable memory and human labor. On the other hand, prompting in Natural Language Processing (NLP) is an efficient and widely used technique for leveraging pre-trained language models (LMs). Nevertheless, this paradigm has been little studied in the speech community. In this paper we report the first exploration of the prompt tuning paradigm for speech processing tasks based on the Generative Spoken Language Model (GSLM). Experimental results show that prompt tuning achieves competitive performance on speech classification tasks with fewer trainable parameters than fine-tuning specialized downstream models. We further study the technique on more challenging sequence generation tasks, where prompt tuning also demonstrates its potential; its limitations and possible research directions are discussed in the paper.
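To make the parameter-efficiency argument concrete, below is a minimal PyTorch sketch of prompt tuning on a frozen unit language model in the GSLM style: trainable continuous prompt vectors are prepended to the embedded discrete speech units, and only those vectors receive gradients. The class and argument names (PromptTunedUnitLM, unit_lm), the prompt length, and the toy stand-in LM are illustrative assumptions, not the paper's exact implementation.

import torch
import torch.nn as nn


class PromptTunedUnitLM(nn.Module):
    """Prepends trainable prompt vectors to the embedded speech units of a frozen unit LM."""

    def __init__(self, unit_lm: nn.Module, embed: nn.Embedding, prompt_len: int = 10):
        super().__init__()
        self.unit_lm = unit_lm      # frozen pre-trained unit language model
        self.embed = embed          # frozen unit-embedding table of that LM
        for module in (self.unit_lm, self.embed):
            for p in module.parameters():
                p.requires_grad = False
        # The only trainable parameters: a small set of continuous prompt vectors.
        self.prompt = nn.Parameter(0.02 * torch.randn(prompt_len, embed.embedding_dim))

    def forward(self, unit_ids: torch.Tensor) -> torch.Tensor:
        # unit_ids: (batch, seq_len) discrete speech units from the SSL quantizer
        unit_emb = self.embed(unit_ids)                                 # (B, T, D)
        prompt = self.prompt.unsqueeze(0).expand(unit_ids.size(0), -1, -1)
        inputs = torch.cat([prompt, unit_emb], dim=1)                   # prepend prompts
        return self.unit_lm(inputs)                                     # frozen LM does the rest


# Toy stand-in for the pre-trained LM: any module mapping (B, T, D) embeddings to
# per-position logits works here; GSLM itself is a Transformer decoder over units.
embed = nn.Embedding(100, 64)                                           # 100 units, 64-dim
lm = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 100))
model = PromptTunedUnitLM(lm, embed, prompt_len=10)

optimizer = torch.optim.Adam([model.prompt], lr=1e-3)                   # only the prompt is trained
logits = model(torch.randint(0, 100, (2, 50)))                          # shape: (2, 10 + 50, 100)

With a prompt length of 10 and 64-dimensional embeddings, only 640 parameters receive gradients in this sketch, which is the source of the parameter-efficiency claim; fine-tuning a task-specific downstream model on top of the SSL features typically updates far more weights.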


Related research

10/10/2022 · Exploring Efficient-tuning Methods in Self-supervised Speech Models
In this study, we aim to explore efficient tuning methods for speech sel...

03/01/2023 · SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks
Prompt tuning is a technology that tunes a small set of parameters to st...

12/01/2022 · CHAPTER: Exploiting Convolutional Neural Network Adapters for Self-supervised Speech Models
Self-supervised learning (SSL) is a powerful technique for learning repr...

06/13/2023 · Efficient Adapters for Giant Speech Models
Large pre-trained speech models are widely used as the de-facto paradigm...

03/25/2022 · Striking a Balance: Alleviating Inconsistency in Pre-trained Models for Symmetric Classification Tasks
While fine-tuning pre-trained models for downstream classification is th...

10/01/2022 · Pre-trained Speech Representations as Feature Extractors for Speech Quality Assessment in Online Conferencing Applications
Speech quality in online conferencing applications is typically assessed...

06/03/2023 · SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts
Large language models (LLMs) have gained considerable attention for Arti...
