Low-rank Adaptation Method for Wav2vec2-based Fake Audio Detection

06/09/2023
by Chenglong Wang, et al.

Self-supervised speech models are a rapidly developing research topic in fake audio detection. Many pre-trained models can serve as feature extractors, learning richer and higher-level speech representations. However, when fine-tuning pre-trained models, training times are often excessively long and memory consumption is high, and full fine-tuning is also very expensive. To alleviate this problem, we apply low-rank adaptation (LoRA) to the wav2vec2 model, freezing the pre-trained model weights and injecting trainable rank-decomposition matrices into each layer of the transformer architecture, which greatly reduces the number of trainable parameters for downstream tasks. Compared with fine-tuning the wav2vec2 model, which contains 317M trainable parameters, with Adam, LoRA achieves similar performance while reducing the number of trainable parameters by a factor of 198.
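The LoRA scheme described above replaces a frozen pre-trained weight W0 with W0 + (alpha/r)·BA, where A and B are small trainable matrices of rank r. The sketch below illustrates this idea for a single linear layer in PyTorch; it is a minimal illustration, and the module name, rank, and scaling values are assumptions for the example, not taken from the paper's implementation.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank update: y = W0 x + (alpha / r) * B A x."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze the pre-trained weights
        # A starts with small random values and B with zeros, so the adapted
        # layer initially behaves exactly like the frozen pre-trained layer.
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus low-rank residual path; only lora_A and lora_B receive gradients.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)


if __name__ == "__main__":
    # Hypothetical stand-in for one projection inside a wav2vec2 transformer layer.
    layer = LoRALinear(nn.Linear(1024, 1024), rank=8)
    out = layer(torch.randn(2, 50, 1024))    # (batch, frames, hidden)
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    print(out.shape, f"trainable params: {trainable} / {total}")
```

Because B is initialized to zero, training starts from the original pre-trained behavior, and only the low-rank matrices are updated for the downstream detection task.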
