Temporal Knowledge Distillation for On-device Audio Classification

10/27/2021
by Kwanghee Choi, et al.

Improving the performance of on-device audio classification models remains a challenge, given the computational limits of the mobile environment. Many studies leverage knowledge distillation to boost predictive performance by transferring knowledge from large models to on-device models. However, most distillation methods either discard the temporal information that is crucial to audio classification tasks or require the teacher and student architectures to be similar. In this paper, we propose a new knowledge distillation method designed to transfer the temporal knowledge embedded in the attention weights of large models to on-device models. Our distillation method is applicable to various architectures, including non-attention-based ones such as CNNs and RNNs, without any architectural change at inference time. Through extensive experiments on both an audio event detection dataset and a noisy keyword spotting dataset, we show that our proposed method improves predictive performance across diverse on-device architectures.
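
The abstract does not include code; below is a minimal PyTorch sketch of the kind of attention-based temporal distillation it describes, assuming the teacher exposes a per-example (T x T) self-attention map over time frames and the student produces per-frame features. The function and tensor names (temporal_distillation_loss, student_frames, teacher_attn) and the choice of KL divergence as the matching objective are illustrative assumptions, not the authors' exact formulation.

```python
# Sketch of temporal knowledge distillation via attention matching.
# Assumption (not from the paper): the student builds its own frame
# affinity map from pairwise feature similarities, and that map is
# trained to match the teacher's attention. This auxiliary computation
# exists only at training time, so the deployed student (e.g., a CNN
# or RNN) is architecturally unchanged at inference.
import torch
import torch.nn.functional as F

def temporal_distillation_loss(student_frames: torch.Tensor,
                               teacher_attn: torch.Tensor) -> torch.Tensor:
    """student_frames: (B, T, D) per-frame student features.
    teacher_attn: (B, T, T) teacher attention; each row sums to 1."""
    # Scaled dot-product similarity between every pair of frames.
    logits = torch.bmm(student_frames, student_frames.transpose(1, 2))
    logits = logits / student_frames.size(-1) ** 0.5
    student_log_attn = F.log_softmax(logits, dim=-1)  # (B, T, T)
    # KL divergence pulls the student's per-frame distributions toward
    # the teacher's attention distributions.
    return F.kl_div(student_log_attn, teacher_attn, reduction="batchmean")
```

In training, this term would be added to the usual classification loss with a weighting coefficient, e.g. loss = ce_loss + lam * temporal_distillation_loss(frames, attn); at inference, only the student's classification path runs.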

Related research

07/06/2023
On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation
Large self-supervised models are effective feature extractors, but their...

02/22/2020
Multi-Representation Knowledge Distillation For Audio Classification
As an important component of multimedia analysis tasks, audio classifica...

10/29/2022
Application of Knowledge Distillation to Multi-task Speech Representation Learning
Model architectures such as wav2vec 2.0 and HuBERT have been proposed to...

11/11/2020
Efficient Knowledge Distillation for RNN-Transducer Models
Knowledge Distillation is an effective method of transferring knowledge ...

03/30/2022
Device-Directed Speech Detection: Regularization via Distillation for Weakly-Supervised Models
We address the problem of detecting speech directed to a device that doe...

06/28/2022
QTI Submission to DCASE 2021: Residual Normalization for Device-Imbalanced Acoustic Scene Classification with Efficient Design
This technical report describes the details of our TASK1A submission of ...

05/26/2023
ABC-KD: Attention-Based-Compression Knowledge Distillation for Deep Learning-Based Noise Suppression
Noise suppression (NS) models have been widely applied to enhance speech...
