Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification

08/05/2021
by Yidi Jiang, et al.

End-to-end intent classification from speech has numerous advantages over the conventional pipeline approach, which uses automatic speech recognition (ASR) followed by natural language processing modules: it predicts intent directly from speech without an intermediate ASR stage. However, such an end-to-end framework suffers from the scarcity of large speech corpora with high acoustic variation for spoken language understanding. In this work, we exploit a transformer distillation method specifically designed for knowledge distillation from a transformer-based language model to a transformer-based speech model. In this regard, we leverage the reliable and widely used bidirectional encoder representations from transformers (BERT) model as the language model and transfer its knowledge to build an acoustic model for intent classification from speech. In particular, a multi-level transformer-based teacher-student model is designed, and knowledge distillation is performed across the attention and hidden sub-layers of different transformer layers of the student and teacher models. We achieve intent classification accuracies of 99.10% and 88.79% on the Fluent speech corpus and the ATIS database, respectively. Furthermore, the proposed method demonstrates better performance and robustness under acoustically degraded conditions compared to the baseline method.
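The multi-level distillation described in the abstract follows the general recipe of TinyBERT-style transformer distillation: for each mapped pair of student and teacher layers, the student's attention matrices and hidden states are pulled toward the teacher's with an MSE loss. Below is a minimal sketch of that loss in PyTorch. The layer mapping, tensor shapes, and the `proj` width-matching layer are illustrative assumptions, not the authors' exact configuration; the sketch also assumes equal student and teacher sequence lengths, whereas in practice the speech and text sequences differ in length and must be aligned.

```python
# Minimal sketch of TinyBERT-style transformer distillation, adapted here
# as a stand-in for the paper's multi-level teacher-student objective.
# All names, shapes, and the layer mapping below are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

def transformer_distillation_loss(student_attn, student_hidden,
                                  teacher_attn, teacher_hidden,
                                  layer_map, proj):
    """Sum of MSE losses over attention matrices and hidden states
    for each mapped (student_layer, teacher_layer) pair."""
    loss = torch.tensor(0.0)
    for s, t in layer_map:
        # Attention sub-layer distillation: match the student's
        # attention score matrices to the teacher's.
        loss = loss + F.mse_loss(student_attn[s], teacher_attn[t])
        # Hidden sub-layer distillation: project the student's hidden
        # states to the teacher's width before comparing.
        loss = loss + F.mse_loss(proj(student_hidden[s]),
                                 teacher_hidden[t])
    return loss

# Toy shapes: a 4-layer student distilled from a 12-layer teacher.
B, H, T, d_s, d_t = 2, 12, 50, 256, 768
student_attn   = [torch.rand(B, H, T, T) for _ in range(4)]
teacher_attn   = [torch.rand(B, H, T, T) for _ in range(12)]
student_hidden = [torch.rand(B, T, d_s) for _ in range(4)]
teacher_hidden = [torch.rand(B, T, d_t) for _ in range(12)]
proj = nn.Linear(d_s, d_t)                      # width-matching projection
layer_map = [(0, 2), (1, 5), (2, 8), (3, 11)]   # uniform layer mapping
print(transformer_distillation_loss(student_attn, student_hidden,
                                    teacher_attn, teacher_hidden,
                                    layer_map, proj))
```

In a full training loop, this distillation term would be combined with the standard cross-entropy loss on the intent labels, with the mapping choosing which teacher layers supervise each student layer.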

Related research

02/15/2021
Leveraging Acoustic and Linguistic Embeddings from Pretrained Speech and Language Models for Intent Classification
Intent classification is a task in spoken language understanding. An int...

09/28/2021
Exploring Teacher-Student Learning Approach for Multi-lingual Speech-to-Intent Classification
End-to-end speech-to-intent classification has shown its advantage in ha...

11/28/2022
Inter-KD: Intermediate Knowledge Distillation for CTC-Based Automatic Speech Recognition
Recently, the advance in deep learning has brought a considerable improv...

05/14/2023
Improving End-to-End SLU Performance with Prosodic Attention and Distillation
Most End-to-End SLU methods depend on the pretrained ASR or language mod...

04/22/2023
Lightweight Toxicity Detection in Spoken Language: A Transformer-based Approach for Edge Devices
Toxicity is a prevalent social behavior that involves the use of hate sp...

09/17/2023
A Few-Shot Approach to Dysarthric Speech Intelligibility Level Classification Using Transformers
Dysarthria is a speech disorder that hinders communication due to diffic...

08/09/2020
Distilling the Knowledge of BERT for Sequence-to-Sequence ASR
Attention-based sequence-to-sequence (seq2seq) models have achieved prom...
