A Study of Different Ways to Use The Conformer Model For Spoken Language Understanding

04/08/2022
by Nick J. C. Wang, et al.

SLU combines ASR and NLU capabilities to accomplish speech-to-intent understanding. In this paper, we compare different ways to combine ASR and NLU, in particular using a single Conformer model with different ways to use its components, to better understand the strengths and weaknesses of each approach. We find that it is not necessarily a choice between two-stage decoding and end-to-end systems which determines the best system for research or application. System optimization still entails carefully improving the performance of each component. It is difficult to prove that one direction is conclusively better than the other. In this paper, we also propose a novel connectionist temporal summarization (CTS) method to reduce the length of acoustic encoding sequences while improving the accuracy and processing speed of end-to-end models. This method achieves the same intent accuracy as the best two-stage SLU recognition with complicated and time-consuming decoding but does so at lower computational cost. This stacked end-to-end SLU system yields an intent accuracy of 93.97 close-field set, and 99.71
