Unsupervised Spoken Utterance Classification

07/02/2021
by Shahab Jalalvand, et al.

An intelligent virtual assistant (IVA) enables effortless conversations in call routing through spoken utterance classification (SUC), a special form of spoken language understanding (SLU). Building a SUC system requires a large amount of supervised in-domain data, which is not always available. In this paper, we introduce an unsupervised spoken utterance classification approach (USUC) that requires no in-domain data except the intent labels and a few paraphrases per intent. USUC consists of a KNN classifier (K=1) and a complex embedding model trained on a large unlabeled customer service corpus. Among all embedding models, we demonstrate that ELMo works best for USUC. However, an ELMo model is too slow to be used at run-time for call routing. To resolve this issue, we first compute the uni- and bi-gram embedding vectors offline and build a lookup table of n-grams and their corresponding embedding vectors. We then use this table to compute sentence embedding vectors at run-time, along with back-off techniques for unseen n-grams. Experiments show that USUC outperforms the traditional utterance classification methods by reducing the classification error rate from 32.9% to 27.0%, and that the lookup and back-off technique increases the processing speed from 16 utterances per second to 118 utterances per second.
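The pipeline described in the abstract (an offline n-gram lookup table, run-time sentence embeddings with back-off, and a K=1 nearest-neighbour classifier over paraphrases) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy vectors in `TABLE` stand in for embeddings that would be precomputed offline, the back-off order (bi-gram, then uni-gram, then skip the token), the averaging of n-gram vectors, and the use of cosine similarity are not specified in the abstract, and the names `embed`, `cosine`, `classify`, and `PARAPHRASES` are hypothetical.

```python
import math

# Toy lookup table: n-gram (tuple of tokens) -> precomputed embedding vector.
# In the described system these vectors would come from an offline pass of the
# embedding model; the values below are made up purely for illustration.
TABLE = {
    ("pay",): [1.0, 0.0, 0.0],
    ("bill",): [0.9, 0.1, 0.0],
    ("pay", "bill"): [1.0, 0.1, 0.0],
    ("cancel",): [0.0, 1.0, 0.0],
    ("service",): [0.0, 0.8, 0.2],
    ("cancel", "service"): [0.0, 1.0, 0.1],
    ("my",): [0.1, 0.1, 0.1],
}
DIM = 3


def embed(utterance, table=TABLE, dim=DIM):
    """Average n-gram vectors, backing off bi-gram -> uni-gram -> skip."""
    tokens = utterance.lower().split()
    total, count, i = [0.0] * dim, 0, 0
    while i < len(tokens):
        bigram = tuple(tokens[i:i + 2])
        unigram = (tokens[i],)
        if len(bigram) == 2 and bigram in table:
            vec, step = table[bigram], 2
        elif unigram in table:
            vec, step = table[unigram], 1
        else:
            vec, step = None, 1  # unseen token: skipped (assumed back-off)
        if vec is not None:
            total = [t + v for t, v in zip(total, vec)]
            count += 1
        i += step
    # With no known n-grams the zero vector is returned unchanged.
    return [t / count for t in total] if count else total


def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb) if na and nb else 0.0


def classify(utterance, paraphrases, table=TABLE):
    """1-nearest-neighbour: return the intent of the closest paraphrase."""
    query = embed(utterance, table)
    best_intent, best_score = None, -1.0
    for intent, examples in paraphrases.items():
        for example in examples:
            score = cosine(query, embed(example, table))
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent


# Hypothetical intents, each with a few paraphrases (the only in-domain input).
PARAPHRASES = {
    "billing": ["pay my bill", "pay bill"],
    "cancellation": ["cancel my service", "cancel service"],
}

if __name__ == "__main__":
    print(classify("i want to pay my bill please", PARAPHRASES))
    print(classify("cancel service now", PARAPHRASES))
```

Because the run-time path only does dictionary lookups and vector averaging, it avoids running the embedding model per utterance, which is the motivation the abstract gives for the lookup table.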


09/27/2018

Automatic Data Expansion for Customer-care Spoken Language Understanding

Spoken language understanding (SLU) systems are widely used in handling ...
12/22/2021

Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding

Current researches on spoken language understanding (SLU) heavily are li...
10/23/2019

Incremental Online Spoken Language Understanding

Spoken Language Understanding (SLU) typically comprises of an automatic ...
08/07/2018

Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection

While Word2Vec represents words (in text) as vectors carrying semantic i...
05/14/2019

Strong and Simple Baselines for Multimodal Utterance Embeddings

Human language is a rich multimodal signal consisting of spoken words, f...
09/17/2020

Utterance-level Intent Recognition from Keywords

This paper focuses on wake on intent (WOI) techniques for platforms with...
04/27/2021

Semi-supervised Interactive Intent Labeling

Building the Natural Language Understanding (NLU) modules of task-orient...