Submodularity-inspired Data Selection for Goal-oriented Chatbot Training based on Sentence Embeddings

02/02/2018
by Mladen Dimovski, et al.

Goal-oriented (GO) dialogue systems rely on an initial natural language understanding (NLU) module to determine the user's intention and its parameters, also known as slots. Since these systems, also known as bots, help users solve problems in relatively narrow domains, they require training data from within those domains. This leads to significant data availability issues that inhibit the development of successful bots. To alleviate this problem, we propose a data selection technique for the low-data regime that allows training with significantly fewer labeled sentences, and thus with lower labelling costs. We create a submodularity-inspired data ranking function, the ratio-penalty marginal gain, that selects the data points to label based solely on information extracted from the textual embedding space. We show that distances in the embedding space are a viable source of information for data selection. The method outperforms several known active learning techniques without using any label information, which allows for cost-efficient training of NLU units for goal-oriented bots. Moreover, the proposed selection technique does not require retraining the model between selection steps, which makes it time-efficient as well.
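To make the idea of label-free selection from embedding distances more concrete, the following is a minimal sketch of greedy selection under a generic facility-location objective, a standard submodular function. It is not the ratio-penalty marginal gain itself, whose definition the abstract does not give. The function names, the distance-to-similarity mapping, and the toy embeddings are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Map Euclidean distance to a similarity in (0, 1] (an assumed choice)."""
    return 1.0 / (1.0 + math.dist(a, b))


def greedy_select(embeddings: List[Sequence[float]], budget: int) -> List[int]:
    """Greedily pick up to `budget` indices that best cover the embedding set.

    Uses a facility-location score: each point is credited with its highest
    similarity to any selected point. No labels and no model retraining are
    involved; only pairwise distances in the embedding space are used.
    """
    n = len(embeddings)
    sim = [[similarity(embeddings[i], embeddings[j]) for j in range(n)]
           for i in range(n)]
    best_cover = [0.0] * n  # best similarity of each point to the selection
    selected: List[int] = []
    for _ in range(min(budget, n)):
        best_gain, best_idx = -1.0, -1
        for c in range(n):
            if c in selected:
                continue
            # Marginal gain: how much adding c improves coverage of all points.
            gain = sum(max(sim[i][c] - best_cover[i], 0.0) for i in range(n))
            if gain > best_gain:
                best_gain, best_idx = gain, c
        selected.append(best_idx)
        best_cover = [max(best_cover[i], sim[i][best_idx]) for i in range(n)]
    return selected


if __name__ == "__main__":
    # Hypothetical 2-D "sentence embeddings" forming two tight clusters.
    toy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
           (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    print(greedy_select(toy, budget=2))
```

With a labelling budget of two, the sketch picks one sentence from each cluster, since a second pick from an already covered cluster adds almost nothing to the score. This diminishing-returns behaviour is what makes submodular objectives a natural fit for choosing which sentences to label.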


