Uncertainty Estimates for Efficient Neural Network-based Dialogue Policy Optimisation

11/30/2017
by Christopher Tegho, et al.

In statistical dialogue management, the dialogue manager learns a policy that maps a belief state to an action for the system to perform. Efficient exploration is key to successful policy optimisation. Current deep reinforcement learning methods are very promising but rely on epsilon-greedy exploration, which subjects the user to randomly chosen actions during learning. Alternative approaches such as Gaussian Process SARSA (GPSARSA) estimate uncertainties and are sample efficient, leading to a better user experience, but at the expense of greater computational complexity. This paper examines approaches to extracting uncertainty estimates from deep Q-networks (DQN) in the context of dialogue management. We perform an extensive benchmark of deep Bayesian methods for extracting uncertainty estimates, namely Bayes-By-Backprop, dropout, its concrete variation, bootstrapped ensemble and alpha-divergences, combining each with the DQN algorithm.
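To make the contrast in the abstract concrete, the following is a minimal sketch, not the paper's implementation, of epsilon-greedy action selection next to uncertainty-driven selection from a bootstrapped ensemble of Q-value heads. The function names (`epsilon_greedy`, `ensemble_thompson`, `ensemble_spread`) and the plain-list representation of Q-values are assumptions made for illustration; in practice the per-head Q-values would come from a trained DQN.

```python
import random


def greedy(q_values):
    """Index of the action with the highest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a uniformly random action, else act greedily.

    The random branch ignores how confident the Q-estimates are.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy(q_values)


def ensemble_thompson(head_q_values, rng):
    """Sample one ensemble head and act greedily on its Q-estimates.

    Hypothetical helper: head_q_values is a list of per-head Q-value lists.
    Where the heads disagree, different actions get tried; where they agree,
    the choice is effectively greedy.
    """
    return greedy(rng.choice(head_q_values))


def ensemble_spread(head_q_values):
    """Per-action variance across heads, a crude proxy for uncertainty."""
    n_heads = len(head_q_values)
    spread = []
    for a in range(len(head_q_values[0])):
        values = [head[a] for head in head_q_values]
        mean = sum(values) / n_heads
        spread.append(sum((v - mean) ** 2 for v in values) / n_heads)
    return spread
```

Under this sketch, exploration in the ensemble variant is concentrated on actions whose value estimates differ across heads, whereas epsilon-greedy explores uniformly regardless of what has already been learned.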

research
09/22/2020

Distributed Structured Actor-Critic Reinforcement Learning for Universal Dialogue Management

The task-oriented spoken dialogue system (SDS) aims to assist a human us...
research
09/09/2021

Uncertainty Measures in Neural Belief Tracking and the Effects on Dialogue Policy Performance

The ability to identify and resolve uncertainty is crucial for the robus...
research
11/15/2017

BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems

We present a new algorithm that significantly improves the efficiency of...
research
02/11/2018

Sample Efficient Deep Reinforcement Learning for Dialogue Systems with Large Action Spaces

In spoken dialogue systems, we aim to deploy artificial intelligence to ...
research
05/05/2023

Rescue Conversations from Dead-ends: Efficient Exploration for Task-oriented Dialogue Policy Optimization

Training a dialogue policy using deep reinforcement learning requires a ...
research
03/08/2018

Feudal Reinforcement Learning for Dialogue Management in Large Domains

Reinforcement learning (RL) is a promising approach to solve dialogue po...
research
07/24/2022

Anti-Overestimation Dialogue Policy Learning for Task-Completion Dialogue System

A dialogue policy module is an essential part of task-completion dialogu...