Yeah, Right, Uh-Huh: A Deep Learning Backchannel Predictor

06/02/2017
by Robin Ruede, et al.

Supportive backchannel (BC) cues can make human-computer interaction more social. BCs give the speaker feedback from the listener, indicating that the listener is still paying attention. Depending on the modality of the interaction, BCs can be expressed in different ways, for example as gestures or acoustic cues. In this work, we consider only acoustic cues. We propose an approach to detecting BC opportunities based on acoustic input features such as power and pitch. While other work in the field relies on hand-written rule sets or specialized features, we use artificial neural networks, which can derive higher-order features from the input features themselves. In our setup, we first used a fully connected feed-forward network to establish an updated baseline in comparison to our previously proposed setup. We then extended this setup with Long Short-Term Memory (LSTM) networks, which have been shown to outperform feed-forward setups on various tasks. Our best system achieved an F1-score of 0.37 using power and pitch features. Adding linguistic information using word2vec increased the score to 0.39.
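For readers unfamiliar with the metric, the F1-score reported above is the harmonic mean of precision and recall. The following is a minimal sketch of that computation only; the function name `f1_score` and the counts in the example are hypothetical and chosen for illustration, and the abstract does not state how predicted BCs are matched against reference BCs.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the harmonic mean of precision and recall.

    Returns 0.0 when there are no true positives, since precision
    and recall are then zero or undefined.
    """
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not taken from the paper):
# 30 correctly predicted BCs, 50 spurious predictions, 50 missed BCs.
example = f1_score(true_positives=30, false_positives=50, false_negatives=50)
print(f"F1 = {example:.3f}")
```

With these illustrative counts, precision and recall are both 30/80, so the F1-score equals that same value.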


research
12/25/2014

Protein Secondary Structure Prediction with Long Short Term Memory Networks

Prediction of protein secondary structure from the amino acid sequence i...
research
12/29/2015

Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems

We propose a simplified model of attention which is applicable to feed-f...
research
03/15/2020

A model of figure ground organization incorporating local and global cues

Figure Ground Organization (FGO) – inferring spatial depth ordering of o...
research
01/29/2016

Lipreading with Long Short-Term Memory

Lipreading, i.e. speech recognition from visual-only recordings of a spe...
research
11/27/2018

Are 2D-LSTM really dead for offline text recognition?

There is a recent trend in handwritten text recognition with deep neural...
research
09/20/2021

Clustering in Recurrent Neural Networks for Micro-Segmentation using Spending Personality

Customer segmentation has long been a productive field in banking. Howev...
research
04/10/2023

Modeling Speaker-Listener Interaction for Backchannel Prediction

We present our latest findings on backchannel modeling novelly motivated...
