Learning to Respond with Stickers: A Framework of Unifying Multi-Modality in Multi-Turn Dialog

03/10/2020
by Shen Gao, et al.

Stickers with vivid and engaging expressions are becoming increasingly popular in online messaging apps, and some works are dedicated to automatically selecting a sticker response by matching the text labels of stickers with previous utterances. However, because stickers exist in such large quantities, it is impractical to require text labels for all of them. Hence, in this paper, we propose to recommend an appropriate sticker to the user based on the multi-turn dialog context history, without any external labels. This task poses two main challenges. One is to learn the semantic meaning of stickers without corresponding text labels. The other is to jointly model the candidate sticker with the multi-turn dialog context. To tackle these challenges, we propose a sticker response selector (SRS) model. Specifically, SRS first employs a convolution-based sticker image encoder and a self-attention-based multi-turn dialog encoder to obtain representations of the stickers and utterances. Next, a deep interaction network conducts deep matching between the sticker and each utterance in the dialog history. SRS then uses a fusion network to learn the short-term and long-term dependencies among all interaction results and output the final matching score. To evaluate our proposed method, we collect a large-scale real-world dialog dataset with stickers from one of the most popular online chatting platforms. Extensive experiments conducted on this dataset show that our model achieves state-of-the-art performance on all commonly used metrics. Experiments also verify the effectiveness of each component of SRS. To facilitate further research in the sticker selection field, we release this dataset of 340K multi-turn dialog and sticker pairs.
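The matching pipeline described in the abstract (encode, interact per utterance, fuse into one score) can be illustrated with a toy sketch. This is not the SRS model: the learned convolutional and self-attention encoders are replaced by precomputed vectors, the deep interaction network by cosine similarity, and the fusion network by a hand-written mix of a recency-weighted average (short-term) and a plain average (long-term). All function names and the `decay` parameter are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def interaction_scores(sticker: Vector, utterances: List[Vector]) -> List[float]:
    """Stand-in for the interaction step: one score per utterance, oldest first."""
    return [cosine(sticker, u) for u in utterances]


def fuse(scores: List[float], decay: float = 0.5) -> float:
    """Stand-in for the fusion step (assumed form, not the paper's network).

    long_term  = plain mean over all turns
    short_term = mean weighted toward the most recent turns
    The sum of both is squashed to (0, 1) with a sigmoid.
    """
    if not scores:
        raise ValueError("at least one utterance is required")
    n = len(scores)
    long_term = sum(scores) / n
    weights = [decay ** (n - 1 - i) for i in range(n)]  # last turn has weight 1
    short_term = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    return 1.0 / (1.0 + math.exp(-(long_term + short_term)))


def rank_stickers(candidates: Dict[str, Vector], utterances: List[Vector]) -> List[str]:
    """Return candidate sticker names sorted by descending matching score."""
    scored = {
        name: fuse(interaction_scores(vec, utterances))
        for name, vec in candidates.items()
    }
    return sorted(scored, key=scored.get, reverse=True)
```

In this sketch, a candidate that matches the most recent utterance ranks above one that matches only an older utterance, because the short-term term weights later turns more heavily. In the actual model, both the interaction and the fusion are learned rather than fixed.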
