Tell Your Story: Task-Oriented Dialogs for Interactive Content Creation

11/08/2022
by Satwik Kottur, et al.

People capture photos and videos to relive and share memories of personal significance. Recently, media montages (stories) have become a popular mode of sharing these memories due to their intuitive and powerful storytelling capabilities. However, creating such montages usually involves many manual searches, clicks, and selections that are time-consuming and cumbersome, which degrades the user experience. To alleviate this, we propose task-oriented dialogs for montage creation as a novel interactive tool to seamlessly search, compile, and edit montages from a media collection. To the best of our knowledge, our work is the first to leverage multi-turn conversations for such a challenging application, extending previous literature that studied simple media retrieval tasks. We collect a new dataset, C3 (Conversational Content Creation), comprising 10k dialogs conditioned on media montages simulated from a large media collection. We take a simulate-and-paraphrase approach to collecting these dialogs, which is both cost- and time-efficient while still drawing from a natural language distribution. Our analysis and benchmarking of state-of-the-art language models showcase the multimodal challenges present in the dataset. Lastly, we present a mobile demo application that shows the feasibility of the proposed approach in real-world use. Our code and data will be made publicly available.


