Joint Generation of Captions and Subtitles with Dual Decoding

05/13/2022
by Jitao Xu, et al.

As the amount of audio-visual content increases, developing automatic captioning and subtitling solutions that match the expectations of a growing international audience appears to be the only viable way to boost throughput and lower the related post-production costs. Automatic captioning and subtitling often need to be tightly intertwined so that their outputs remain consistent and synchronized with each other and with the video signal. In this work, we assess a dual decoding scheme that strongly couples these two tasks and show that it increases adequacy and consistency, with virtually no additional cost in terms of model size and training complexity.

research
07/13/2021

Between Flexibility and Consistency: Joint Generation of Captions and Subtitles

Speech translation (ST) has lately received growing interest for the gen...
research
07/07/2022

Dual-Stream Transformer for Generic Event Boundary Captioning

This paper describes our champion solution for the CVPR2022 Generic Even...
research
10/21/2019

Clotho: An Audio Captioning Dataset

Audio captioning is the novel task of general audio content description ...
research
03/08/2020

Better Captioning with Sequence-Level Exploration

Sequence-level learning objective has been widely used in captioning tas...
research
08/17/2021

End-to-End Dense Video Captioning with Parallel Decoding

Dense video captioning aims to generate multiple associated captions wit...
research
11/14/2022

Is my automatic audio captioning system so bad? spider-max: a metric to consider several caption candidates

Automatic Audio Captioning (AAC) is the task that aims to describe an au...
