TCT: A Cross-supervised Learning Method for Multimodal Sequence Representation

10/23/2019
by Wubo Li, et al.

Multimodal inputs yield better performance than unimodal inputs in most tasks. However, efficiently learning the semantics of representations from multiple modalities is extremely challenging. To tackle this, we propose the Transformer-based Cross-modal Translator (TCT), which learns unimodal sequence representations by translating from other related multimodal sequences in a supervised manner. Combining TCT with the Multimodal Transformer Network (MTN), we evaluate MTN-TCT on video-grounded dialogue, a task that uses multiple modalities. The proposed method achieves new state-of-the-art performance on video-grounded dialogue, which indicates that the representations learned by TCT carry more semantic information than those obtained by using the unimodal input directly.

research
07/02/2019

Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems

Developing Video-Grounded Dialogue Systems (VGDS), where a dialogue is c...
research
10/21/2020

TMT: A Transformer-based Modal Translator for Improving Multimodal Sequence Representations in Audio Visual Scene-aware Dialog

Audio Visual Scene-aware Dialog (AVSD) is a task to generate responses w...
research
07/31/2023

Latent Masking for Multimodal Self-supervised Learning in Health Timeseries

Limited availability of labeled data for machine learning on biomedical ...
research
06/16/2022

Multimodal Dialogue State Tracking

Designed for tracking user goals in dialogues, a dialogue state tracker ...
research
10/15/2021

StreaMulT: Streaming Multimodal Transformer for Heterogeneous and Arbitrary Long Sequential Data

This paper tackles the problem of processing and combining efficiently a...
research
10/31/2022

Multimodal Information Bottleneck: Learning Minimal Sufficient Unimodal and Multimodal Representations

Learning effective joint embedding for cross-modal data has always been ...
research
03/15/2017

End-to-end optimization of goal-driven and visually grounded dialogue systems

End-to-end design of dialogue systems has recently become a popular rese...
