Self-Supervised Multimodal Opinion Summarization

05/27/2021
by Jinbae Im, et al.

Recently, opinion summarization, which is the generation of a summary from multiple reviews, has been conducted in a self-supervised manner by treating a sampled review as a pseudo summary. However, non-text data related to reviews, such as images and metadata, have received less attention. To use the abundant information contained in non-text data, we propose a self-supervised multimodal opinion summarization framework called MultimodalSum. Our framework obtains a representation of each modality using a separate encoder for that modality, and a text decoder generates the summary. To resolve the inherent heterogeneity of multimodal data, we propose a multimodal training pipeline. We first pretrain the text encoder–decoder solely on text modality data. Subsequently, we pretrain the non-text modality encoders by using the pretrained text decoder as a pivot for the homogeneous representation of multimodal data. Finally, to fuse the multimodal representations, we train the entire framework in an end-to-end manner. We demonstrate the superiority of MultimodalSum by conducting experiments on the Yelp and Amazon datasets.
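The self-supervised setup mentioned in the abstract, where one sampled review stands in for the missing reference summary, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `make_pseudo_pair`, the uniform random choice of the held-out review, and the example reviews are all assumptions made for illustration, and the abstract does not state how the review is actually selected.

```python
import random
from typing import List, Tuple


def make_pseudo_pair(reviews: List[str], rng: random.Random) -> Tuple[List[str], str]:
    """Hold out one review as the pseudo summary; the rest form the input set.

    Hypothetical helper: uniform sampling is an assumption, not the paper's
    stated selection rule.
    """
    if len(reviews) < 2:
        raise ValueError("need at least two reviews to build a training pair")
    idx = rng.randrange(len(reviews))
    target = reviews[idx]                        # pseudo summary (training target)
    sources = reviews[:idx] + reviews[idx + 1:]  # remaining reviews (model input)
    return sources, target


# Made-up reviews for a single business, used only to exercise the helper.
reviews = [
    "Great pasta and friendly staff.",
    "The pasta was good but the wait was long.",
    "Cozy place, loved the tiramisu.",
    "Service was slow on a busy night.",
]
sources, target = make_pseudo_pair(reviews, random.Random(0))
```

Each call yields one (input reviews, target review) pair, so a summarizer can be trained with ordinary sequence-to-sequence supervision even though no human-written summaries are available.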


