Adapted Multimodal BERT with Layer-wise Fusion for Sentiment Analysis

12/01/2022
by Odysseas S. Chlapanis, et al.

Multimodal learning pipelines have benefited from the success of pretrained language models. However, this comes at the cost of increased model parameters. In this work, we propose Adapted Multimodal BERT (AMB), a BERT-based architecture for multimodal tasks that uses a combination of adapter modules and intermediate fusion layers. The adapter adjusts the pretrained language model for the task at hand, while the fusion layers perform task-specific, layer-wise fusion of audio-visual information with textual BERT representations. During the adaptation process, the pretrained language model parameters remain frozen, allowing for fast, parameter-efficient training. In our ablations we see that this approach leads to efficient models that can outperform their fine-tuned counterparts and are robust to input noise. Our experiments on sentiment analysis with CMU-MOSEI show that AMB outperforms the current state-of-the-art across metrics, with a 3.4% relative reduction in the resulting error and a 2.1% relative improvement in accuracy.
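To make the described architecture concrete, below is a minimal PyTorch sketch of the adapter-plus-fusion design: a frozen BertModel whose encoder layers are each wrapped with a trainable bottleneck adapter and a fusion block that injects pooled audio-visual features into the token representations. The class names (Adapter, FusionLayer, AdaptedFusedLayer, AMB), the bottleneck width, the audio-visual feature dimensionality (av_size=128), and the sigmoid-gated additive fusion are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
from transformers import BertModel


class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual connection."""
    def __init__(self, hidden_size, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.ReLU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))


class FusionLayer(nn.Module):
    """Gated additive fusion of pooled audio-visual features into every token representation."""
    def __init__(self, hidden_size, av_size):
        super().__init__()
        self.proj = nn.Linear(av_size, hidden_size)
        self.gate = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, text_states, av_features):
        # text_states: (batch, seq_len, hidden); av_features: (batch, av_size)
        av = self.proj(av_features).unsqueeze(1).expand_as(text_states)
        gate = torch.sigmoid(self.gate(torch.cat([text_states, av], dim=-1)))
        return text_states + gate * av


class AdaptedFusedLayer(nn.Module):
    """Wraps one frozen BERT encoder layer with a trainable adapter and fusion block."""
    def __init__(self, bert_layer, hidden_size, av_size, bottleneck):
        super().__init__()
        self.bert_layer = bert_layer
        self.adapter = Adapter(hidden_size, bottleneck)
        self.fusion = FusionLayer(hidden_size, av_size)
        self.av_features = None  # set by the parent model before each forward pass

    def forward(self, hidden_states, *args, **kwargs):
        outputs = self.bert_layer(hidden_states, *args, **kwargs)
        fused = self.fusion(self.adapter(outputs[0]), self.av_features)
        return (fused,) + outputs[1:]


class AMB(nn.Module):
    """Frozen BERT backbone; only adapters, fusion blocks, and the head are trained."""
    def __init__(self, av_size=128, bottleneck=64):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        for p in self.bert.parameters():
            p.requires_grad = False  # pretrained language model stays frozen
        hidden = self.bert.config.hidden_size
        self.bert.encoder.layer = nn.ModuleList(
            AdaptedFusedLayer(layer, hidden, av_size, bottleneck)
            for layer in self.bert.encoder.layer
        )
        self.head = nn.Linear(hidden, 1)  # sentiment regression head

    def forward(self, input_ids, attention_mask, av_features):
        for layer in self.bert.encoder.layer:
            layer.av_features = av_features  # expose audio-visual features to each fusion block
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        return self.head(out.last_hidden_state[:, 0])  # predict from the [CLS] token
```

Because only the adapters, fusion blocks, and prediction head receive gradients in this sketch, the trainable parameter count stays small relative to full fine-tuning, which reflects the parameter-efficiency argument made in the abstract.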
