Better Text Understanding Through Image-To-Text Transfer

05/23/2017
by Karol Kurach, et al.

Generic text embeddings are used successfully in a variety of tasks. However, they are often learnt by capturing the co-occurrence structure of pure text corpora, which limits their ability to generalize. In this paper, we explore models that incorporate visual information into the text representation. Based on comprehensive ablation studies, we propose a conceptually simple yet well-performing architecture. It outperforms previous multimodal approaches on a set of well-established benchmarks. We also improve the state-of-the-art results for image-related text datasets, using orders of magnitude less data.
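The idea of incorporating visual information into a text representation is often realized as a two-tower model: text and image features are projected into a shared space where related pairs score highly. The sketch below is a minimal, hypothetical illustration of that setup (it is not the paper's actual architecture); the dimensions, the random projection matrices standing in for learned weights, and the function names are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: a word2vec-style text embedding, a CNN image
# feature vector, and a smaller shared multimodal space.
TEXT_DIM, IMAGE_DIM, SHARED_DIM = 300, 2048, 128

# Randomly initialised projections stand in for weights that would be
# learned, e.g. with a ranking loss over matching text-image pairs.
W_text = rng.standard_normal((TEXT_DIM, SHARED_DIM)) / np.sqrt(TEXT_DIM)
W_image = rng.standard_normal((IMAGE_DIM, SHARED_DIM)) / np.sqrt(IMAGE_DIM)

def embed(vec, W):
    """Project a feature vector into the shared space and L2-normalise."""
    z = vec @ W
    return z / np.linalg.norm(z)

def visual_similarity(text_vec, image_vec):
    """Cosine similarity between projected text and image embeddings."""
    return float(embed(text_vec, W_text) @ embed(image_vec, W_image))

text_vec = rng.standard_normal(TEXT_DIM)    # stand-in text embedding
image_vec = rng.standard_normal(IMAGE_DIM)  # stand-in image feature vector

score = visual_similarity(text_vec, image_vec)  # value in [-1, 1]
```

In a trained model of this shape, the text tower's output can then serve as a visually grounded text embedding on its own, even when no image is available at inference time.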

Related research:

08/20/2018 - Learning to Learn from Web Data through Deep Semantic Embeddings
In this paper we propose to learn a multimodal image and text embedding ...

01/07/2019 - Self-Supervised Learning from Web Data for Multimodal Retrieval
Self-Supervised learning from multimodal image and text data allows deep...

10/15/2018 - Deep Transfer Reinforcement Learning for Text Summarization
Deep neural networks are data hungry models and thus they face difficult...

08/20/2021 - Localize, Group, and Select: Boosting Text-VQA by Scene Text Modeling
As an important task in multimodal context understanding, Text-VQA (Visu...

08/15/2023 - MultiSChuBERT: Effective Multimodal Fusion for Scholarly Document Quality Prediction
Automatic assessment of the quality of scholarly documents is a difficul...

11/13/2019 - Learning Relationships between Text, Audio, and Video via Deep Canonical Correlation for Multimodal Language Analysis
Multimodal language analysis often considers relationships between featu...

02/22/2021 - Probing Multimodal Embeddings for Linguistic Properties: the Visual-Semantic Case
Semantic embeddings have advanced the state of the art for countless nat...
