
-
Transformer is All You Need: Multimodal Multitask Learning with a Unified Transformer
We propose UniT, a Unified Transformer model to simultaneously learn the...
-
Open4Business(O4B): An Open Access Dataset for Summarizing Business Documents
A major challenge in fine-tuning deep learning models for automatic summ...
-
Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation
We introduce a learning-based approach for room navigation using semanti...
-
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
This work proposes a new challenge set for multimodal classification, fo...
-
Are we pretraining it right? Digging deeper into visio-linguistic pretraining
Numerous recent works have proposed pretraining generic visio-linguistic...
-
TextCaps: a Dataset for Image Captioning with Reading Comprehension
Image descriptions can help visually impaired people to quickly understa...
-
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Many visual scenes contain text that carries crucial information, and it...
-
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
In the last year, new models and methods for pretraining and transfer le...
-
Towards VQA Models that can Read
Studies have shown that a dominant class of questions asked by visually ...
-
Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future
In model-based reinforcement learning, the agent interleaves between mod...
-
Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks
Learning when to communicate and doing that effectively is essential in ...
-
CanvasGAN: A simple baseline for text to image generation by incrementally patching a canvas
We propose a new recurrent generative model for generating images from t...
-
Neural Network Acceptability Judgments
In this work, we explore the ability of artificial neural networks to ju...
-
Building a Structured Query Engine
Finding patterns in data and being able to retrieve information from tho...