Key to tasks that require reasoning about natural language in visual con...
Differentiable Search Index is a recently proposed paradigm for document...
We study continually improving an extractive question answering (QA) sys...
A long tradition of studies in psycholinguistics has examined the format...
CB2 is a multi-agent platform to study collaborative natural language in...
We study the problem of continually training an instruction-following ag...
We introduce KiloGram, a resource for studying abstract visual reasoning...
We present lilGym, a new benchmark for language-conditioned reinforcemen...
We introduce Wav2Seq, the first self-supervised approach to pre-train bo...
We study learning from user feedback for extractive question answering b...
Progress in speech processing has been facilitated by shared datasets an...
We introduce Classification with Alternating Normalization (CAN), a non-...
This paper is a study of performance-efficiency trade-offs in pre-traine...
We analyze language change over time in a collaborative, goal-oriented i...
We present a task and benchmark dataset for person-centric visual ground...
We study continual learning for natural language instruction generation,...
Natural language provides an accessible and expressive interface to spec...
We study the problem of learning a robot policy to follow natural langua...
We study the problem of few-sample fine-tuning of BERT contextual repres...
Visual features are a promising signal for learning bootstrap textual mo...
Standard test sets for supervised learning evaluate in-distribution gene...
The Touchdown dataset (Chen et al., 2019) provides instructions by human...
Natural language systems often rely on a single, potentially ambiguous i...
We propose a joint simulation and real-world learning framework for mapp...
We study a collaborative scenario where a user not only instructs a syst...
NLVR2 (Suhr et al., 2019) was designed to be robust for language bias th...
We propose BERTScore, an automatic evaluation metric for text generation...
We study the problem of jointly reasoning about language and vision thro...
Increasingly, perceptual systems are being codified as strict pipelines ...
We propose an approach for mapping natural language instructions and raw...
We introduce a new dataset for joint reasoning about language and vision...
We propose to decompose instruction execution to goal prediction and act...
We introduce a method for following high-level navigation instructions b...
We propose a learning approach for mapping context-dependent sequential ...
We present NEWSROOM, a summarization dataset of 1.3 million articles and...
We propose a context-dependent model to map utterances within an interac...
We present CHALET, a 3D house simulator with support for navigation and ...
Natural language provides a widely accessible and expressive interface f...
Common recurrent neural network architectures scale poorly due to the in...
We propose to directly map raw visual observations and text input to act...