
-
Few-shot Object Grounding and Mapping for Natural Language Robot Instruction Following
We study the problem of learning a robot policy to follow natural langua...
-
Revisiting Few-sample BERT Fine-tuning
We study the problem of few-sample fine-tuning of BERT contextual repres...
-
What is Learned in Visually Grounded Neural Syntax Acquisition
Visual features are a promising signal for learning bootstrap textual mo...
-
Evaluating NLP Models via Contrast Sets
Standard test sets for supervised learning evaluate in-distribution gene...
-
Retouchdown: Adding Touchdown to StreetLearn as a Shareable Resource for Language Grounding Tasks in Street View
The Touchdown dataset (Chen et al., 2019) provides instructions by human...
-
Interactive Classification by Asking Informative Questions
Natural language systems often rely on a single, potentially ambiguous i...
-
Learning to Map Natural Language Instructions to Physical Quadcopter Control using Simulated Flight
We propose a joint simulation and real-world learning framework for mapp...
-
Executing Instructions in Situated Collaborative Interactions
We study a collaborative scenario where a user not only instructs a syst...
-
NLVR2 Visual Bias Analysis
NLVR2 (Suhr et al., 2019) was designed to be robust for language bias th...
-
BERTScore: Evaluating Text Generation with BERT
We propose BERTScore, an automatic evaluation metric for text generation...
-
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
We study the problem of jointly reasoning about language and vision thro...
-
Early Fusion for Goal Directed Robotic Vision
Increasingly, perceptual systems are being codified as strict pipelines ...
-
Mapping Navigation Instructions to Continuous Control Actions with Position-Visitation Prediction
We propose an approach for mapping natural language instructions and raw...
-
A Corpus for Reasoning About Natural Language Grounded in Photographs
We introduce a new dataset for joint reasoning about language and vision...
-
Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction
We propose to decompose instruction execution to goal prediction and act...
-
Following High-level Navigation Instructions on a Simulated Quadcopter with Imitation Learning
We introduce a method for following high-level navigation instructions b...
-
Situated Mapping of Sequential Instructions to Actions with Single-step Reward Observation
We propose a learning approach for mapping context-dependent sequential ...
-
Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies
We present NEWSROOM, a summarization dataset of 1.3 million articles and...
-
Learning to Map Context-Dependent Sentences to Executable Formal Queries
We propose a context-dependent model to map utterances within an interac...
-
CHALET: Cornell House Agent Learning Environment
We present CHALET, a 3D house simulator with support for navigation and ...
-
Visual Reasoning with Natural Language
Natural language provides a widely accessible and expressive interface f...
-
Training RNNs as Fast as CNNs
Common recurrent neural network architectures scale poorly due to the in...
-
Mapping Instructions and Visual Observations to Actions with Reinforcement Learning
We propose to directly map raw visual observations and text input to act...