
PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World

by Rowan Zellers, et al.

We propose PIGLeT: a model that learns physical commonsense knowledge through interaction, and then uses this knowledge to ground language. We factorize PIGLeT into a physical dynamics model, and a separate language model. Our dynamics model learns not just what objects are but also what they do: glass cups break when thrown, plastic ones don't. We then use it as the interface to our language model, giving us a unified model of linguistic form and grounded meaning. PIGLeT can read a sentence, simulate neurally what might happen next, and then communicate that result through a literal symbolic representation, or natural language. Experimental results show that our model effectively learns world dynamics, along with how to communicate them. It is able to correctly forecast "what happens next" given an English sentence over 80% of the time, outperforming a 100x larger, text-to-text approach by over 10%. Likewise, its natural language summaries of physical interactions are also judged by humans as more accurate than LM alternatives. We present comprehensive analysis showing room for future work.




Understanding Learning Dynamics Of Language Models with SVCCA

Recent work has demonstrated that neural language models encode linguist...

Explainable Semantic Space by Grounding Language to Vision with Cross-Modal Contrastive Learning

In natural language processing, most models try to learn semantic repres...

SILG: The Multi-environment Symbolic Interactive Language Grounding Benchmark

Existing work in language grounding typically study single environments....

Mind's Eye: Grounded Language Model Reasoning through Simulation

Successful and effective communication between humans and AI relies on a...

Probing Text Models for Common Ground with Visual Representations

Vision, as a central component of human perception, plays a fundamental ...

Indirectly Supervised English Sentence Break Prediction Using Paragraph Break Probability Estimates

This report explores the use of paragraph break probability estimates to...