PROST: Physical Reasoning of Objects through Space and Time

06/07/2021
by Stephane Aroca-Ouellette, et al.

We present a new probing dataset named PROST: Physical Reasoning about Objects Through Space and Time. This dataset contains 18,736 multiple-choice questions made from 14 manually curated templates, covering 10 physical reasoning concepts. All questions are designed to probe both causal and masked language models in a zero-shot setting. We conduct an extensive analysis which demonstrates that state-of-the-art pretrained models are inadequate at physical reasoning: they are influenced by the order in which answer options are presented to them, they struggle when the superlative in a question is inverted (e.g., most <-> least), and increasing the amount of pretraining data and parameters only yields minimal improvements. These results provide support for the hypothesis that current pretrained models' ability to reason about physical interactions is inherently limited by a lack of real world experience. By highlighting these limitations, we hope to motivate the development of models with a human-like understanding of the physical world.
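The two sensitivities named in the abstract, answer-option order and superlative inversion, can be illustrated with a small sketch. The template text, the object list, and the `expand` helper below are hypothetical and are not taken from the PROST dataset; the snippet only shows how a single template could be expanded into variants that differ in option order and in superlative.

```python
from itertools import permutations

# Hypothetical template and objects, for illustration only (not from PROST).
TEMPLATE = (
    "A person drops a {a}, a {b}, a {c}, and a {d} from a balcony. "
    "The [MASK] is the {superlative} likely to break."
)
OBJECTS = ["egg", "pillow", "glass", "ball"]


def expand(template, objects, superlatives=("most", "least")):
    """Yield one question per (option ordering, superlative) pair."""
    for order in permutations(objects):
        a, b, c, d = order
        for sup in superlatives:
            yield {
                "question": template.format(a=a, b=b, c=c, d=d, superlative=sup),
                "options": list(order),
                "superlative": sup,
            }


questions = list(expand(TEMPLATE, OBJECTS))
print(len(questions))            # 4! orderings x 2 superlatives
print(questions[0]["question"])
```

A model that reasons robustly should give consistent answers across all orderings of the same options, and should flip its answer when the superlative is inverted; comparing predictions across these variants is one way to probe for the failure modes the abstract describes.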


