Phy-Q: A Benchmark for Physical Reasoning

08/31/2021
by Cheng Xue, et al.

Humans readily reason about the behavior of physical objects when choosing actions to accomplish tasks, but this remains a major challenge for AI. To facilitate research on this problem, we propose a new benchmark that requires an agent to reason about physical scenarios and take an action accordingly. Inspired by the physical knowledge acquired in infancy and the capabilities required for robots to operate in real-world environments, we identify 15 essential physical scenarios. For each scenario, we create a wide variety of distinct task templates, and we ensure that all task templates within the same scenario can be solved using one specific physical rule. This design lets us evaluate two distinct levels of generalization: local generalization and broad generalization. We conduct an extensive evaluation with human players, learning agents with varying input types and architectures, and heuristic agents with different strategies. The benchmark gives a Phy-Q (physical reasoning quotient) score that reflects the physical reasoning ability of the agents. Our evaluation shows that 1) all agents fail to reach human performance, and 2) learning agents, even those with good local generalization ability, struggle to learn the underlying physical reasoning rules and fail to generalize broadly. We encourage the development of intelligent agents with broad generalization abilities in physical domains.

research
06/17/2021

Hi-Phy: A Benchmark for Hierarchical Physical Reasoning

Reasoning about the behaviour of physical objects is a key capability of...
research
03/03/2023

NovPhy: A Testbed for Physical Reasoning in Open-world Environments

Due to the emergence of AI systems that interact with the physical envir...
research
10/16/2021

Learning to Solve Complex Tasks by Talking to Agents

Humans often solve complex problems by interacting (in natural language)...
research
08/15/2019

PHYRE: A New Benchmark for Physical Reasoning

Understanding and reasoning about physics is an important ability of int...
research
07/22/2019

The Tools Challenge: Rapid Trial-and-Error Learning in Physical Problem Solving

Many animals, and an increasing number of artificial agents, display sop...
research
11/29/2021

An in-depth experimental study of sensor usage and visual reasoning of robots navigating in real environments

Visual navigation by mobile robots is classically tackled through SLAM p...
research
02/28/2023

Scenarios and branch points to future machine intelligence

We discuss scenarios and branch points to four major possible consequenc...
