Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation

by Sam Devlin, et al.

A key challenge on the path to developing agents that learn complex human-like behavior is the need to quickly and accurately quantify human-likeness. While human assessments of such behavior can be highly accurate, their speed and scalability are limited. We address these limitations through a novel automated Navigation Turing Test (ANTT) that learns to predict human judgments of human-likeness. We demonstrate the effectiveness of our automated NTT on a navigation task in a complex 3D environment. We investigate six classification models to shed light on the types of architectures best suited to this task, and validate them against data collected through a human NTT. Our best models achieve high accuracy when distinguishing true human and agent behavior. At the same time, we show that predicting finer-grained human assessment of agents' progress towards human-like behavior remains unsolved. Our work takes an important step towards agents that more effectively learn complex human-like behavior.

