Benchmarks for Automated Commonsense Reasoning: A Survey

02/09/2023
by Ernest Davis, et al.

More than one hundred benchmarks have been developed to test the commonsense knowledge and commonsense reasoning abilities of artificial intelligence (AI) systems. However, these benchmarks are often flawed, and many aspects of common sense remain untested. Consequently, we currently have no reliable way of measuring the extent to which existing AI systems have achieved these abilities. This paper surveys the development and uses of AI commonsense benchmarks. We discuss the nature of common sense; the role of common sense in AI; the goals served by constructing commonsense benchmarks; and desirable features of commonsense benchmarks. We analyze the common flaws in benchmarks, and we argue that it is worthwhile to invest the work needed to ensure that benchmark examples are of consistently high quality. We survey the various methods of constructing commonsense benchmarks. We enumerate 139 commonsense benchmarks that have been developed: 102 text-based, 18 image-based, 12 video-based, and 7 simulated physical environments. We discuss the gaps in the existing benchmarks and aspects of commonsense reasoning that are not addressed in any existing benchmark. We conclude with a number of recommendations for future development of commonsense AI benchmarks.

research
06/15/2020

Machine Common Sense

Machine common sense remains a broad, potentially unbounded problem in a...
research
01/23/2023

Mathematics, word problems, common sense, and artificial intelligence

The paper discusses the capacities and limitations of current artificial...
research
12/21/2020

Exploring and Analyzing Machine Commonsense Benchmarks

Commonsense question-answering (QA) tasks, in the form of benchmarks, ar...
research
12/23/2021

Toward a New Science of Common Sense

Common sense has always been of interest in AI, but has rarely taken cen...
research
11/26/2021

AI and the Everything in the Whole Wide World Benchmark

There is a tendency across different subfields in AI to valorize a small...
research
11/06/2014

The Limitations of Standardized Science Tests as Benchmarks for Artificial Intelligence Research: Position Paper

In this position paper, I argue that standardized tests for elementary s...
research
12/30/2019

Using ConceptNet to Teach Common Sense to an Automated Theorem Prover

The CoRg system is a system to solve commonsense reasoning problems. The...
