Inferring Capabilities from Task Performance with Bayesian Triangulation

09/21/2023
by John Burden, et al.

As machine learning models become more general, we need to characterise them in richer, more meaningful ways. We describe a method to infer the cognitive profile of a system from diverse experimental data. To do so, we introduce measurement layouts that model how task-instance features interact with system capabilities to affect performance. These features must be triangulated in complex ways to be able to infer capabilities from non-populational data – a challenge for traditional psychometric and inferential tools. Using the Bayesian probabilistic programming library PyMC, we infer different cognitive profiles for agents in two scenarios: 68 actual contestants in the AnimalAI Olympics and 30 synthetic agents for O-PIAAGETS, an object permanence battery. We showcase the potential for capability-oriented evaluation.
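To make the idea of inferring a capability from instance-level results concrete, here is a minimal sketch in plain Python. It is not the authors' measurement layout and does not use PyMC: it assumes a single hypothetical capability (`ability`), a single hypothetical instance feature (`demand`), a logistic link between them, and a uniform prior evaluated on a grid. The demands and outcomes are made-up illustrative values, not data from the paper.

```python
import math

def success_prob(ability, demand):
    """Assumed logistic link: success is likely when ability exceeds demand."""
    return 1.0 / (1.0 + math.exp(-(ability - demand)))

def posterior_over_ability(demands, outcomes, grid):
    """Grid-approximate posterior over ability under a uniform prior on `grid`."""
    log_post = []
    for a in grid:
        log_lik = 0.0
        for d, y in zip(demands, outcomes):
            p = success_prob(a, d)
            # Bernoulli log-likelihood of one pass/fail outcome
            log_lik += math.log(p) if y else math.log(1.0 - p)
        log_post.append(log_lik)
    # Subtract the maximum before exponentiating for numerical stability
    m = max(log_post)
    weights = [math.exp(lp - m) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical data: one agent, eight task instances of increasing demand
grid = [i * 0.1 for i in range(101)]           # candidate abilities 0.0 .. 10.0
demands = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
outcomes = [1, 1, 1, 1, 0, 1, 0, 0]            # 1 = success, 0 = failure

post = posterior_over_ability(demands, outcomes, grid)
posterior_mean = sum(a * p for a, p in zip(grid, post))
print(f"posterior mean ability: {posterior_mean:.2f}")
```

The posterior concentrates around the demand level at which the agent starts failing, which is the intuition behind reading a capability off performance on instances of known difficulty. A full measurement layout, as described in the abstract, would relate several instance features to several capabilities at once and would be fitted with a probabilistic programming library rather than a grid.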


