Modeling and Optimization of Human-machine Interaction Processes via the Maximum Entropy Principle

03/17/2019
by Jiaxiao Zheng, et al.

We propose a data-driven framework for modeling and optimizing human-machine interaction processes, e.g., systems that assist humans in decision-making or learning, workload allocation, and interactive advertising. This problem is challenging for several reasons. First, human behavior is hard to model or infer, as it may reflect biases, long-term memory, and sensitivity to sequencing, i.e., transience and exponential complexity in the length of the interaction. Second, because such processes are interactive, the machine policy used to engage with a human may bias data-driven inferences. Finally, in choosing machine policies that optimize interaction rewards, one must on the one hand avoid being overly sensitive to error or variability in the estimated human model, and on the other hand avoid being overly deterministic or predictable, which may result in poor human 'engagement' in the interaction. To meet these challenges, we propose a robust approach based on the maximum entropy principle, the Alternating Entropy-Reward Ascent (AREA) algorithm, which iteratively estimates human behavior and optimizes the machine policy. We characterize AREA in terms of its space and time complexity and its convergence. We also provide an initial validation based on synthetic data generated by an established noisy nonlinear model for human decision-making.
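The tension between maximizing reward and staying unpredictable can be illustrated with a standard entropy-regularized choice over a finite set of actions. The sketch below is not the AREA algorithm, whose details are not given in this abstract; it only shows, under assumed names (`rewards`, `temperature`, and the helper functions are hypothetical), that maximizing expected reward plus a weighted entropy term yields a softmax policy, and that the weight controls how deterministic the policy becomes.

```python
import math


def entropy_regularized_policy(rewards, temperature):
    """Distribution p maximizing sum(p * r) + temperature * H(p).

    For a finite action set this maximizer is the softmax of
    rewards / temperature (a standard result, assumed here as the
    illustration; it is not taken from the paper).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(rewards)  # subtract the max for numerical stability
    weights = [math.exp((r - top) / temperature) for r in rewards]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(p):
    """Shannon entropy in nats, treating 0 * log(0) as 0."""
    return -sum(x * math.log(x) for x in p if x > 0)


def objective(p, rewards, temperature):
    """Expected reward plus temperature-weighted entropy."""
    expected = sum(x * r for x, r in zip(p, rewards))
    return expected + temperature * entropy(p)


if __name__ == "__main__":
    rewards = [1.0, 0.5, 0.0]  # hypothetical per-action rewards
    for temperature in (0.05, 0.5, 5.0):
        p = entropy_regularized_policy(rewards, temperature)
        print(temperature, [round(x, 3) for x in p], round(entropy(p), 3))
```

A small `temperature` gives a nearly deterministic policy that is sensitive to errors in the estimated rewards, while a large one approaches the uniform distribution and ignores the rewards; intermediate values trade the two off.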

research
04/14/2023

Synthetically Generating Human-like Data for Sequential Decision Making Tasks via Reward-Shaped Imitation Learning

We consider the problem of synthetically generating data that can closel...
research
07/11/2019

Reward Advancement: Transforming Policy under Maximum Causal Entropy Principle

Many real-world human behaviors can be characterized as a sequential dec...
research
03/23/2020

Anticipatory Psychological Models for Quickest Change Detection: Human Sensor Interaction

We consider anticipatory psychological models for human decision makers ...
research
09/13/2022

Investigating Bias with a Synthetic Data Generator: Empirical Evidence and Philosophical Interpretation

Machine learning applications are becoming increasingly pervasive in our...
research
10/22/2022

Policy Optimization with Advantage Regularization for Long-Term Fairness in Decision Systems

Long-term fairness is an important factor of consideration in designing ...
research
01/18/2023

Sequential Processing of Observations in Human Decision-Making Systems

In this work, we consider a binary hypothesis testing problem involving ...
