Counterfactual harm

04/27/2022
by Jonathan G. Richens, et al.

To act safely and ethically in the real world, agents must be able to reason about harm and avoid harmful actions. In this paper, we develop the first statistical definition of harm and a framework for incorporating harm into algorithmic decisions. We argue that harm is fundamentally a counterfactual quantity, and show that standard machine learning algorithms that cannot perform counterfactual reasoning are guaranteed to pursue harmful policies in certain environments. To resolve this, we derive a family of counterfactual objective functions that robustly mitigate harm. We demonstrate our approach with a statistical model for identifying optimal drug doses. While standard algorithms that select doses using causal treatment effects result in significant harm, our counterfactual algorithm identifies doses that are significantly less harmful without sacrificing efficacy. Our results show that counterfactual reasoning is a key ingredient for safe and ethical AI.
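
The claim that harm is a counterfactual quantity can be illustrated with a toy example. The sketch below is not taken from the paper: the two treatments, their probabilities, and the function names are all hypothetical, and "harm" is simplified to the probability that a patient who would have recovered without treatment fails to recover with it. Under those assumptions, two treatments can share the same average treatment effect while differing in harm, which only the joint distribution over potential outcomes reveals.

```python
from fractions import Fraction as F

# Hypothetical joint distributions over potential outcomes
# (y_untreated, y_treated), where 1 = recovers and 0 = does not recover.
# These numbers are illustrative assumptions, not data from the paper.
TREATMENT_A = {(1, 1): F(5, 10), (0, 1): F(3, 10), (0, 0): F(2, 10)}
TREATMENT_B = {
    (1, 1): F(4, 10),
    (1, 0): F(1, 10),  # would have recovered untreated, but not when treated
    (0, 1): F(4, 10),
    (0, 0): F(1, 10),
}


def recovery_rate(joint):
    """P(recovers when treated)."""
    return sum(p for (_, y_t), p in joint.items() if y_t == 1)


def baseline_rate(joint):
    """P(recovers when untreated)."""
    return sum(p for (y_u, _), p in joint.items() if y_u == 1)


def average_treatment_effect(joint):
    """Difference in recovery rates; depends only on the two marginals."""
    return recovery_rate(joint) - baseline_rate(joint)


def harm_probability(joint):
    """P(recovers untreated AND does not recover treated).

    A simplified stand-in for harm: it needs the joint distribution over
    both potential outcomes, so it cannot be read off the marginals alone.
    """
    return sum(p for (y_u, y_t), p in joint.items() if y_u == 1 and y_t == 0)


if __name__ == "__main__":
    for name, joint in [("A", TREATMENT_A), ("B", TREATMENT_B)]:
        print(name, average_treatment_effect(joint), harm_probability(joint))
```

Both hypothetical treatments raise the recovery rate from 5/10 to 8/10, so a method that ranks them by treatment effect alone cannot tell them apart, yet treatment B makes one patient in ten worse off than they would have been untreated.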
