Investigating underdiagnosis of AI algorithms in the presence of multiple sources of dataset bias

by Melanie Bernhardt et al.

Deep learning models have shown great potential for image-based diagnosis, assisting clinical decision-making. At the same time, an increasing number of reports raise concerns that machine learning could amplify existing health disparities due to human biases embedded in the training data. If we wish to build fair artificial intelligence systems, it is of great importance to investigate carefully the extent to which biases may be reproduced or even amplified. Seyyed-Kalantari et al. advance this conversation by analysing the performance of a disease classifier across population subgroups. They raise performance disparities related to underdiagnosis as a point of concern; we identify areas of this analysis which we believe deserve additional attention. Specifically, we wish to highlight some theoretical and practical difficulties associated with assessing model fairness by testing on data drawn from the same biased distribution as the training data, especially when the sources and amounts of bias are unknown.






Author contributions

All authors contributed equally to this work in terms of formulating the arguments, interpreting the available evidence, and co-writing the manuscript.

Competing interests

B.G. is a part-time employee of HeartFlow and Kheiron Medical Technologies and holds stock options as part of the standard compensation package.