Identifying Biased Subgroups in Ranking and Classification

08/17/2021
by Eliana Pastor, et al.

When analyzing the behavior of machine learning algorithms, it is important to identify specific data subgroups for which the considered algorithm performs differently than it does on the entire dataset. Domain experts are normally required to identify the relevant attributes that define these subgroups. We introduce the notion of divergence to measure this performance difference, and we exploit it in the context of (i) classification models and (ii) ranking applications to automatically detect data subgroups whose behavior deviates significantly. Furthermore, we quantify the contribution of each attribute in a data subgroup to its divergent behavior by means of Shapley values, thus allowing the identification of the most impactful attributes.
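As a rough illustration of these two ideas, the sketch below computes the divergence of a subgroup defined by a set of attribute=value conditions (the difference between an outcome metric on the subgroup and on the whole dataset) and attributes that divergence to the individual conditions with exact Shapley values. This is a minimal reconstruction under stated assumptions, not the authors' implementation; the column names `label` and `prediction` and the false-positive-rate metric are hypothetical choices for illustration.

```python
import itertools
from math import factorial

import pandas as pd


def divergence(df, items, metric):
    """Divergence of the subgroup defined by `items` (a dict of
    attribute -> value): metric(subgroup) - metric(whole dataset)."""
    mask = pd.Series(True, index=df.index)
    for col, val in items.items():
        mask &= df[col] == val
    subgroup = df[mask]
    if subgroup.empty:
        return 0.0
    return metric(subgroup) - metric(df)


def shapley_contributions(df, items, metric):
    """Exact Shapley value of each attribute=value condition with respect
    to the subgroup's divergence (exponential in the number of conditions,
    which stays small for short itemsets)."""
    cols = list(items)
    n = len(cols)
    phi = {c: 0.0 for c in cols}
    for col in cols:
        rest = [c for c in cols if c != col]
        for r in range(len(rest) + 1):
            for subset in itertools.combinations(rest, r):
                weight = factorial(r) * factorial(n - r - 1) / factorial(n)
                with_item = divergence(df, {c: items[c] for c in (*subset, col)}, metric)
                without = divergence(df, {c: items[c] for c in subset}, metric)
                phi[col] += weight * (with_item - without)
    return phi


def false_positive_rate(d):
    """Assumed outcome metric: share of true negatives predicted positive."""
    negatives = d[d["label"] == 0]
    return (negatives["prediction"] == 1).mean() if len(negatives) else 0.0


# Hypothetical usage on a classifier's predictions (df, column names assumed):
# items = {"sex": "Female", "age": "<25"}
# print(divergence(df, items, false_positive_rate))
# print(shapley_contributions(df, items, false_positive_rate))
```

By construction, the Shapley values of the conditions sum to the divergence of the full subgroup, so the breakdown shows which attribute=value pairs drive the deviation.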


