FairCanary: Rapid Continuous Explainable Fairness
Machine Learning (ML) models are being used in all facets of today's society to make high stake decisions like bail granting or credit lending, with very minimal regulations. Such systems are extremely vulnerable to both propagating and amplifying social biases, and have therefore been subject to growing research interest. One of the main issues with conventional fairness metrics is their narrow definitions which hide the complete extent of the bias by focusing primarily on positive and/or negative outcomes, whilst not paying attention to the overall distributional shape. Moreover, these metrics are often contradictory to each other, are severely restrained by the contextual and legal landscape of the problem, have technical constraints like poor support for continuous outputs, the requirement of class labels, and are not explainable. In this paper, we present Quantile Demographic Drift, which addresses the shortcomings mentioned above. This metric can also be used to measure intra-group privilege. It is easily interpretable via existing attribution techniques, and also extends naturally to individual fairness via the principle of like-for-like comparison. We make this new fairness score the basis of a new system that is designed to detect bias in production ML models without the need for labels. We call the system FairCanary because of its capability to detect bias in a live deployed model and narrow down the alert to the responsible set of features, like the proverbial canary in a coal mine.
READ FULL TEXT