Learning Controllable Fair Representations
Learning data representations that are transferable and fair with respect to certain protected attributes is crucial to reducing unfair decisions made downstream, while preserving the utility of the data. We propose an information-theoretically motivated objective for learning maximally expressive representations subject to fairness constraints. We demonstrate that a range of existing approaches optimize approximations to the Lagrangian dual of our objective. In contrast to these existing approaches, our objective provides the user control over the fairness of representations by specifying limits on unfairness. We introduce a dual optimization method that optimizes the model as well as the expressiveness-fairness trade-off. Empirical evidence suggests that our proposed method can account for multiple notions of fairness and achieves higher expressiveness at a lower computational cost.
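To give a sense of what optimizing both the model and the trade-off can look like, here is a minimal toy sketch of primal-dual optimization on a Lagrangian. It is not the paper's objective or algorithm: the scalar parameter `theta`, the stand-in functions `expressiveness` and `unfairness`, the limit `epsilon`, and the step size are all hypothetical choices made for illustration. The model parameter takes gradient ascent steps on the Lagrangian, while the multiplier `lam` rises whenever the user-specified unfairness limit is violated.

```python
# Toy primal-dual sketch (illustrative assumptions only, not the paper's method).
# Problem: maximize expressiveness(theta) subject to unfairness(theta) <= epsilon.

def expressiveness(theta):
    # Hypothetical utility, maximized at theta = 2.
    return -(theta - 2.0) ** 2

def unfairness(theta):
    # Hypothetical unfairness measure, grows as theta moves away from 0.
    return theta ** 2

def d_expressiveness(theta):
    return -2.0 * (theta - 2.0)

def d_unfairness(theta):
    return 2.0 * theta

def primal_dual(epsilon=1.0, lr=0.05, steps=5000):
    theta, lam = 0.0, 0.0
    for _ in range(steps):
        # Lagrangian: expressiveness(theta) - lam * (unfairness(theta) - epsilon)
        grad_theta = d_expressiveness(theta) - lam * d_unfairness(theta)
        grad_lam = unfairness(theta) - epsilon  # constraint violation
        # Primal step: ascend the Lagrangian in theta.
        new_theta = theta + lr * grad_theta
        # Dual step: raise lam when the limit is violated, keep lam >= 0.
        new_lam = max(0.0, lam + lr * grad_lam)
        theta, lam = new_theta, new_lam
    return theta, lam

if __name__ == "__main__":
    theta, lam = primal_dual()
    print(f"theta={theta:.4f} lam={lam:.4f} unfairness={unfairness(theta):.4f}")
```

For this toy problem the constrained optimum can be checked by hand: with `epsilon = 1.0` the constraint is active at `theta = 1`, and the stationarity condition gives `lam = 1`. The point of the sketch is that the user fixes `epsilon` directly, and the multiplier is learned rather than hand-tuned as a fixed trade-off weight.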