
Improving Consequential Decision Making under Imperfect Predictions
Consequential decisions are increasingly informed by sophisticated data-driven predictive models. For accurate predictive models, deterministic threshold rules have been shown to be optimal in terms of utility, even under a variety of fairness constraints. However, consistently learning accurate models requires access to ground truth data. Unfortunately, in practice, some data can only be observed if a certain decision was taken. Thus, collected data always depends on potentially imperfect historical decision policies. As a result, learned deterministic threshold rules are often suboptimal. We address this problem from the perspective of sequential policy learning. We first show that, if decisions are taken by a faulty deterministic policy, the observed outcomes under this policy are insufficient to improve it. We then describe how this undesirable behavior can be avoided using stochastic policies. Finally, we introduce a practical gradient-based algorithm to learn stochastic policies that effectively leverage the outcomes of decisions to improve over time. Experiments on both synthetic and real-world data illustrate our theoretical results and show the efficacy of our proposed algorithm.
02/08/2019 ∙ by Niki Kilbertus, et al.

General Latent Feature Modeling for Data Exploration Tasks
This paper introduces a general Bayesian nonparametric latent feature model suitable for performing automatic exploratory analysis of heterogeneous datasets, where the attributes describing each object can be either discrete, continuous or mixed variables. The proposed model presents several important properties. First, it accounts for heterogeneous data while still allowing inference in linear time with respect to the number of objects and attributes. Second, its Bayesian nonparametric nature allows us to automatically infer the model complexity from the data, i.e., the number of features necessary to capture the latent structure in the data. Third, the latent features in the model are binary-valued variables, easing the interpretability of the obtained latent features in data exploration tasks.
07/26/2017 ∙ by Isabel Valera, et al.

From Parity to Preference-based Notions of Fairness in Classification
The adoption of automated, data-driven decision making in an ever expanding range of applications has raised concerns about its potential unfairness towards certain social groups. In this context, a number of recent studies have focused on defining, detecting, and removing unfairness from data-driven decision systems. However, the existing notions of fairness, based on parity (equality) in treatment or outcomes for different social groups, tend to be quite stringent, limiting the overall decision making accuracy. In this paper, we draw inspiration from the fair-division and envy-freeness literature in economics and game theory and propose preference-based notions of fairness: given the choice between various sets of decision treatments or outcomes, any group of users would collectively prefer its treatment or outcomes, regardless of the (dis)parity as compared to the other groups. Then, we introduce tractable proxies to design margin-based classifiers that satisfy these preference-based notions of fairness. Finally, we experiment with a variety of synthetic and real-world datasets and show that preference-based fairness allows for greater decision accuracy than parity-based fairness.
06/30/2017 ∙ by Muhammad Bilal Zafar, et al.

General Latent Feature Models for Heterogeneous Datasets
Latent feature modeling allows capturing the latent structure responsible for generating the observed properties of a set of objects. It is often used to make predictions either for new values of interest or missing information in the original data, as well as to perform exploratory data analysis. However, although there is an extensive literature on latent feature models for homogeneous datasets, where all the attributes that describe each object are of the same (continuous or discrete) nature, there is a lack of work on latent feature modeling for heterogeneous databases. In this paper, we introduce a general Bayesian nonparametric latent feature model suitable for heterogeneous datasets, where the attributes describing each object can be either discrete, continuous or mixed variables. The proposed model presents several important properties. First, it accounts for heterogeneous data while keeping the properties of conjugate models, which allow us to infer the model in linear time with respect to the number of objects and attributes. Second, its Bayesian nonparametric nature allows us to automatically infer the model complexity from the data, i.e., the number of features necessary to capture the latent structure in the data. Third, the latent features in the model are binary-valued variables, easing the interpretability of the obtained latent features in exploratory data analysis. We show the flexibility of the proposed model by solving both prediction and data analysis tasks on several real-world datasets. Moreover, a software package of the GLFM is publicly available for other researchers to use and improve.
06/12/2017 ∙ by Isabel Valera, et al.

Fairness Beyond Disparate Treatment & Disparate Impact: Learning Classification without Disparate Mistreatment
Automated data-driven decision making systems are increasingly being used to assist, or even replace, humans in many settings. These systems function by learning from historical decisions, often taken by humans. In order to maximize the utility of these systems (or, classifiers), their training involves minimizing the errors (or, misclassifications) over the given historical data. However, it is quite possible that the optimally trained classifier makes decisions for people belonging to different social groups with different misclassification rates (e.g., misclassification rates for females are higher than for males), thereby placing these groups at an unfair disadvantage. To account for and avoid such unfairness, in this paper, we introduce a new notion of unfairness, disparate mistreatment, which is defined in terms of misclassification rates. We then propose intuitive measures of disparate mistreatment for decision boundary-based classifiers, which can be easily incorporated into their formulation as convex-concave constraints. Experiments on synthetic as well as real-world datasets show that our methodology is effective at avoiding disparate mistreatment, often at a small cost in terms of accuracy.
10/26/2016 ∙ by Muhammad Bilal Zafar, et al.
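To make "defined in terms of misclassification rates" concrete, here is a minimal sketch that computes the gap in false positive and false negative rates between two groups. It is only an empirical illustration: the function names, the 0/1 group encoding, and the toy data are assumptions made here, and the paper's actual measures are proxies built into the classifier's training constraints, which this sketch does not reproduce.

```python
def error_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate) for labels in {0, 1}."""
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr


def mistreatment_gaps(y_true, y_pred, group):
    """Return (FPR gap, FNR gap), each computed as group 0 minus group 1."""
    rates = {}
    for g in (0, 1):
        idx = [i for i, z in enumerate(group) if z == g]
        rates[g] = error_rates([y_true[i] for i in idx],
                               [y_pred[i] for i in idx])
    return rates[0][0] - rates[1][0], rates[0][1] - rates[1][1]


# Hypothetical toy data: the first four people belong to group 0, the rest to group 1.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 1]
group = [0, 0, 0, 0, 1, 1, 1, 1]

fpr_gap, fnr_gap = mistreatment_gaps(y_true, y_pred, group)
print(fpr_gap, fnr_gap)
```

Both gaps are zero for a classifier that misclassifies the two groups at the same rates; a nonzero gap in either rate signals disparate mistreatment in the sense described above.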

Distilling Information Reliability and Source Trustworthiness from Digital Traces
Online knowledge repositories typically rely on their users or dedicated editors to evaluate the reliability of their content. These evaluations can be viewed as noisy measurements of both information reliability and information source trustworthiness. Can we leverage these noisy evaluations, often biased, to distill a robust, unbiased and interpretable measure of both notions? In this paper, we argue that the temporal traces left by these noisy evaluations give cues on the reliability of the information and the trustworthiness of the sources. Then, we propose a temporal point process modeling framework that links these temporal traces to robust, unbiased and interpretable notions of information reliability and source trustworthiness. Furthermore, we develop an efficient convex optimization procedure to learn the parameters of the model from historical traces. Experiments on real-world data gathered from Wikipedia and Stack Overflow show that our modeling framework accurately predicts evaluation events, provides an interpretable measure of information reliability and source trustworthiness, and yields interesting insights about real-world events.
10/24/2016 ∙ by Behzad Tabibian, et al.

Modeling the Dynamics of Online Learning Activity
People are increasingly relying on the Web and social media to find solutions to their problems in a wide range of domains. In this online setting, closely related problems often lead to the same characteristic learning pattern, in which people sharing these problems visit related pieces of information, perform almost identical queries or, more generally, take a series of similar actions. In this paper, we introduce a novel modeling framework for clustering continuous-time grouped streaming data, the hierarchical Dirichlet Hawkes process (HDHP), which allows us to automatically uncover a wide variety of learning patterns from detailed traces of learning activity. Our model allows for efficient inference, scaling to millions of actions taken by thousands of users. Experiments on real data gathered from Stack Overflow reveal that our framework can recover meaningful learning patterns in terms of both content and temporal dynamics, as well as accurately track users' interests and goals over time.
10/18/2016 ∙ by Charalampos Mavroforakis, et al.

Bayesian nonparametric comorbidity analysis of psychiatric disorders
The analysis of comorbidity is an open and complex research field in the branch of psychiatry, where clinical experience and several studies suggest that the relation among the psychiatric disorders may have etiological and treatment implications. In this paper, we are interested in applying latent feature modeling to find the latent structure behind the psychiatric disorders that can help to examine and explain the relationships among them. To this end, we use the large amount of information collected in the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) database and propose to model these data using a nonparametric latent model based on the Indian Buffet Process (IBP). Due to the discrete nature of the data, we first need to adapt the observation model for discrete random variables. We propose a generative model in which the observations are drawn from a multinomial-logit distribution given the IBP matrix. The implementation of an efficient Gibbs sampler is accomplished using the Laplace approximation, which allows integrating out the weighting factors of the multinomial-logit likelihood model. We also provide a variational inference algorithm for this model, which provides a complementary (and computationally less expensive) alternative to the Gibbs sampler, allowing us to deal with a larger amount of data. Finally, we use the model to analyze comorbidity among the psychiatric disorders diagnosed by experts from the NESARC database.
01/29/2014 ∙ by Francisco J. R. Ruiz, et al.
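For readers unfamiliar with the Indian Buffet Process mentioned above, the following is a minimal sketch of its standard "restaurant" generative scheme, which yields a binary object-by-feature matrix with an unbounded number of columns. It illustrates only the IBP prior, not the paper's multinomial-logit observation model or its Gibbs and variational inference; the function names, the concentration parameter `alpha`, and the seed are assumptions made for this example.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw from Poisson(rate) using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_objects, alpha, seed=0):
    """Sample a binary feature matrix (list of rows) from an IBP prior."""
    rng = random.Random(seed)
    feature_counts = []  # feature_counts[k] = objects that own feature k so far
    rows = []            # one set of active feature indices per object
    for i in range(1, num_objects + 1):
        active = set()
        # Object i takes each existing feature k with probability m_k / i.
        for k, m_k in enumerate(feature_counts):
            if rng.random() < m_k / i:
                active.add(k)
        # Object i then introduces Poisson(alpha / i) brand-new features.
        for _ in range(sample_poisson(alpha / i, rng)):
            active.add(len(feature_counts))
            feature_counts.append(0)
        for k in active:
            feature_counts[k] += 1
        rows.append(active)
    num_features = len(feature_counts)
    return [[1 if k in row else 0 for k in range(num_features)] for row in rows]


Z = sample_ibp(num_objects=5, alpha=2.0, seed=1)
for row in Z:
    print(row)
```

The number of columns is not fixed in advance: it grows with the data, which is the sense in which the model complexity is inferred rather than specified.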

Boosting Black Box Variational Inference
Approximating a probability density in a tractable manner is a central task in Bayesian statistics. Variational Inference (VI) is a popular technique that achieves tractability by choosing a relatively simple variational family. Borrowing ideas from the classic boosting framework, recent approaches attempt to boost VI by replacing the selection of a single density with a greedily constructed mixture of densities. In order to guarantee convergence, previous works impose stringent assumptions that require significant effort for practitioners. Specifically, they require a custom implementation of the greedy step (called the LMO) for every probabilistic model with respect to an unnatural variational family of truncated distributions. Our work fixes these issues with novel theoretical and algorithmic insights. On the theoretical side, we show that boosting VI satisfies a relaxed smoothness assumption which is sufficient for the convergence of the functional Frank-Wolfe (FW) algorithm. Furthermore, we rephrase the LMO problem and propose to maximize the Residual ELBO (RELBO) which replaces the standard ELBO optimization in VI. These theoretical enhancements allow for black box implementation of the boosting subroutine. Finally, we present a stopping criterion drawn from the duality gap in the classic FW analyses and exhaustive experiments to illustrate the usefulness of our theoretical and algorithmic contributions.
06/06/2018 ∙ by Francesco Locatello, et al.

Automatic Bayesian Density Analysis
Making sense of a dataset in an automatic and unsupervised fashion is a challenging problem in statistics and AI. Classical approaches for density estimation, even when taking into account mixtures of probabilistic models, are not flexible enough to deal with the uncertainty inherent to real-world data: they are generally restricted to an a priori fixed homogeneous likelihood model and to latent variable structures where expressiveness comes at the price of tractability. We propose Automatic Bayesian Density Analysis (ABDA) to go beyond classical mixture model density estimation, casting uncertainty estimation on both the underlying structure in the data and the selection of adequate likelihood models for the data (and thus the statistical data types of the variables in the data) into a joint inference problem. Specifically, ABDA relies on a hierarchical model explicitly incorporating arbitrarily rich collections of likelihood models at a local level, while capturing global variable interactions by an expressive deep structure built on a sum-product network. Extensive empirical evidence shows that ABDA is more accurate than density estimators in the literature at dealing with both kinds of uncertainties, at modeling and predicting real-world (mixed continuous and discrete) data in both transductive and inductive scenarios, and at recovering the statistical data types.
07/24/2018 ∙ by Antonio Vergari, et al.

Handling Incomplete Heterogeneous Data using VAEs
Variational autoencoders (VAEs), as well as other generative models, have been shown to be efficient and accurate at capturing the latent structure of vast amounts of complex high-dimensional data. However, existing VAEs still cannot directly handle data that are heterogeneous (mixed continuous and discrete) or incomplete (with data missing at random), which is indeed common in real-world applications. In this paper, we propose a general framework to design VAEs suitable for fitting incomplete heterogeneous data. The proposed HI-VAE includes likelihood models for real-valued, positive real-valued, interval, categorical, ordinal and count data, and allows estimating (and potentially imputing) missing data accurately. Furthermore, HI-VAE presents competitive predictive performance in supervised tasks, outperforming supervised models when trained on incomplete data.
07/10/2018 ∙ by Alfredo Nazabal, et al.
Isabel Valera