Adversarial Learning and Explainability in Structured Datasets

10/15/2018
by   Prasad Chalasani, et al.
0

We theoretically and empirically explore the explainability benefits of adversarial learning in logistic regression models on structured datasets. In particular we focus on improved explainability due to significantly higher _feature-concentration_ in adversarially-learned models: Compared to natural training, adversarial training tends to much more efficiently shrink the weights of non-predictive and weakly-predictive features, while model performance on natural test data only degrades slightly (and even sometimes improves), compared to that of a naturally trained model. We provide a theoretical insight into this phenomenon via an analysis of the expectation of the logistic model weight updates by an SGD-based adversarial learning algorithm, where examples are drawn from a random binary data-generation process. We empirically demonstrate the feature-pruning effect on a synthetic dataset, some datasets from the UCI repository, and real-world large-scale advertising response-prediction data-sets from MediaMath. In several of the MediaMath datasets there are 10s of millions of data points, and on the order of 100,000 sparse categorical features, and adversarial learning often results in model-size reduction by a factor of 20 or higher, and yet the model performance on natural test data (measured by AUC) is comparable to (and sometimes even better) than that of the naturally trained model. We also show that traditional ℓ_1 regularization does not even come close to achieving this level of feature-concentration. We measure "feature concentration" using the Integrated Gradients-based feature-attribution method of Sundararajan et. al (2017), and derive a new closed-form expression for 1-layer networks, which substantially speeds up computation of aggregate feature attributions across a large dataset.

READ FULL TEXT

page 19

page 21

page 22

research
01/29/2019

Improved Adversarial Learning for Fair Classification

Motivated by concerns that machine learning algorithms may introduce sig...
research
06/21/2021

f-Domain-Adversarial Learning: Theory and Algorithms

Unsupervised domain adaptation is used in many machine learning applicat...
research
02/05/2021

Adversarial Training Makes Weight Loss Landscape Sharper in Logistic Regression

Adversarial training is actively studied for learning robust models agai...
research
12/09/2019

Logistic regression models for aggregated data

Logistic regression models are a popular and effective method to predict...
research
04/21/2019

Beyond Explainability: Leveraging Interpretability for Improved Adversarial Learning

In this study, we propose the leveraging of interpretability for tasks b...
research
10/09/2021

Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach

We target open-world feature extrapolation problem where the feature spa...

Please sign up or login with your details

Forgot password? Click here to reset