Improving robustness of classifiers by training against live traffic

12/01/2018
by Kumar Sricharan, et al.

Deep learning models are known to be overconfident in their predictions on out-of-distribution inputs. This is a challenge when a model is trained on a particular input dataset but receives out-of-sample data when deployed in practice. Recent work builds classifiers that are robust to out-of-distribution samples by adding a regularization term that maximizes the entropy of the classifier output on out-of-distribution data. However, because it is not always possible to obtain out-of-distribution samples, the authors of that work suggest a GAN-based alternative that does not require specific knowledge of the out-of-distribution samples. From this existing work, we also know that having access to the true out-of-sample distribution for regularization works significantly better than using samples from the GAN. In this paper, we make the following observation: in practice, the out-of-distribution samples are contained in the traffic that hits a deployed classifier. However, the traffic will also contain an unknown proportion of in-distribution samples. If the entropy over all of the traffic data were naively maximized, classifier performance on in-distribution data would suffer. To effectively leverage this traffic data, we propose an adaptive regularization technique (based on the maximum predictive probability score of a sample) which penalizes out-of-distribution samples more heavily than in-distribution samples in the incoming traffic. This ensures that the overall performance of the classifier does not degrade on in-distribution data, while detection of out-of-distribution samples is significantly improved by leveraging the unlabeled traffic data. We show the effectiveness of our method via experiments on natural image datasets.
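The abstract does not give the exact form of the adaptive regularizer, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each unlabeled traffic sample is weighted by one minus its maximum predictive probability, so that low-confidence samples (more likely out-of-distribution) contribute more to the entropy-maximizing term. The function names `ood_weight` and `adaptive_entropy_penalty`, and the specific weighting, are hypothetical illustrations and not the authors' formulation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def ood_weight(probs):
    """Assumed weighting: small for confident predictions, larger otherwise.

    This choice (1 - max probability) is an illustrative assumption,
    not the weighting defined in the paper.
    """
    return 1.0 - max(probs)


def adaptive_entropy_penalty(traffic_logits):
    """Mean weighted negative entropy over unlabeled traffic.

    Minimizing this term maximizes entropy mostly on samples the
    classifier is already unsure about, leaving confident
    (presumably in-distribution) samples nearly untouched.
    """
    total = 0.0
    for logits in traffic_logits:
        probs = softmax(logits)
        total += -ood_weight(probs) * entropy(probs)
    return total / len(traffic_logits)


if __name__ == "__main__":
    # Hypothetical traffic: one confident sample, one maximally uncertain one.
    traffic = [[10.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    for logits in traffic:
        print(logits, "weight:", ood_weight(softmax(logits)))
    print("penalty:", adaptive_entropy_penalty(traffic))
```

In a training loop this term would be added to the usual cross-entropy loss on labeled in-distribution data, scaled by a regularization coefficient. In an automatic-differentiation framework, one plausible design choice is to treat the weight as a constant (no gradient through it), so that the weight only selects which samples to regularize.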


