Contrastive Out-of-Distribution Detection for Pretrained Transformers

04/18/2021
by Wenxuan Zhou, et al.

Pretrained transformers achieve remarkable performance when the test data follows the same distribution as the training data. In real-world NLU tasks, however, the model often faces out-of-distribution (OoD) instances. Such instances introduce a severe semantic shift at inference time, so the model should identify and reject them. In this paper, we study the OoD detection problem for pretrained transformers using only in-distribution data in training. We observe that OoD instances can be detected using the Mahalanobis distance computed on penultimate-layer representations. We further propose a contrastive loss that improves the compactness of representations, so that OoD instances can be better differentiated from in-distribution ones. Experiments on the GLUE benchmark demonstrate the effectiveness of the proposed methods.
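
To make the Mahalanobis-distance idea concrete, here is a minimal NumPy sketch of one common way to score OoD instances from fixed feature vectors. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the exact formulation, so the sketch assumes class-conditional Gaussians with a single shared covariance, and the function names `fit_class_gaussians` and `mahalanobis_ood_score` are hypothetical. The features are assumed to be penultimate-layer representations that have already been extracted from the model.

```python
import numpy as np


def fit_class_gaussians(features, labels):
    """Estimate per-class means and a shared precision matrix.

    features: (n, dim) array of in-distribution representations (assumed input).
    labels:   (n,) array of integer class labels.
    """
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Center every sample on its own class mean, then pool into one covariance.
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    # The pseudo-inverse avoids failure when the covariance is singular.
    precision = np.linalg.pinv(cov)
    return means, precision


def mahalanobis_ood_score(x, means, precision):
    """Squared Mahalanobis distance to the nearest class mean.

    Higher scores indicate inputs that look less like the training data.
    """
    diffs = x[:, None, :] - means[None, :, :]  # shape (n, classes, dim)
    dists = np.einsum("ncd,de,nce->nc", diffs, precision, diffs)
    return dists.min(axis=1)
```

In use, the means and precision matrix would be fitted once on in-distribution training features, and a test instance would be rejected when its score exceeds a threshold chosen on held-out in-distribution data. The thresholding rule is likewise an assumption of this sketch.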
