Certified Robustness to Word Substitution Ranking Attack for Neural Ranking Models

09/14/2022
by   Chen Wu, et al.

Neural ranking models (NRMs) have achieved promising results in information retrieval, but they have also been shown to be vulnerable to adversarial examples. A typical Word Substitution Ranking Attack (WSRA) against NRMs was proposed recently, in which an attacker promotes a target document in the rankings by adding human-imperceptible perturbations to its text. This raises concerns about deploying NRMs in real-world applications, so it is important to develop techniques that defend against such attacks. Empirical defenses find adversarial examples during training and use them to augment the training set; however, such methods offer no theoretical guarantee of robustness and may eventually be broken by more sophisticated WSRAs. To escape this arms race, rigorous and provable certified defense methods for NRMs are needed. To this end, we first define Certified Top-K Robustness for ranking models, since users mainly care about the top-ranked results in real-world scenarios. A ranking model is said to be Certified Top-K Robust on a ranked list when it is guaranteed to keep every document outside the top K from entering the top K under any attack. We then introduce a certified defense method, named CertDR, that achieves certified top-K robustness against WSRA based on the idea of randomized smoothing. Specifically, we first construct a smoothed ranker by applying random word substitutions to the documents, and then leverage the ranking property together with the statistical properties of the ensemble to provably certify top-K robustness. Extensive experiments on two representative web search datasets demonstrate that CertDR significantly outperforms state-of-the-art empirical defense methods for ranking models.
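To make the smoothing construction concrete, below is a minimal Python sketch of the idea under stated assumptions. The SYNONYMS table, the toy base_score ranker, the sampling budget n_samples, and the fixed margin are all hypothetical stand-ins for the paper's perturbation set, the underlying NRM, and the statistical confidence bounds it derives; this illustrates the shape of the certification check, not CertDR's actual implementation.

```python
import random
from statistics import mean

# Hypothetical synonym table: a stand-in for the word-substitution
# perturbation set used to smooth the ranker.
SYNONYMS = {
    "fast": ["quick", "rapid"],
    "car": ["automobile", "vehicle"],
}

def base_score(query: str, doc: str) -> float:
    """Toy relevance scorer standing in for a neural ranking model."""
    q_terms = set(query.lower().split())
    d_terms = doc.lower().split()
    return sum(t in q_terms for t in d_terms) / max(len(d_terms), 1)

def randomly_substitute(doc: str, rate: float = 0.3) -> str:
    """Replace each word with a random synonym with probability `rate`."""
    return " ".join(
        random.choice(SYNONYMS[w]) if w in SYNONYMS and random.random() < rate else w
        for w in doc.split()
    )

def smoothed_score(query: str, doc: str, n_samples: int = 100) -> float:
    """Monte Carlo estimate of the smoothed ranker's score for one document."""
    return mean(base_score(query, randomly_substitute(doc)) for _ in range(n_samples))

def certify_top_k(query: str, docs: list[str], k: int, margin: float = 0.05) -> bool:
    """Check a top-K certification condition on estimated smoothed scores.

    Documents outside the top K stay outside if even an optimistic
    (upper-bounded) score for the best outsider cannot reach a
    pessimistic (lower-bounded) score for the K-th ranked document.
    `margin` is a fixed placeholder for the statistical bounds.
    """
    assert len(docs) > k, "need at least one document outside the top K"
    scores = sorted((smoothed_score(query, d) for d in docs), reverse=True)
    kth_lower = scores[k - 1] - margin      # pessimistic K-th score
    outsider_upper = scores[k] + margin     # optimistic best-outsider score
    return outsider_upper < kth_lower
```

In the paper itself, the gap between the K-th document and the best outsider is certified with confidence bounds derived from the statistical properties of the smoothed ensemble rather than a fixed margin; that is what makes the guarantee hold under any word-substitution attack within the threat model.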


Related research

08/05/2020 · One word at a time: adversarial attacks on retrieval models
Adversarial examples, generated by applying small perturbations to input...

06/07/2021 · Adversarial Attack and Defense in Deep Ranking
Deep Neural Network classifiers are vulnerable to adversarial attack, wh...

08/31/2018 · MULDEF: Multi-model-based Defense Against Adversarial Examples for Neural Networks
Despite being popularly used in many application domains such as image r...

02/26/2020 · Adversarial Ranking Attack and Defense
Deep Neural Network (DNN) classifiers are vulnerable to adversarial atta...

05/08/2021 · Certified Robustness to Text Adversarial Attacks by Randomized [MASK]
Recently, few certified defense methods have been developed to provably ...

08/01/2023 · Dynamic ensemble selection based on Deep Neural Network Uncertainty Estimation for Adversarial Robustness
The deep neural network has attained significant efficiency in image rec...

07/31/2023 · Defense of Adversarial Ranking Attack in Text Retrieval: Benchmark and Baseline via Detection
Neural ranking models (NRMs) have undergone significant development and ...
