WikiContradiction: Detecting Self-Contradiction Articles on Wikipedia
While Wikipedia has been utilized for fact-checking and claim verification to debunk misinformation and disinformation, it is essential to either improve article quality and rule out noisy articles. Self-contradiction is one of the low-quality article types in Wikipedia. In this work, we propose a task of detecting self-contradiction articles in Wikipedia. Based on the "self-contradictory" template, we create a novel dataset for the self-contradiction detection task. Conventional contradiction detection focuses on comparing pairs of sentences or claims, but self-contradiction detection needs to further reason the semantics of an article and simultaneously learn the contradiction-aware comparison from all pairs of sentences. Therefore, we present the first model, Pairwise Contradiction Neural Network (PCNN), to not only effectively identify self-contradiction articles, but also highlight the most contradiction pairs of contradiction sentences. The main idea of PCNN is two-fold. First, to mitigate the effect of data scarcity on self-contradiction articles, we pre-train the module of pairwise contradiction learning using SNLI and MNLI benchmarks. Second, we select top-K sentence pairs with the highest contradiction probability values and model their correlation to determine whether the corresponding article belongs to self-contradiction. Experiments conducted on the proposed WikiContradiction dataset exhibit that PCNN can generate promising performance and comprehensively highlight the sentence pairs the contradiction locates.
READ FULL TEXT