Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-Training

09/13/2021
by   Momchil Hardalov, et al.
7

The goal of stance detection is to determine the viewpoint expressed in a piece of text towards a target. These viewpoints or contexts are often expressed in many different languages depending on the user and the platform, which can be a local news outlet, a social media platform, a news forum, etc. Most research in stance detection, however, has been limited to working with a single language and on a few limited targets, with little work on cross-lingual stance detection. Moreover, non-English sources of labelled data are often scarce and present additional challenges. Recently, large multilingual language models have substantially improved the performance on many non-English tasks, especially such with limited numbers of examples. This highlights the importance of model pre-training and its ability to learn from few examples. In this paper, we present the most comprehensive study of cross-lingual stance detection to date: we experiment with 15 diverse datasets in 12 languages from 6 language families, and with 6 low-resource evaluation settings each. For our experiments, we build on pattern-exploiting training, proposing the addition of a novel label encoder to simplify the verbalisation procedure. We further propose sentiment-based generation of stance data for pre-training, which shows sizeable improvement of more than 6 to several strong baselines.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/19/2022

Phylogeny-Inspired Adaptation of Multilingual Models to New Languages

Large pretrained multilingual models, trained on dozens of languages, ha...
research
06/13/2023

Soft Language Clustering for Multilingual Model Pre-training

Multilingual pre-trained language models have demonstrated impressive (z...
research
02/24/2021

Task-Specific Pre-Training and Cross Lingual Transfer for Code-Switched Data

Using task-specific pre-training and leveraging cross-lingual transfer a...
research
09/15/2021

Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training

Compared to monolingual models, cross-lingual models usually require a m...
research
01/15/2022

Addressing the Challenges of Cross-Lingual Hate Speech Detection

The goal of hate speech detection is to filter negative online content a...
research
05/21/2022

Retrieval-Augmented Multilingual Keyphrase Generation with Retriever-Generator Iterative Training

Keyphrase generation is the task of automatically predicting keyphrases ...
research
04/15/2021

A Survey of Recent Abstract Summarization Techniques

This paper surveys several recent abstract summarization methods: T5, Pe...

Please sign up or login with your details

Forgot password? Click here to reset