ClassBases at CASE-2022 Multilingual Protest Event Detection Tasks: Multilingual Protest News Detection and Automatically Replicating Manually Created Event Datasets

In this report, we describe our ClassBases submissions to a shared task on multilingual protest event detection. For multilingual protest news detection, we participated in subtask-1, subtask-2, and subtask-4, which are document classification, sentence classification, and token classification, respectively. In subtask-1, we compare XLM-RoBERTa-base, mLUKE-base, and XLM-RoBERTa-large, each finetuned in a sequence classification setting. We always train our multilingual models on the combined training data from every language provided. We found that larger models seem to work better and that entity knowledge helps, but at a non-negligible cost. For subtask-2, we submitted only an mLUKE-base system for sentence classification. For subtask-4, we submitted only an XLM-RoBERTa-base system for token classification (sequence labeling). For automatically replicating manually created event datasets, we worked on COVID-related protest events from the New York Times news corpus. We created a system to process the crawled data into a dataset of protest events.
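
As a rough illustration of training on the combined data from every language, the pooling step could be sketched as below. This is only a minimal sketch under stated assumptions, not the system described in the report: the function name `pool_languages`, the language codes, the example sentences, and the label encoding (1 = protest-related) are all hypothetical placeholders.

```python
import random
from typing import Dict, List, Tuple

# (text, label); the label encoding 1 = protest-related is an assumption.
Example = Tuple[str, int]


def pool_languages(
    per_language: Dict[str, List[Example]], seed: int = 0
) -> List[Tuple[str, str, int]]:
    """Merge every language's training examples into one shuffled list.

    Each pooled item keeps its language tag so that per-language
    evaluation remains possible later.
    """
    pooled = [
        (lang, text, label)
        for lang, examples in sorted(per_language.items())
        for text, label in examples
    ]
    # Seeded shuffle so that batches mix languages reproducibly.
    random.Random(seed).shuffle(pooled)
    return pooled


# Toy, made-up data; the language codes are placeholders only.
toy = {
    "en": [("Workers marched downtown.", 1), ("Stocks rose today.", 0)],
    "es": [("Los estudiantes protestaron.", 1)],
    "pt": [("O time venceu o jogo.", 0)],
}
train = pool_languages(toy)
```

The pooled list would then be fed to a single multilingual model, so one set of weights sees all languages during finetuning instead of one model being trained per language.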


