Transferring Knowledge Distillation for Multilingual Social Event Detection

08/06/2021
by   Jiaqian Ren, et al.
8

Recently published graph neural networks (GNNs) show promising performance at social event detection tasks. However, most studies are oriented toward monolingual data in languages with abundant training samples. This has left the more common multilingual settings and lesser-spoken languages relatively unexplored. Thus, we present a GNN that incorporates cross-lingual word embeddings for detecting events in multilingual data streams. The first exploit is to make the GNN work with multilingual data. For this, we outline a construction strategy that aligns messages in different languages at both the node and semantic levels. Relationships between messages are established by merging entities that are the same but are referred to in different languages. Non-English message representations are converted into English semantic space via the cross-lingual word embeddings. The resulting message graph is then uniformly encoded by a GNN model. In special cases where a lesser-spoken language needs to be detected, a novel cross-lingual knowledge distillation framework, called CLKD, exploits prior knowledge learned from similar threads in English to make up for the paucity of annotated data. Experiments on both synthetic and real-world datasets show the framework to be highly effective at detection in both multilingual data and in languages where training samples are scarce.

READ FULL TEXT

page 5

page 13

page 15

page 17

page 18

research
04/11/2017

ConceptNet at SemEval-2017 Task 2: Extending Word Embeddings with Multilingual Relational Knowledge

This paper describes Luminoso's participation in SemEval 2017 Task 2, "M...
research
10/04/2018

Neural Networks for Cross-lingual Negation Scope Detection

Negation scope has been annotated in several English and Chinese corpora...
research
10/07/2022

C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval

Multilingual text-video retrieval methods have improved significantly in...
research
02/03/2021

Bootstrapping Multilingual AMR with Contextual Word Alignments

We develop high performance multilingualAbstract Meaning Representation ...
research
11/22/2019

Multilingual Culture-Independent Word Analogy Datasets

In text processing, deep neural networks mostly use word embeddings as a...
research
05/30/2023

Research on Multilingual News Clustering Based on Cross-Language Word Embeddings

Classifying the same event reported by different countries is of signifi...

Please sign up or login with your details

Forgot password? Click here to reset