Language Transfer for Early Warning of Epidemics from Social Media

10/10/2019
by Mattias Appelgren, et al.

Statements on social media can be analysed to identify individuals who are experiencing red-flag medical symptoms, allowing early detection of the spread of diseases such as influenza. Because disease does not respect cultural borders and may spread between populations speaking different languages, we would like to build multilingual models. However, the data required to train models for every language may be difficult, expensive and time-consuming to obtain, particularly for low-resource languages. Taking Japanese as our target language, we explore methods by which data in one language might be used to build models for a different language. We evaluate two strategies: training on machine-translated data, and zero-shot transfer using multilingual models. We find that the choice of source language affects performance, with Chinese-Japanese being a better language pair than English-Japanese. Training on machine-translated data shows promise, especially when combined with a small amount of target-language data.

research
09/27/2021

Rumour Detection via Zero-shot Cross-lingual Transfer Learning

Most rumour detection models for social media are designed for one speci...
research
10/20/2021

Improved Multilingual Language Model Pretraining for Social Media Text via Translation Pair Prediction

We evaluate a simple approach to improving zero-shot multilingual transf...
research
10/11/2020

Detecting Foodborne Illness Complaints in Multiple Languages Using English Annotations Only

Health departments have been deploying text classification systems for t...
research
02/01/2019

Multilingual NER Transfer for Low-resource Languages

In massively multilingual transfer NLP models over many source languages...
research
09/13/2021

Exploring a Unified Sequence-To-Sequence Transformer for Medical Product Safety Monitoring in Social Media

Adverse Events (AE) are harmful events resulting from the use of medical...
research
05/03/2021

Looking for COVID-19 misinformation in multilingual social media texts

This paper presents the Multilingual COVID-19 Analysis Method (CMTA) for...
research
03/14/2019

OffensEval at SemEval-2018 Task 6: Identifying and Categorizing Offensive Language in Social Media

This document describes our approach to building an Offensive Language C...
