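Lexical representations for unsupervised event detection in a stream of tweets: a study on French and English corpora (original title: Représentations lexicales pour la détection non supervisée d'événements dans un flux de tweets : étude sur des corpus français et anglais)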

01/13/2020
by Béatrice Mazoyer, et al.

In this work, we evaluate the performance of recent text embeddings for the automatic detection of events in a stream of tweets. We model this task as a dynamic clustering problem. Our experiments are conducted on a publicly available corpus of English tweets and on a similar French dataset annotated by our team. We show that recent techniques based on deep neural networks (ELMo, Universal Sentence Encoder, BERT, SBERT), although promising in many applications, are not well suited to this task. We also experiment with different types of fine-tuning to improve these results on French data. Finally, we present a detailed analysis of the results, showing the superiority of tf-idf approaches for this task.
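
To make the "dynamic clustering" framing concrete, here is a minimal sketch of one common way to cluster a tweet stream over tf-idf vectors: each incoming tweet joins the cluster of its most similar predecessor if the cosine similarity reaches a threshold, and otherwise starts a new cluster. The abstract does not specify the authors' algorithm, so everything here is an assumption made for illustration: the function names (`tokenize`, `tfidf_vectors`, `cluster_stream`), the threshold value, the naive whitespace tokenizer, and the choice to compute idf over the whole batch rather than incrementally.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (illustrative assumption); a real
    # system would treat hashtags, mentions, URLs and punctuation with care.
    return text.lower().split()


def tfidf_vectors(docs):
    # Smoothed idf computed over the whole batch for brevity (assumption);
    # a true streaming system would update document frequencies over time.
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = {t: tf[t] * idf[t] for t in tf}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(u, v):
    # Both vectors are L2-normalised, so the dot product is the cosine.
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def cluster_stream(tweets, threshold=0.3):
    # Process tweets in arrival order; return one cluster label per tweet.
    vectors = tfidf_vectors(tweets)
    seen = []      # (vector, cluster label) for every tweet already processed
    labels = []
    next_id = 0
    for vec in vectors:
        best_sim, best_label = 0.0, None
        for other, label in seen:
            sim = cosine(vec, other)
            if sim > best_sim:
                best_sim, best_label = sim, label
        if best_label is not None and best_sim >= threshold:
            labels.append(best_label)   # join the nearest neighbour's cluster
        else:
            labels.append(next_id)      # no close neighbour: open a new cluster
            next_id += 1
        seen.append((vec, labels[-1]))
    return labels


if __name__ == "__main__":
    stream = [
        "earthquake hits city centre",
        "strong earthquake hits the city",
        "new phone released today",
        "phone released today with new camera",
    ]
    print(cluster_stream(stream))
```

On this toy stream the two earthquake tweets end up in one cluster and the two phone tweets in another. Comparing each tweet against every earlier tweet is quadratic in the stream length; a practical system would likely restrict the comparison to a sliding window of recent tweets, which is a design choice this sketch leaves out.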


research
10/16/2020

WNUT-2020 Task 2: Identification of Informative COVID-19 English Tweets

In this paper, we provide an overview of the WNUT-2020 shared task on th...
research
05/04/2020

Storing, preprocessing and analyzing Tweets: Finding the suitable NoSQL system

NoSQL systems are a new generation of databases that aim to handle a lar...
research
06/05/2022

Speech Detection Task Against Asian Hate: BERT the Central, While Data-Centric Studies the Crucial

With the epidemic continuing, hatred against Asians is intensifying in c...
research
09/08/2020

Covid-Transformer: Detecting COVID-19 Trending Topics on Twitter Using Universal Sentence Encoder

The novel corona-virus disease (also known as COVID-19) has led to a pan...
research
06/20/2020

Sarcasm Detection in Tweets with BERT and GloVe Embeddings

Sarcasm is a form of communication in which the person states opposite of...
research
03/28/2017

Is This a Joke? Detecting Humor in Spanish Tweets

While humor has been historically studied from a psychological, cognitiv...
research
05/03/2018

Binarizer at SemEval-2018 Task 3: Parsing dependency and deep learning for irony detection

In this paper, we describe the system submitted for the SemEval 2018 Tas...
