TSSuBERT: Tweet Stream Summarization Using BERT

06/16/2021
by   Alexis Dusart, et al.

The development of deep neural networks and the emergence of pre-trained language models such as BERT have improved performance on many NLP tasks. However, these models have not gained the same popularity for tweet summarization, which can probably be explained by the lack of existing collections for training and evaluation. Our contribution in this paper is twofold: (1) we introduce a large dataset for Twitter event summarization, and (2) we propose a neural model to automatically summarize huge tweet streams. This extractive model combines pre-trained language models and vocabulary frequency-based representations in an original way to predict tweet salience. An additional advantage of the model is that it automatically adapts the length of the output summary to the input tweet stream. We conducted experiments on two different Twitter collections and observed promising results in comparison with state-of-the-art baselines.
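The abstract describes an extractive model that scores each tweet by combining a pre-trained language model representation with vocabulary frequency-based features. The sketch below is a minimal illustration of that general idea, not the authors' architecture: the model name `bert-base-uncased`, the `frequency_features` construction, and the layer sizes are assumptions made purely for the example.

```python
# Minimal sketch of a tweet salience scorer that fuses a pre-trained
# language model embedding with a simple vocabulary-frequency vector.
# All modelling choices here are illustrative assumptions, not the paper's.
from collections import Counter

import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class TweetSalienceScorer(nn.Module):
    def __init__(self, lm_name: str = "bert-base-uncased", vocab_size: int = 1000):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(lm_name)
        self.encoder = AutoModel.from_pretrained(lm_name)
        hidden = self.encoder.config.hidden_size
        # Concatenate the [CLS] embedding with a frequency vector over the
        # top-`vocab_size` stream terms, then predict a scalar salience score.
        self.head = nn.Sequential(
            nn.Linear(hidden + vocab_size, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, tweets: list[str], freq_features: torch.Tensor) -> torch.Tensor:
        enc = self.tokenizer(tweets, padding=True, truncation=True, return_tensors="pt")
        cls = self.encoder(**enc).last_hidden_state[:, 0]   # [batch, hidden]
        return self.head(torch.cat([cls, freq_features], dim=-1)).squeeze(-1)


def frequency_features(tweets: list[str], vocab: list[str]) -> torch.Tensor:
    """Relative frequency of each vocabulary term within each tweet (toy version)."""
    feats = torch.zeros(len(tweets), len(vocab))
    index = {term: i for i, term in enumerate(vocab)}
    for row, tweet in enumerate(tweets):
        counts = Counter(tweet.lower().split())
        total = sum(counts.values()) or 1
        for term, count in counts.items():
            if term in index:
                feats[row, index[term]] = count / total
    return feats
```

In such a setup, tweets whose predicted salience exceeds a threshold would be kept, so the summary length follows the size and content of the stream rather than a fixed budget, in the spirit of the adaptive-length behaviour the abstract mentions.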


