Attend and Select: A Segment Attention based Selection Mechanism for Microblog Hashtag Generation

06/06/2021
by   Qianren Mao, et al.

Automatic microblog hashtag generation helps readers grasp and process the critical content of microblog posts faster. Conventional sequence-to-sequence generation methods can produce phrase-level hashtags and have achieved remarkable performance on this task. However, they cannot filter out secondary information, and they struggle to capture the discontinuous semantics among crucial tokens: a hashtag is formed from tokens or phrases that may originate from various fragmentary segments of the original text. In this work, we propose an end-to-end Transformer-based generation model that consists of three phases: encoding, segment selection, and decoding. The model transforms discontinuous semantic segments of the source text into a sequence of hashtags. Specifically, we introduce a novel Segments Selection Mechanism (SSM) for the Transformer to obtain segmental representations tailored to phrase-level hashtag generation. In addition, we introduce two large-scale hashtag generation datasets, newly collected from Chinese Weibo and English Twitter. Extensive evaluations on both datasets show that our approach significantly outperforms extraction and generation baselines. The code and datasets are available at <https://github.com/OpenSUM/HashtagGen>.
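The abstract only outlines the segment-selection idea, so the following is a minimal illustrative sketch, not the authors' SSM: tokens are scored by attention against a query vector, low-scoring tokens are dropped, and the surviving (possibly discontinuous) runs of tokens are pooled into per-segment representations. The function name, the thresholding rule, and the use of mean pooling are all assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def select_segments(token_embs, query, threshold=0.5):
    """Illustrative segment selection (not the paper's exact SSM):
    score each token by attention to `query`, keep tokens whose score
    exceeds `threshold` times the maximum score, group the survivors
    into contiguous segments, and mean-pool each segment."""
    scores = softmax(token_embs @ query)       # one attention score per token
    keep = scores > threshold * scores.max()   # boolean mask over tokens

    # group kept token indices into contiguous runs (segments)
    segments, current = [], []
    for i, flag in enumerate(keep):
        if flag:
            current.append(i)
        elif current:
            segments.append(current)
            current = []
    if current:
        segments.append(current)

    # pool each segment into a single representation (mean pooling)
    seg_reprs = [token_embs[idx].mean(axis=0) for idx in segments]
    return segments, seg_reprs
```

With a toy input whose first, third, and fourth tokens align with the query, the function returns two segments, `[0]` and `[2, 3]`, showing how discontinuous fragments of the source become separate segment representations for the decoder.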

Related research

- Microblog Hashtag Generation via Encoding Conversation Contexts (05/18/2019)
- Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection (01/30/2022)
- Applying the Transformer to Character-level Transduction (05/20/2020)
- Learning Source Phrase Representations for Neural Machine Translation (06/25/2020)
- Improvement of a dedicated model for open domain persona-aware dialogue generation (08/27/2020)
- Contrastive Triple Extraction with Generative Transformer (09/14/2020)
- FloorGenT: Generative Vector Graphic Model of Floor Plans for Robotics (03/07/2022)
