MeLT: Message-Level Transformer with Masked Document Representations as Pre-Training for Stance Detection

09/16/2021
by Matthew Matero, et al.

Much of natural language processing is focused on leveraging large-capacity language models, typically trained over single messages with the task of predicting one or more tokens. However, modeling human language at higher levels of context (i.e., sequences of messages) is under-explored. In stance detection and other social media tasks where the goal is to predict an attribute of a message, we have contextual data that is loosely semantically connected by authorship. Here, we introduce the Message-Level Transformer (MeLT) – a hierarchical message encoder pre-trained over Twitter and applied to the task of stance prediction. We focus on stance prediction as a task that benefits from knowing the context of the message (i.e., the sequence of previous messages). The model is trained using a variant of masked-language modeling: instead of predicting tokens, it seeks to generate an entire masked (aggregated) message vector via a reconstruction loss. We find that applying this pre-trained masked message-level transformer to the downstream task of stance detection achieves an F1 performance of 67.
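To make the pre-training objective concrete, the sketch below shows one way such a masked message-vector reconstruction could be implemented in PyTorch: a sequence of pre-aggregated message vectors is encoded by a transformer, one position is replaced with a learned mask vector, and the model is trained to reconstruct the original vector at that position. The layer sizes, pooling assumptions, and mean-squared-error loss here are illustrative choices, not the authors' exact MeLT configuration.

```python
import torch
import torch.nn as nn


class MaskedMessageTransformer(nn.Module):
    """Minimal sketch of a message-level masked-reconstruction objective.

    Inputs are sequences of aggregated message vectors (e.g., pooled token
    embeddings from a message encoder). One message per sequence is masked,
    and the model reconstructs its original vector. Hyperparameters are
    illustrative, not the paper's exact setup.
    """

    def __init__(self, dim=768, n_heads=8, n_layers=4):
        super().__init__()
        self.mask_vector = nn.Parameter(torch.zeros(dim))  # learned [MASK] message vector
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.reconstruct = nn.Linear(dim, dim)  # project hidden state back to message space

    def forward(self, msg_vectors, mask_idx):
        # msg_vectors: (batch, n_messages, dim); mask_idx: (batch,) index of message to mask
        batch = torch.arange(msg_vectors.size(0))
        x = msg_vectors.clone()
        x[batch, mask_idx] = self.mask_vector          # hide the target message
        hidden = self.encoder(x)                       # contextualize over the message sequence
        pred = self.reconstruct(hidden[batch, mask_idx])
        target = msg_vectors[batch, mask_idx]
        loss = nn.functional.mse_loss(pred, target)    # reconstruction loss on the masked vector
        return loss, pred


# Example usage with random message vectors (illustrative only)
model = MaskedMessageTransformer()
msgs = torch.randn(2, 10, 768)                         # 2 users, 10 messages each
mask_idx = torch.tensor([3, 7])                        # one masked message per sequence
loss, _ = model(msgs, mask_idx)
loss.backward()
```

In this sketch the sequence dimension is the user's message history rather than tokens within a message, which is the hierarchical aspect the abstract describes; the resulting contextual message representations could then be fine-tuned for a downstream classifier such as stance detection.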

