HICL: Hashtag-Driven In-Context Learning for Social Media Natural Language Understanding

08/19/2023
by Hanzhuo Tan, et al.

Natural language understanding (NLU) is integral to various social media applications. However, existing NLU models rely heavily on context for semantic learning, so their performance degrades on short and noisy social media content. To address this issue, we leverage in-context learning (ICL), wherein language models learn to make inferences by conditioning on a handful of demonstrations, to enrich the context, and we propose a novel hashtag-driven in-context learning (HICL) framework. Concretely, we pre-train a model, #Encoder, which employs #hashtags (user-annotated topic labels) to drive BERT-based pre-training through contrastive learning. Our objective is to enable #Encoder to incorporate topic-related semantic information, which allows it to retrieve topic-related posts that enrich contexts and enhance social media NLU with noisy contexts. To further integrate the retrieved context with the source text, we employ a gradient-based method to identify trigger terms useful in fusing information from both sources. For empirical studies, we collected 45M tweets to set up an in-context NLU benchmark, and the experimental results on seven downstream tasks show that HICL substantially advances the previous state-of-the-art results. Furthermore, we conducted extensive analyses and found that: (1) combining source input with a top-retrieved post from #Encoder is more effective than using semantically similar posts; (2) trigger words can largely benefit the merging of context from the source and retrieved posts.
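The retrieve-then-fuse idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `embed` is a toy bag-of-words stand-in for #Encoder (which the abstract describes as a BERT-based model pre-trained contrastively on hashtags), the fixed `trigger` string is a placeholder for the trigger terms that the paper identifies with a gradient-based method, and the function names and example posts are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for #Encoder: a bag-of-words count vector (assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top1(source, corpus):
    """Return the corpus post whose embedding is closest to the source post."""
    src_vec = embed(source)
    return max(corpus, key=lambda post: cosine(src_vec, embed(post)))


def fuse(source, retrieved, trigger="related topic:"):
    """Join source and retrieved post with a placeholder trigger phrase.

    The fixed trigger is hypothetical; the paper searches for trigger terms
    with a gradient-based method instead.
    """
    return f"{source} {trigger} {retrieved}"


# Hypothetical posts, for illustration only.
corpus = [
    "new phone camera review is out",
    "the team won the final match tonight",
    "stock markets fall after rate decision",
]
source = "what a match tonight"
enriched = fuse(source, retrieve_top1(source, corpus))
print(enriched)
```

The enriched string would then be passed to a downstream NLU model in place of the short source post alone. Replacing `embed` with a learned encoder changes which post is retrieved but not the overall flow.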

