UoB at SemEval-2020 Task 12: Boosting BERT with Corpus Level Information

08/19/2020
by Wah Meng Lim, et al.
Pre-trained language model word representations, such as BERT, have been extremely successful in several Natural Language Processing tasks, significantly improving on the state of the art. This can largely be attributed to their ability to better capture semantic information contained within a sentence. Several tasks, however, can benefit from information available at the corpus level, such as Term Frequency-Inverse Document Frequency (TF-IDF). In this work, we test the effectiveness of integrating this information with BERT on the task of identifying abuse on social media, and show that doing so does indeed significantly improve performance. We participate in Sub-Task A (abuse detection), wherein we achieve a score within two points of the top-performing team, and in Sub-Task B (target detection), wherein we are ranked 4th of the 44 participating teams.
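The abstract does not say how the corpus-level information is combined with BERT, so the following is only a minimal sketch under stated assumptions. It computes smoothed TF-IDF vectors in plain Python and concatenates one of them with a dense sentence vector. The concatenation step, the helper names `tfidf_vectors` and `combine`, the toy corpus, and the 4-dimensional placeholder standing in for a BERT sentence embedding are all illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for whitespace-tokenized documents."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF, so a term present in every document keeps a non-zero weight.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


def combine(sentence_embedding, tfidf_vector):
    """Assumed integration: concatenate the dense and corpus-level vectors."""
    return list(sentence_embedding) + list(tfidf_vector)


# Toy corpus (hypothetical examples, not task data).
docs = ["you are awful", "you are kind", "have a kind day"]
vocab, tfidf = tfidf_vectors(docs)

# Placeholder for a BERT sentence embedding (hypothetical values).
fake_embedding = [0.1, -0.2, 0.3, 0.0]
features = combine(fake_embedding, tfidf[0])
```

In this sketch, `features` would be passed to a downstream classifier. Terms that are rare across the corpus, such as "awful" here, receive a larger weight than terms shared by several documents, which is the kind of corpus-level signal a single-sentence encoder does not see directly.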

research
04/17/2019

DocBERT: BERT for Document Classification

Pre-trained language representation models achieve remarkable state of t...
research
04/28/2020

Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection

Nowadays, offensive content in social media has become a serious problem...
research
02/08/2022

HistBERT: A Pre-trained Language Model for Diachronic Lexical Semantic Analysis

Contextualized word embeddings have demonstrated state-of-the-art perfor...
research
07/28/2020

GUIR at SemEval-2020 Task 12: Domain-Tuned Contextualized Models for Offensive Language Detection

Offensive language detection is an important and challenging task in nat...
research
10/18/2020

Incorporating Count-Based Features into Pre-Trained Models for Improved Stance Detection

The explosive growth and popularity of Social Media has revolutionised t...
research
02/24/2022

Finding Inverse Document Frequency Information in BERT

For many decades, BM25 and its variants have been the dominant document ...
research
03/01/2021

Deep Bag-of-Sub-Emotions for Depression Detection in Social Media

This paper presents the Deep Bag-of-Sub-Emotions (DeepBoSE), a novel dee...
