
fBERT: A Neural Transformer for Identifying Offensive Content

by Diptanu Sarkar, et al.
Rochester Institute of Technology

Transformer-based models such as BERT, XLNET, and XLM-R have achieved state-of-the-art performance across various NLP tasks including the identification of offensive language and hate speech, an important problem in social media. In this paper, we present fBERT, a BERT model retrained on SOLID, the largest English offensive language identification corpus available with over 1.4 million offensive instances. We evaluate fBERT's performance on identifying offensive content on multiple English datasets and we test several thresholds for selecting instances from SOLID. The fBERT model will be made freely available to the community.
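The threshold-based selection described above can be sketched in a few lines: SOLID is a semi-supervised corpus in which each instance carries an aggregated model confidence, so candidate training instances can be filtered by requiring that confidence to meet a cutoff. The field names and threshold value below are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of threshold-based instance selection from a
# SOLID-style semi-supervised corpus. "avg_conf" (the aggregated
# offensive-confidence score) and the 0.5 cutoff are assumed names
# and values for illustration only.

def select_instances(rows, threshold=0.5):
    """Keep rows whose aggregated confidence meets the threshold."""
    return [r for r in rows if r["avg_conf"] >= threshold]

rows = [
    {"text": "example tweet A", "avg_conf": 0.9},
    {"text": "example tweet B", "avg_conf": 0.3},
    {"text": "example tweet C", "avg_conf": 0.6},
]

selected = select_instances(rows, threshold=0.5)
print(len(selected))  # prints 2: tweets A and C pass the cutoff
```

Sweeping the threshold trades corpus size against label reliability, which is why the paper evaluates several cutoffs rather than fixing one in advance.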



