DeepAI AI Chat
Log In Sign Up

Speech Detection Task Against Asian Hate: BERT the Central, While Data-Centric Studies the Crucial

06/05/2022
by   Xin Lian, et al.
0

With the epidemic continuing, hatred against Asians is intensifying in countries outside Asia, especially among the Chinese. Thus, there is an urgent need to detect and prevent hate speech toward Asians effectively. In this work, we first create COVID-HATE-2022, an annotated dataset that is an extension of the anti-Asian hate speech dataset on Twitter, including 2,035 annotated tweets fetched in early February 2022, which are labeled based on specific criteria, and we present the comprehensive collection of scenarios of hate and non-hate tweets in the dataset. Second, we fine-tune the BERT models based on the relevant datasets, and demonstrate strategies including 1) cleaning the hashtags, usernames being @, URLs, and emojis before the fine-tuning process, and 2) training with the data while validating with the "clean" data (and the opposite) are not effective for improving performance. Third, we investigate the performance of advanced fine-tuning strategies with 1) model-centric approaches, such as discriminative fine-tuning, gradual unfreezing, and warmup steps, and 2) data-centric approaches, which incorporate data trimming and data augmenting, and show that both strategies generally improve the performance, while data-centric ones outperform the others, which demonstrate the feasibility and effectiveness of the data-centric approaches.

READ FULL TEXT

page 1

page 2

page 3

page 4

04/12/2021

Fine-Tuning Transformers for Identifying Self-Reporting Potential Cases and Symptoms of COVID-19 in Tweets

We describe our straight-forward approach for Tasks 5 and 6 of 2021 Soci...
06/10/2020

Revisiting Few-sample BERT Fine-tuning

We study the problem of few-sample fine-tuning of BERT contextual repres...
12/07/2022

M3ST: Mix at Three Levels for Speech Translation

How to solve the data scarcity problem for end-to-end speech-to-text tra...
05/12/2022

Training Strategies for Own Voice Reconstruction in Hearing Protection Devices using an In-ear Microphone

In-ear microphones in hearing protection devices can be utilized to capt...