COLD: A Benchmark for Chinese Offensive Language Detection

01/16/2022
by Jiawen Deng, et al.

Offensive language detection and prevention are becoming increasingly critical for maintaining healthy social platforms and for the safe deployment of language models. Despite extensive research on toxic and offensive language in NLP, existing studies mainly focus on English, while few address Chinese because of limited resources. To facilitate Chinese offensive language detection and model evaluation, we collect COLDataset, a Chinese offensive language dataset containing 37k annotated sentences. With this high-quality dataset, we provide a strong baseline classifier, COLDetector, with 81 [...] utilize the proposed COLDetector to study the output offensiveness of popular Chinese language models (CDialGPT and CPM). We find that (1) CPM tends to generate more offensive output than CDialGPT, and (2) certain types of prompts, such as anti-bias sentences, can trigger offensive outputs more easily. Altogether, our resources and analyses are intended to help detoxify Chinese online communities and evaluate the safety performance of generative language models. Disclaimer: The paper contains example data that may be considered profane, vulgar, or offensive.

