Linguistic Characteristics of Censorable Language on SinaWeibo

07/10/2018
by Kei Yin Ng, et al.

This paper investigates censorship from a linguistic perspective. We collect a corpus of censored and uncensored posts on a number of topics and build a classifier that predicts censorship decisions independent of discussion topic. Our investigation reveals that the strongest linguistic indicator of censored content in our corpus is its readability.
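
To make the notion of a readability feature concrete, here is a minimal sketch of how surface-level readability proxies could be computed for a single post. It is not the paper's metric or feature set: the function name `readability_proxy`, the choice of sentence-final punctuation, and the two features (mean sentence length in characters and character type-token ratio) are all assumptions made for illustration.

```python
import re

# Sentence-final punctuation, Chinese and Western.
# This delimiter set is an assumption for the sketch, not taken from the paper.
_SENT_END = re.compile(r"[。！？!?.]+")


def readability_proxy(text: str) -> dict:
    """Return crude, hypothetical readability proxies for one post."""
    # Split on sentence-final punctuation and drop empty fragments.
    sentences = [s.strip() for s in _SENT_END.split(text) if s.strip()]
    # Count every non-whitespace character that remains inside a sentence.
    chars = [c for s in sentences for c in s if not c.isspace()]
    n_sent = len(sentences)
    return {
        "sentences": n_sent,
        # Longer sentences are commonly treated as harder to read.
        "mean_sentence_length": len(chars) / n_sent if n_sent else 0.0,
        # Share of distinct characters; a rough stand-in for lexical variety.
        "char_type_token_ratio": len(set(chars)) / len(chars) if chars else 0.0,
    }


features = readability_proxy("今天天气很好。我们去公园吧！")
empty_features = readability_proxy("")
```

Features of this kind could be fed to any standard classifier alongside other linguistic features; which features and which classifier the authors actually used is described in the full text, not here.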

