Classifying Constructive Comments

by Varada Kolhatkar, et al.

We introduce the Constructive Comments Corpus (C3), comprising 12,000 annotated news comments, intended to help build new tools for online communities to improve the quality of their discussions. We define constructive comments as high-quality comments that make a contribution to the conversation. We explain the crowd worker annotation scheme and define a taxonomy of sub-characteristics of constructiveness. We evaluate the quality of the annotation scheme and the resulting dataset using measurements of inter-annotator agreement, expert assessment of a sample, and the constructiveness sub-characteristics, which we show provide a proxy for the general constructiveness concept. We provide models for constructiveness trained on C3 using both feature-based and a variety of deep learning approaches, and demonstrate through domain adaptation experiments that these models capture general rather than topic- or domain-specific characteristics of constructiveness. We examine the role that length plays in our models, as comment length could be easily gamed if models depend heavily upon this feature. By examining the errors made by each model and their distribution by length, we show that the best performing models are less correlated with comment length. The constructiveness corpus and our experiments pave the way for a moderation tool focused on promoting comments that make a contribution, rather than only filtering out undesirable content.
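
The length analysis described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the word-count definition of length, and the bucket edges (25 and 100 words) are all assumptions chosen for illustration. It computes the Pearson correlation between comment length and a model's constructiveness scores, and the error rate per length bucket.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def length_correlation(comments, scores):
    """Correlate comment length (whitespace-separated word count, an
    assumption) with per-comment model scores."""
    lengths = [len(comment.split()) for comment in comments]
    return pearson(lengths, scores)


def error_rate_by_length(comments, gold, predicted, edges=(25, 100)):
    """Error rate per length bucket.

    With the hypothetical default edges, the buckets are
    [0, 25), [25, 100) and [100, inf) words. Empty buckets yield None.
    """
    totals = [0] * (len(edges) + 1)
    errors = [0] * (len(edges) + 1)
    for comment, g, p in zip(comments, gold, predicted):
        n_words = len(comment.split())
        bucket = sum(n_words >= edge for edge in edges)
        totals[bucket] += 1
        errors[bucket] += int(g != p)
    return [e / t if t else None for e, t in zip(errors, totals)]
```

A correlation near zero, together with error rates that stay roughly flat across buckets, would suggest that a model is not leaning heavily on length; a high correlation would suggest that padding a comment could inflate its score.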


