The Discussion Tracker Corpus of Collaborative Argumentation

05/22/2020
by Christopher Olshefski, et al.

Although Natural Language Processing (NLP) research on argument mining has advanced considerably in recent years, most studies draw on corpora of asynchronous and written texts, often produced by individuals. Few published corpora of synchronous, multi-party argumentation are available. The Discussion Tracker corpus, collected in American high school English classes, is an annotated dataset of transcripts of spoken, multi-party argumentation. The corpus consists of 29 multi-party discussions of English literature transcribed from 985 minutes of audio. The transcripts were annotated for three dimensions of collaborative argumentation: argument moves (claims, evidence, and explanations), specificity (low, medium, high) and collaboration (e.g., extensions of and disagreements about others' ideas). In addition to providing descriptive statistics on the corpus, we provide performance benchmarks and associated code for predicting each dimension separately, illustrate the use of the multiple annotations in the corpus to improve performance via multi-task learning, and finally discuss other ways the corpus might be used to further NLP research.
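
As a rough illustration of the three annotation dimensions described above, the sketch below models a single annotated unit and tallies labels along one dimension. The class name, field names, label strings, and toy utterances are all assumptions made for this example; they are not taken from the released corpus files, whose actual schema may differ.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional

# Label sets paraphrased from the abstract; the exact strings are assumed.
ARGUMENT_MOVES = ("claim", "evidence", "explanation")
SPECIFICITY_LEVELS = ("low", "medium", "high")


@dataclass(frozen=True)
class AnnotatedUnit:
    """One annotated stretch of talk (hypothetical schema)."""
    speaker: str
    text: str
    argument_move: str
    specificity: str
    # The abstract gives only examples of collaboration labels
    # (extensions, disagreements), so this field is left open-ended.
    collaboration: Optional[str] = None

    def __post_init__(self) -> None:
        if self.argument_move not in ARGUMENT_MOVES:
            raise ValueError(f"unknown argument move: {self.argument_move!r}")
        if self.specificity not in SPECIFICITY_LEVELS:
            raise ValueError(f"unknown specificity: {self.specificity!r}")


def label_counts(units: Iterable[AnnotatedUnit], dimension: str) -> Counter:
    """Count how often each label occurs along one annotation dimension."""
    return Counter(getattr(unit, dimension) for unit in units)


# Invented toy data, not drawn from the corpus.
toy_units = [
    AnnotatedUnit("S1", "I think the narrator is unreliable.", "claim", "medium"),
    AnnotatedUnit("S2", "On the first page he contradicts himself.",
                  "evidence", "high", "extension"),
    AnnotatedUnit("S3", "That shows we cannot trust his account.",
                  "explanation", "medium", "extension"),
]

move_counts = label_counts(toy_units, "argument_move")
specificity_counts = label_counts(toy_units, "specificity")
```

Keeping all three labels on the same record is what makes it straightforward to train one predictor per dimension or to feed several label sets to a shared model, as in the multi-task setting the abstract mentions.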
