From News to Medical: Cross-domain Discourse Segmentation

04/14/2019
by Elisa Ferracane, et al.

The first step in discourse analysis is dividing a text into segments. We annotate the first high-quality, small-scale medical corpus in English with discourse segments and analyze how well news-trained segmenters perform on this domain. As expected, performance drops, but the nature of the segmentation errors suggests that some problems can be addressed earlier in the pipeline, while others would require expanding the corpus to a trainable size so that a segmenter can learn the nuances of the medical domain.
