Neural Constituency Parsing of Speech Transcripts

04/17/2019
by Paria Jamshid Lou, et al.

This paper studies the performance of a neural self-attentive parser on transcribed speech. Speech presents parsing challenges that do not appear in written text, such as the lack of punctuation and the presence of speech disfluencies (including filled pauses, repetitions, and corrections). Disfluencies are especially problematic for conventional syntactic parsers, which typically fail to find any EDITED disfluency nodes at all. This motivated the development of special disfluency detection systems, and of special mechanisms added to parsers specifically to handle disfluencies. However, we show here that neural parsers can find EDITED disfluency nodes, and that the best neural parsers find them with an accuracy surpassing that of specialized disfluency detection systems, thus making these specialized mechanisms unnecessary. This paper also investigates a modified loss function that puts more weight on EDITED nodes, and describes tree transformations that simplify the disfluency detection task by providing alternative encodings of disfluencies and syntactic information.
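
To make the notion of an EDITED node concrete, the following is a minimal sketch, not the paper's code, of how disfluent words can be read off a Penn-Treebank-style bracketed parse: a word counts as disfluent if any of its ancestors is labeled EDITED. The function name `edited_flags` and the example tree are hypothetical, and the sketch assumes every opening bracket is immediately followed by a non-empty label.

```python
def edited_flags(tree: str):
    """Return (word, is_edited) pairs for a bracketed parse tree.

    Assumption: every "(" is followed by a non-empty constituent label.
    """
    tokens = tree.replace("(", " ( ").replace(")", " ) ").split()
    result = []
    stack = []  # labels of the constituents that are currently open
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            stack.append(tokens[i + 1])  # the label follows the bracket
            i += 2
        elif tok == ")":
            stack.pop()
            i += 1
        else:
            # A word is disfluent if any open ancestor is an EDITED node.
            result.append((tok, "EDITED" in stack))
            i += 1
    return result


# Hypothetical example: a repair where "I" is corrected to "we".
example = "(S (EDITED (NP (PRP I))) (NP (PRP we)) (VP (VBP want) (NP (DT a) (NN flight))))"
flags = edited_flags(example)
```

Word-level flags of this kind are what a disfluency detector is usually scored on, so the same function could be applied to both gold and predicted trees to compare them.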


research
10/08/2020

On the Role of Style in Parsing Speech with Neural Models

The differences in written text and conversational speech are substantia...
research
08/17/2020

Comparison of Syntactic Parsers on Biomedical Texts

Syntactic parsing is an important step in the automated text analysis wh...
research
02/23/2023

Prosodic features improve sentence segmentation and parsing

Parsing spoken dialogue presents challenges that parsing text does not, ...
research
12/15/2021

Penn-Helsinki Parsed Corpus of Early Modern English: First Parsing Results and Analysis

We present the first parsing results on the Penn-Helsinki Parsed Corpus ...
research
08/28/2018

Semantic Role Labeling for Learner Chinese: the Importance of Syntactic Parsing and L2-L1 Parallel Data

This paper studies semantic parsing for interlanguage (L2), taking seman...
research
09/28/2022

Data-driven Parsing Evaluation for Child-Parent Interactions

We present a syntactic dependency treebank for naturalistic child and ch...
