Multilingual Coreference Resolution in Multiparty Dialogue

08/02/2022
by Boyuan Zheng, et al.
Existing multiparty dialogue datasets for coreference resolution are nascent, and many challenges remain unaddressed. We create a large-scale dataset, Multilingual Multiparty Coref (MMC), for this task based on TV transcripts. Because gold-quality subtitles are available in multiple languages, we propose reusing the annotations to create silver coreference data in other languages (Chinese and Farsi) via annotation projection. On the gold (English) data, off-the-shelf models perform relatively poorly on MMC, suggesting that MMC has broader coverage of multiparty coreference than prior datasets. On the silver data, we find success both in using it for data augmentation and in training from scratch, which effectively simulates the zero-shot cross-lingual setting.
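
As a rough illustration of what annotation projection can look like, the sketch below maps coreference mention spans from a source sentence onto a target-language sentence through a token-level word alignment. It is a minimal sketch under stated assumptions and not the authors' pipeline: the alignment is assumed to be given (for example, produced by an external word aligner), and the names `project_span` and `project_clusters`, the span representation, and the rule of dropping unaligned mentions are all hypothetical choices made for this example.

```python
from typing import Dict, List, Optional, Tuple

# A mention is a pair of inclusive token indices (start, end).
Span = Tuple[int, int]
# Assumed input: source token index -> list of aligned target token indices.
Alignment = Dict[int, List[int]]


def project_span(span: Span, alignment: Alignment) -> Optional[Span]:
    """Map a source span to the smallest target span covering its aligned tokens."""
    start, end = span
    targets = [t for s in range(start, end + 1) for t in alignment.get(s, [])]
    if not targets:
        # Assumption for this sketch: a mention with no aligned token is dropped.
        return None
    return (min(targets), max(targets))


def project_clusters(
    clusters: List[List[Span]], alignment: Alignment
) -> List[List[Span]]:
    """Project every mention of every cluster; skip clusters that end up empty."""
    projected: List[List[Span]] = []
    for cluster in clusters:
        spans: List[Span] = []
        for mention in cluster:
            target = project_span(mention, alignment)
            # Skip unaligned mentions and avoid duplicate target spans.
            if target is not None and target not in spans:
                spans.append(target)
        if spans:
            projected.append(spans)
    return projected


# Hypothetical example: six source tokens, e.g. "The old man said he left".
# Source token 0 ("The") has no counterpart in the target sentence.
alignment = {1: [0], 2: [1], 3: [3], 4: [2], 5: [4]}
source_clusters = [[(0, 2), (4, 4)]]  # "The old man" and "he" corefer
silver_clusters = project_clusters(source_clusters, alignment)
```

Taking the minimum and maximum aligned index keeps the projected mention contiguous even when the alignment is sparse, at the cost of occasionally covering unrelated target tokens. This is one reason projected annotations are usually treated as silver rather than gold data.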
