Personalised Language Modelling of Screen Characters Using Rich Metadata Annotations

03/29/2023
by Sebastian Vincent, et al.

Personalisation of language models for dialogue sensitises them to the speaking patterns of people with specific characteristics and/or in specific environments. However, rich character annotations are difficult to come by and to leverage successfully. In this work, we release and describe a novel set of manual annotations for 863 speakers from the popular Cornell Movie Dialog Corpus, including features such as characteristic quotes and character descriptions, together with a set of six automatically extracted metadata for over 95% of the featured films. We perform extensive experiments on two corpora and show that such annotations can be used effectively to personalise language models, reducing perplexity by up to 8.5%. The approach also applies to speakers for whom no prior training data is available, by relying on combinations of characters' demographic characteristics. Since collecting such metadata is costly, we also contribute a cost-benefit analysis to highlight which annotations were most cost-effective relative to the reduction in perplexity.
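
For readers unfamiliar with the metric, the following is a minimal sketch of how perplexity and a relative perplexity reduction are commonly computed. It is not taken from the paper: the function names `perplexity` and `relative_reduction` are hypothetical, and the token probabilities are made-up illustrative values, not results from the experiments.

```python
import math


def perplexity(token_log_probs):
    """Perplexity as exp of the mean negative natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_reduction(baseline_ppl, personalised_ppl):
    """Percentage by which the personalised perplexity is lower than the baseline."""
    return 100.0 * (baseline_ppl - personalised_ppl) / baseline_ppl


# Made-up per-token probabilities for one utterance (illustration only).
baseline_log_probs = [math.log(p) for p in (0.10, 0.20, 0.05)]
personalised_log_probs = [math.log(p) for p in (0.12, 0.22, 0.06)]

baseline_ppl = perplexity(baseline_log_probs)
personalised_ppl = perplexity(personalised_log_probs)
reduction = relative_reduction(baseline_ppl, personalised_ppl)
print(f"baseline={baseline_ppl:.2f} personalised={personalised_ppl:.2f} "
      f"reduction={reduction:.1f}%")
```

Under this definition, a model that assigns higher probability to the tokens a character actually says has lower perplexity, so a positive reduction indicates that the metadata helped.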

