An Automated Pipeline for Character and Relationship Extraction from Readers' Literary Book Reviews on Goodreads.com

04/20/2020
by Shadi Shahsavari, et al.

Reader reviews of literary fiction on social media, especially those in persistent, dedicated forums, create and are in turn driven by underlying narrative frameworks. In their comments about a novel, readers generally mention only a subset of characters and their relationships, and so offer a limited perspective on that work. Yet in aggregate, these reviews capture an underlying narrative framework composed of different actants (people, places, things), their roles, and their interactions, which we label the "consensus narrative framework". We represent this framework as an actant-relationship story graph. Extracting this graph is a challenging computational problem, which we pose as a latent graphical model estimation problem: posts and reviews are viewed as samples of subgraphs of the hidden narrative framework. Inspired by the qualitative narrative theory of Greimas, we formulate a graphical generative Machine Learning (ML) model in which nodes represent actants, and multi-edges and self-loops among nodes capture context-specific relationships. We develop a pipeline of interlocking automated methods to extract key actants and their relationships, and apply it to thousands of reviews and comments posted on Goodreads.com. We manually derive the ground truth narrative framework from SparkNotes, and then use word embedding tools to compare relationships in the ground truth networks with those in our extracted networks. We find that our automated methodology generates highly accurate consensus narrative frameworks: for our four target novels, with approximately 2900 reviews per novel, we report average coverage/recall of important relationships of > 80%. These frameworks can generate insight into how people (or classes of people) read and how they recount what they have read to others.
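
The aggregation idea in the abstract can be illustrated with a toy sketch. This is not the authors' pipeline: the triples, the actant names, the `min_support` threshold, and the helper names `aggregate_reviews` and `edge_recall` are all hypothetical, and exact matching of actant pairs stands in for the word-embedding comparison the abstract describes. Each review is treated as a small sample of (actant, relationship, actant) triples, and relationships seen in enough reviews are merged into a graph that allows multi-edges and self-loops.

```python
from collections import Counter, defaultdict


def aggregate_reviews(reviews, min_support=2):
    """Merge per-review triples into a consensus multigraph.

    Returns a dict mapping (source, target) to the set of relationship
    labels supported by at least `min_support` reviews (an assumed cutoff).
    """
    support = Counter()
    for triples in reviews:
        # Count each triple at most once per review.
        for triple in set(triples):
            support[triple] += 1

    graph = defaultdict(set)
    for (src, rel, dst), count in support.items():
        if count >= min_support:
            graph[(src, dst)].add(rel)  # several labels per pair = multi-edges
    return dict(graph)


def edge_recall(ground_truth_pairs, graph):
    """Fraction of ground-truth actant pairs that appear in the graph."""
    if not ground_truth_pairs:
        return 0.0
    found = sum(1 for pair in ground_truth_pairs if pair in graph)
    return found / len(ground_truth_pairs)


# Hypothetical toy data: each inner list is what one review mentions.
reviews = [
    [("hero", "finds", "ring"), ("mentor", "guides", "hero")],
    [("hero", "finds", "ring"), ("hero", "steals", "ring")],
    [("mentor", "guides", "hero"), ("hero", "doubts", "hero")],  # self-loop
    [("hero", "steals", "ring"), ("hero", "doubts", "hero")],
    [("dragon", "guards", "gold")],  # mentioned once, below the cutoff
]

consensus = aggregate_reviews(reviews, min_support=2)
ground_truth = {("hero", "ring"), ("mentor", "hero"), ("hero", "dragon")}

print(consensus)
print(edge_recall(ground_truth, consensus))
```

No single review in the toy data contains the whole graph, yet the aggregate recovers two labelled edges between the same pair of actants and one self-loop, while the relationship mentioned only once is dropped.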


