Research on oral storytelling over the last 100 years has distinguished at least two levels of narrative representation: (1) story, or fabula: the content of a narrative in terms of the sequence of events and relations between them, the story characters and their traits and affects, and the properties and settings; and (2) discourse, or sujet: the actual expressive telling of a story as a stream of words, gestures, images or facial expressions in a storytelling medium [2, 22, 7, 19, 20]. In the telling of a narrative, events from the story are selected, ordered, and expressed in the discourse. We use this distinction to create Fabula Tales, a computational framework for a virtual storyteller that can tell the same story in different ways, using a set of general narratological variations, such as direct vs. indirect speech, character voice (style), point of view, and focalization.
We demonstrate the generality of our methods by applying them to both Aesop’s Fables and personal narratives from a pre-existing corpus of blogs. We hypothesize many advantages for a virtual storyteller who can repurpose existing stories. Stories such as The Startled Squirrel in Fig. 1 are created daily in the thousands and cover any topic imaginable. They are natural and personal, and may be funny, sad, heart-warming or serious. Applications for virtual storytellers who can retell these stories in different ways could include virtual companions, persuasion, educational storytelling, or sharing troubles in therapeutic settings [3, 24, 18, 9, 23]. Fig. 2 shows how Fabula Tales can shift from third person to first person automatically using content from The Startled Squirrel (Fig. 1). To our knowledge, this is the first time that these narratological variations have been implemented in a framework where the discourse (telling) is completely independent of the fabula (content) of the story.
This is one of those times I wish I had a digital camera. We keep a large stainless steel bowl of water outside on the back deck for Benjamin to drink out of when he’s playing outside. His bowl has become a very popular site. Throughout the day, many birds drink out of it and bathe in it. The birds literally line up on the railing and wait their turn. Squirrels also come to drink out of it. The craziest squirrel just came by- he was literally jumping in fright at what I believe was his own reflection in the bowl. He was startled so much at one point that he leap in the air and fell off the deck. But not quite, I saw his one little paw hanging on! After a moment or two his paw slipped and he tumbled down a few feet. But oh, if you could have seen the look on his startled face and how he jumped back each time he caught his reflection in the bowl!
|S1||The narrator placed the bowl on the deck in order for Benjamin to drink the bowl’s water. The bowl was popular. The birds drank the bowl’s water. The birds bathed themselves in the bowl. The birds organized themselves on the deck’s railing in order for the birds to wait.|
|S2||I approached the bowl. I was startled because I saw my reflection. I leaped because I was startled. I fell over the deck’s railing because I leaped because I was startled. I held the deck’s railing with my paw. My paw slipped off the deck’s railing. I fell.|
Sec. 2 describes how the deep structure of any narrative can be represented as a story intention graph, a generic model of the fabula. Sec. 3 describes our method for generating retellings of stories, and Sec. 4 describes two experimental evaluations. We delay discussion of related work to Sec. 5, where we can compare it to our own, and where we also sum up and discuss future work.
2 Repurposing Stories with Story Intention Graphs
|A Crow was sitting on a branch of a tree with a piece of cheese in her beak when a Fox observed her and set his wits to work to discover some way of getting the cheese. Coming and standing under the tree he looked up and said, “What a noble bird I see above me! Her beauty is without equal, the hue of her plumage exquisite. If only her voice is as sweet as her looks are fair, she ought without doubt to be Queen of the Birds.” The Crow was hugely flattered by this, and just to show the Fox that she could sing she gave a loud caw. Down came the cheese, of course, and the Fox, snatching it up, said, “You have a voice, madam, I see: what you want is wits.”|
Our framework builds on Elson’s representation of fabula, called a story intention graph, or sig. The sig allows many aspects of a story to be captured, including key entities, events and statives arranged in a timeline, and an interpretation of the overarching goals, plans and beliefs of the story’s agents. Fig. 4 shows part of the sig for The Startled Squirrel story in Fig. 1. Elson’s dramabank provides 36 Aesop’s Fables encoded as sigs, e.g. The Fox and the Crow in Fig. 3, and Elson’s annotation tool Scheherazade allows minimally trained annotators to develop a sig for any narrative. We hired an undergraduate linguist to use Scheherazade to produce sigs for 100 personal narratives. Each story took on average 45 minutes to annotate. The resulting 100 annotated stories cover topics such as travel, daily activities, storms, gardening, funerals, going to the doctor, camping, and snorkeling.
Scheherazade allows users to annotate a story along several dimensions, starting with the surface form, or discourse as shown in Fig. 4, and then proceeding to deeper representations. The second column in Fig. 4 is called the “timeline layer”, in which the story facts are encoded as predicate-argument structures (propositions) and temporally ordered on a timeline. The timeline layer consists of a network of propositional structures, where nodes correspond to lexical items that are linked by thematic relations. Scheherazade adapts information about predicate-argument structures from the VerbNet lexical database and uses WordNet as its noun and adjective taxonomy. The arcs of the story graph are labeled with discourse relations. Scheherazade also comes with a built-in realizer (referred to as sch in this paper) that the annotator can use to check their work. This realizer does not incorporate any narratological variations.
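As a rough illustration of the timeline layer described above, the following Python sketch models story facts as predicate-argument propositions that are kept in temporal order. The class names (`Proposition`, `Timeline`), the thematic role labels, and the integer time index are all hypothetical choices made for this example; they are not Scheherazade's actual data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Proposition:
    """One story fact: a predicate with thematically labeled arguments."""
    predicate: str              # verb lemma (hypothetical, VerbNet-style)
    arguments: Dict[str, str]   # thematic role -> entity (hypothetical labels)
    time: int                   # position on the story timeline


@dataclass
class Timeline:
    """Temporally ordered propositions: a toy stand-in for the timeline layer."""
    propositions: List[Proposition] = field(default_factory=list)

    def add(self, prop: Proposition) -> None:
        # Keep propositions sorted by their timeline position.
        self.propositions.append(prop)
        self.propositions.sort(key=lambda p: p.time)

    def events_for(self, entity: str) -> List[str]:
        """Predicates, in story order, in which `entity` fills any role."""
        return [p.predicate for p in self.propositions
                if entity in p.arguments.values()]


# Facts loosely based on The Startled Squirrel, added out of order.
timeline = Timeline()
timeline.add(Proposition("fall", {"theme": "squirrel"}, time=3))
timeline.add(Proposition("approach", {"agent": "squirrel", "destination": "bowl"}, time=1))
timeline.add(Proposition("leap", {"agent": "squirrel"}, time=2))
print(timeline.events_for("squirrel"))
```

Because the propositions are stored independently of any wording, the same timeline could in principle be handed to different realizers, which is the property the rest of the framework relies on.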
3 Generating Narratological Variations
Our framework can generate story re-tellings using methods that are neither genre nor domain-specific. We build Fabula Tales on two tools from previous work: personage and the ES-Translator [15, 21]. personage is an expressive natural language generation engine that takes as input the syntactic formalism of Deep Syntactic Structures (dsynts) [12, 10]. dsynts allow personage to be flexible in generation; however, creating dsynts has so far been a hand-crafted and time-consuming process. The ES-Translator (est) automatically bridges the narrative representation of the sig to the dsynts formalism by applying a model of syntax to the sig. The sig representation gives us direct access to the linguistic and logical representations of the fabula for each story, so the est can interpret the story in the dsynts formalism and retell it using different words or syntactic structures [21, 14].
dsynts are dependency structures where the nodes are labeled with lexemes and the arcs of the tree are labeled with syntactic relations. The dsynts formalism distinguishes between arguments and modifiers and between argument types (subject, direct and indirect object etc). personage handles morphology, agreement and function words to produce an output string.
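To make the description of dsynts more concrete, here is a minimal Python sketch of a dependency tree whose nodes carry lexemes and whose arcs carry syntactic relations. The class `DSyntNode` is hypothetical, and the relation labels (`I`, `II`, ... for arguments, `ATTR` for modifiers) are assumed here following the convention commonly used with the realizer cited in [12]; the actual structures used by personage may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed relation labels: numbered relations for arguments, ATTR for modifiers.
ARGUMENT_RELS = {"I", "II", "III", "IV"}
MODIFIER_RELS = {"ATTR"}


@dataclass
class DSyntNode:
    """A node labeled with a lexeme; `rel` labels the arc to its parent."""
    lexeme: str
    cls: str
    rel: Optional[str] = None   # None for the root of the tree
    children: List["DSyntNode"] = field(default_factory=list)

    def arguments(self) -> List["DSyntNode"]:
        return [c for c in self.children if c.rel in ARGUMENT_RELS]

    def modifiers(self) -> List["DSyntNode"]:
        return [c for c in self.children if c.rel in MODIFIER_RELS]


# "The crow sat on the branch": the subject is an argument of the verb,
# while the prepositional phrase is attached as a modifier.
sit = DSyntNode("sit", "verb", children=[
    DSyntNode("crow", "common_noun", rel="I"),
    DSyntNode("on", "preposition", rel="ATTR", children=[
        DSyntNode("branch", "common_noun", rel="II"),
    ]),
])

print([c.lexeme for c in sit.arguments()])
print([c.lexeme for c in sit.modifiers()])
```

Note that function words, agreement and morphology are absent from the tree; as stated above, those are left to the surface realizer.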
After the est applies syntax to the sig, it generates two data structures: text plans containing sentence plans and the corresponding dsynts. Thus any story or content represented as a sig can be retold using personage. Fig. 5 provides a high level view of the architecture of est. The full translation methodology is described in .
This paper incorporates the est pipeline (including sigs and personage) into the Fabula Tales computational framework and adds three narratological parameters into story generation:
Point of View: change the narration point of view to any character in a story in the first person voice (Sec. 3.1.)
Direct Speech: given any sig encoding that uses speech act verbs (e.g. said, told, asked, alleged), re-tell as direct speech or indirect speech (Sec. 3.2.)
Character Voice: Substitute different character voices using any character model expressible with personage’s 67 parameters (Sec. 3.3.)
Fig. 6 provides variations that combine these narratological parameters illustrating content from “The Fox and the Crow” and two additional stories: Conflict at Work, and The Embarrassed Teacher. B2 and C1 are examples of the original tellings and C2 is a sch realization.
|Direct Speech||A1||Fox and Crow||The crow sat on the tree’s branch. The cheese was in the crow’s pecker. The crow thought “I will eat the cheese on the branch of the tree because the clarity of the sky is so-somewhat beautiful.”|
|Direct Speech||B1||Conflict at Work||“The company requires the division to sign the document”, the director told the division. “Be expedient”, the director told the division.|
|Original||B2||Conflict at Work||The new director sent out an email noting the urgency of everyone signing, scanning, and formatting the signed and scanned contract into a PDF. He noted that it had to be done that very day (a Friday).|
|Original||C1||Embarrassed Teacher||I had taken the register and was standing at the front of the class doing some revision… However, all eyes were not on my face but at my ankles. Nervously I looked down to see that my underslip had somehow made its way to the floor. Elastic gone What to do?.|
|Sch||C2||Embarrassed Teacher||The narrator lifted the slip and inserted it into a bottom drawer of the desk. The narrator resumed teaching, and the group of students didn’t react.|
|Indirect Speech||A2||Fox and the Crow||The fox said the beauty of the bird was incomparable. The fox said the hue of the feather of the bird was exquisite.|
|Indirect Speech||B3||Conflict at Work||The narrator said if the director said the thing was urgent the narrator would need to be urgent. The narrator said the director was frivolous.|
|Character Voice||A3||Fox and the Crow||The fox alleged “your beauty is quite incomparable, okay?” The fox alleged “your feather’s chromaticity is damn exquisite.”|
|Character Voice||C3||Embarrassed Teacher||I stood at the classroom’s front. I no-noticed my ankle to be somewhat observed. I looked nervously toward my ankle. I glanced around the students.|
|Point of View||A4||Fox and the Crow||I sat on the tree’s branch. The cheese was in my beak. The fox observed me. The fox came. The fox stood under the tree. The fox looked toward me. The fox said he saw me.|
3.1 Point of View
From the deep syntactic structure in the format of dsynts, we can change the narration style from the third person perspective to the first person perspective of any character in the story (see example A4 in Fig. 6). We define simple rules to make this transformation within the dsynts itself, not at the sentence level. Table 1 shows the dsynts, which are represented as xml structures, for the sentence The crow flew herself to the window.
In order to transform the sentence into the first person, only simple changes to the deep structure are necessary. At lines 9 and 10 in Table 1, we assign the person attribute to 1st to specify a change of point of view to first person. The surface realizer in personage takes care of the transformations with its own rules, knowing to change whatever lexeme is present at line 9 simply to I, and to change the coreference resolutions at line 10 to myself. This is a major advantage of our computational framework: the deep linguistic representation allows us to specify changes we want without manipulating strings, and allows general rules for narratological parameters such as voice.
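The transformation just described can be sketched in a few lines of Python using the standard `xml.etree.ElementTree` module. This is only an illustration under stated assumptions: the helper name `to_first_person` is hypothetical, the XML is a well-formed variant of the Table 1 structure, and the step that turns the marked nodes into "I" and "myself" is left to the surface realizer and is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Well-formed variant of the third-person structure from Table 1.
DSYNTS = """
<dsyntnode class="verb" lexeme="fly">
  <dsyntnode class="common_noun" lexeme="crow" gender="fem"/>
  <dsyntnode class="common_noun" lexeme="crow" gender="fem" pro="pro"/>
  <dsyntnode class="preposition" lexeme="to">
    <dsyntnode class="common_noun" lexeme="window"/>
  </dsyntnode>
</dsyntnode>
"""


def to_first_person(root: ET.Element, character: str) -> int:
    """Mark every node whose lexeme is `character` as first person (in place).

    Returns the number of nodes that were changed.
    """
    changed = 0
    for node in root.iter("dsyntnode"):
        if node.get("lexeme") == character:
            node.set("person", "1st")
            changed += 1
    return changed


root = ET.fromstring(DSYNTS.strip())
n_changed = to_first_person(root, "crow")
print(n_changed)
print(ET.tostring(root, encoding="unicode"))
```

The rule touches only attributes of the deep structure, so the same function applies to any character in any story without string manipulation.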
3.2 Dialogue Realization
By default, speech acts in the sig are encoded as indirect speech. We automatically detect a speech act from its verb type in the WordNet online dictionary, and then transform it to a direct speech act (see A1, A2, B1, and B3 in Fig. 6). First we use WordNet to identify whether the main verb in a sentence is a verb of communication. Next, we break apart the dsynts into their tree structure (Fig. 8). For example, for utterance B1 in Fig. 6 we first identify the subject (director) and object (division) of the main verb of communication (tell). Then we identify the remainder of the tree (be is the root verb), which is what is to be uttered, and split it off from its parent verb of communication node, thus creating two separate dsynts (Fig. 8). In personage, we create a direct speech text plan that realizes the explanatory clause in the default narrator style and the utterance in a specified character voice, and that inserts the quotation marks appropriately. We can then realize direct speech as “Utterance” said X. or X said “utterance.”
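The splitting step above can be sketched as follows. This is a simplified Python illustration, not the est implementation: the small `COMMUNICATION_VERBS` set stands in for the WordNet lookup, and the `Node` class and its role labels (`subject`, `object`, `content`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical stand-in for a WordNet check for verbs of communication.
COMMUNICATION_VERBS = {"say", "tell", "ask", "allege"}


@dataclass
class Node:
    lexeme: str
    role: str = "root"   # hypothetical labels: "subject", "object", "content"
    children: List["Node"] = field(default_factory=list)


def split_speech_act(root: Node) -> Optional[Tuple[Node, Node]]:
    """Split a speech-act tree into (framing clause, utterance).

    Returns None when the root is not a verb of communication or when it
    has no content subtree to split off.
    """
    if root.lexeme not in COMMUNICATION_VERBS:
        return None
    content = [c for c in root.children if c.role == "content"]
    if not content:
        return None
    uttered = content[0]
    # The framing clause keeps the verb with its subject and object.
    frame = Node(root.lexeme, root.role,
                 [c for c in root.children if c is not uttered])
    # The uttered subtree becomes the root of its own structure.
    utterance = Node(uttered.lexeme, "root", uttered.children)
    return frame, utterance


# Based on B1 in Fig. 6: the director told the division to be expedient.
tree = Node("tell", children=[
    Node("director", "subject"),
    Node("division", "object"),
    Node("be", "content", [Node("expedient", "attribute")]),
])
frame, utterance = split_speech_act(tree)
print(frame.lexeme, [c.lexeme for c in frame.children])
print(utterance.lexeme, [c.lexeme for c in utterance.children])
```

Keeping the two halves as separate structures is what allows each to be realized with a different voice model, with the quotation marks added when the two are recombined.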
“The crow flew herself to the window”
1 <dsyntnode class="verb" lexeme="fly">
2   <dsyntnode class="common_noun" lexeme="crow" gender="fem"/>
3   <dsyntnode class="common_noun" lexeme="crow" gender="fem" pro="pro"/>
4   <dsyntnode class="preposition" lexeme="to">
5     <dsyntnode class="common_noun" lexeme="window"/>
6   </dsyntnode>
7 </dsyntnode>
“I flew myself to the window”
8 <dsyntnode class="verb" lexeme="fly">
9   <dsyntnode class="common_noun" lexeme="crow" gender="fem" person="1st"/>
10   <dsyntnode class="common_noun" lexeme="crow" gender="fem" pro="pro" person="1st"/>
11   <dsyntnode class="preposition" lexeme="to">
12     <dsyntnode class="common_noun" lexeme="window"/>
13   </dsyntnode>
14 </dsyntnode>
3.3 Character Voice
The main advantage of personage is its ability to generate a single utterance in many different voices. Models of narrative style are currently based on the Big Five personality traits, or are learned from film scripts. Each type of model (personality trait or film) specifies values for a set of language cues, drawn from 67 different parameters, whose values vary with the personality or style to be conveyed. Previous work has shown that humans perceive the personality stylistic models in the way that personage intended, and that character utterances in a new domain can be recognized by humans as based on a particular film character.
After adding new rules to Fabula Tales to handle direct speech, we modified the original sig representation of the Fox and the Crow to contain more dialogue in order to evaluate a broader range of character styles, along with the use of direct speech. Table 2 shows a subset of the parameters used in the three personality models we tested here: the laid-back model for the fox’s direct speech, the shy model for the crow’s direct speech, and the neutral model for the narrator voice. The laid-back model uses emphasizers, hedges, exclamations, and expletives, whereas the shy model uses softener hedges, stuttering, and filled pauses. The neutral model is the simplest: it does not use any of the extremes of the personage parameters.
|Model||Parameter||Description||Example|
|Shy||Softener hedges||Insert syntactic elements (sort of, kind of, somewhat, quite, around, rather, I think that, it seems that, it seems to me that) to mitigate the strength of a proposition||‘It seems to me that he was hungry’|
|Shy||Stuttering||Duplicate parts of a content word||‘The vine hung on the tr-trellis’|
|Shy||Filled pauses||Insert syntactic elements expressing hesitancy (I mean, err, mmhm, like, you know)||‘Err… the fox jumped’|
|Laid Back||Emphasizer hedges||Insert syntactic elements (really, basically, actually) to strengthen a proposition||‘The fox failed to get the group of grapes, alright?’|
|Laid Back||Exclamation||Insert an exclamation mark||‘The group of grapes hung on the vine!’|
|Laid Back||Expletives||Insert a swear word||‘The fox was damn hungry’|
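The effect of a few of these parameters can be imitated with a toy Python sketch. Note the important caveat: personage applies its parameters to deep syntactic structures, whereas the functions below operate on surface strings purely for illustration. All function names are hypothetical, and the word to stutter on or to intensify is chosen by hand rather than by any model.

```python
from typing import Callable, List

Transform = Callable[[str], str]


def stutter_on(word: str, n: int = 2) -> Transform:
    """Duplicate the first n letters of `word`, e.g. 'trellis' -> 'tr-trellis'."""
    return lambda s: s.replace(word, f"{word[:n]}-{word}", 1)


def soften(hedge: str = "It seems to me that") -> Transform:
    """Prefix a softener hedge and lowercase the sentence's first letter."""
    return lambda s: f"{hedge} {s[0].lower()}{s[1:]}"


def expletive_before(adjective: str, expletive: str = "damn") -> Transform:
    """Insert an expletive directly before the first occurrence of `adjective`."""
    return lambda s: s.replace(adjective, f"{expletive} {adjective}", 1)


def exclaim() -> Transform:
    """Replace a sentence-final period with an exclamation mark."""
    return lambda s: s.rstrip(".") + "!"


def apply_voice(sentence: str, transforms: List[Transform]) -> str:
    """Apply each transformation of a voice model in order."""
    for transform in transforms:
        sentence = transform(sentence)
    return sentence


shy = apply_voice("The vine hung on the trellis.", [stutter_on("trellis")])
hedged = apply_voice("He was hungry.", [soften()])
laid_back = apply_voice("The fox was hungry.",
                        [expletive_before("hungry"), exclaim()])
print(shy)
print(hedged)
print(laid_back)
```

Even this toy version shows why string-level rules are brittle: `soften` would wrongly lowercase a sentence-initial proper noun, which is one reason to specify such variations on the deep structure instead.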
C3 in Fig. 6 provides an example of Fabula Tales rendering a story in a single voice for The Embarrassed Teacher. We tell the story from her point of view and give her an introverted voice. We also show that we can specify voices for characters in dialogue as in the Fable excerpt in A3 in Fig. 6. The Fabula Tales system allows multiple personalities to be loaded and assigned to characters so that personage runs once, fully automatically, and in real time.
4 Experimental Results
We present two experiments showing that the flexibility of the est, combined with our narratological parameters in Fabula Tales, allows us to manipulate the perception of characters as well as story engagement and interest. We first present The Fox and the Crow with variations on direct speech and voice, followed by Embarrassed Teacher with variations on voice and point of view.
4.1 Perceptions of Voice and Direct Speech
We collect user perceptions of The Fox and the Crow generated with direct speech and with different personality models (character voices) for each speech act. A3 in Fig. 6 is an excerpt of a dialogic variation with character voice. The dialogic story is told 1) only with the neutral model; 2) with the crow as shy and the fox as laid-back; and 3) with the crow as laid-back and the fox as shy.
Subjects are given a free text box and asked to enter as many words as they wish to use to describe the characters in the story. Table 3 shows the percentage of positive and negative descriptive words when categorized by LIWC. Descriptive words include “clever” and “sneaky” for the laid-back and neutral fox, and “shy” and “wise” for the shy fox. The laid-back and neutral crow was perceived as “naïve” and “gullible”, whereas the shy crow was perceived as more “stupid” and “foolish”.
Overall, the crow’s shy voice is perceived as more positive than the crow’s neutral voice (ttest(12) = -4.38, p < 0.0001) and the crow’s laid-back voice (ttest(12) = -6.32, p < 0.0001). We hypothesize that this is because the stuttering and hesitations make the character seem more helpless and tricked, in contrast to the laid-back model, which is more boisterous. However, there is less variation in polarity for the fox: both the stuttering shy fox and the boisterous laid-back fox were seen equally as “cunning” and “smart”. Although we do not observe a difference for all characters, there is enough evidence to warrant further investigation of how reader perceptions change when the same content is realized in different voices.
4.2 Perceptions of Voice and POV
In this experiment, we aim to see how different points of view and voices affect reader engagement and interest. We present readers with a one sentence summary of the Embarrassed Teacher story and 6 retellings of a sentence from the story, framed as “possible excerpts that could come from this summary”. We show retellings of a sentence from Embarrassed Teacher in first person neutral, first person shy, first person laid-back, third person neutral, the original story, and sch. We ask participants to rate each excerpt for their interest in wanting to read more of the story based on the style and information given in the excerpt, and to indicate their engagement with the story given the excerpt.
Means (M) and standard deviation (SD) for engagement and interest for original sentences and all variations in Perceptions of Voice and POV Experiment
Fig. 9 shows the means and standard deviation for engagement and interest ratings. We find a clear ranking for engagement: the original sentence is scored highest, followed by first outgoing, first neutral, first shy, sch, and third neutral.
shows the average engagement and interest for all the sentences. For engagement, paired t-tests show a significant difference between original and first outgoing (ttest(94) = -3.99, p < 0.0001), first outgoing and first shy (ttest(94) = 3.71, p < 0.0001), and first shy and sch (ttest(94) = 5.60, p < 0.0001). However, there are no differences between first neutral and first outgoing (ttest(95) = -1.63, p 0.05), and sch and third neutral (ttest(94) = -0.31, p 0.38). We also performed an ANOVA and found a significant effect of style (F(1) = 224.24, p 0), sentence (F(9) = 5.49, p 0), and an interaction between style and sentence (F(9) = 1.65, p 0.1).
For interest, we find the same ranking: the original sentence, first outgoing, first neutral, first shy, sch, and third neutral. Paired t-tests for interest show a significant difference between original and first outgoing (ttest(93) = 5.59, p < 0.0001), and first shy and sch (ttest(93) = 6.16, p < 0.0001). There is no difference between first outgoing and first neutral (ttest(93) = 0, p 0.5), first neutral and first shy (ttest(93) = 2.20, p 0.01), and sch and third neutral (ttest(93) = 0.54, p 0.29). We also performed an ANOVA and found a significant effect of style (F(1) = 204.08, p 0), sentence (F(9) = 7.32, p 0), and no interaction between style and sentence (F(9) = 0.64, p 1).
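For readers unfamiliar with the paired t-tests reported above, the following Python sketch shows how the statistic and its degrees of freedom are computed from two sets of ratings given by the same participants. The ratings below are invented placeholder values for illustration only; they are not data from this study, and the function name `paired_t` is hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t statistic, degrees of freedom) for a paired t-test.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the per-participant
    differences a[i] - b[i] and sd is the sample standard deviation.
    """
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Placeholder ratings (NOT from the experiment): five hypothetical
# participants each rate a first-person and a third-person excerpt.
first_person = [5, 4, 4, 3, 5]
third_person = [3, 3, 4, 2, 3]

t, df = paired_t(first_person, third_person)
print(f"ttest({df}) = {t:.2f}")
```

In this notation the number in parentheses is the degrees of freedom, one less than the number of paired ratings.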
We also find qualitative evidence that readers’ interest and engagement in a story differ depending only upon the style. Readers preferred to read this story in the first person: “[the] immediacy of first person … excerpts made me feel I was there”, “I felt as though those that had more detail and were from a personal perspective were more engaging and thought evoking versus saying the narrator did it”, and “I felt more engaged and interested when I felt like the narrator was speaking to me directly, as I found it easier to imagine the situation.” This further supports our hypothesis that using our framework to change POV will affect reader perceptions.
Readers also identified differences in the style of the voice. Two readers commented about first outgoing: “The ‘oh I resumed…’ Feels more personal and is more engaging” and “curse words are used to express the severity of the situation wisely”. About first shy, “Adding the feeling of nervousness and where she looked made sense”. This suggests that certain styles of narration are more appropriate or preferred than others given the context of the story.
5 Discussion and Future Work
We introduce Fabula Tales, a computational framework for story generation that produces narratological variations of the same story from the fabula. We present examples showing that the capability we have developed is general, and can be applied to informal personal narratives. We present experiments showing that these novel narratological parameters lead to different perceptions of the story. Our approach builds on previous work that focused on generating variations of Aesop’s Fables such as The Fox and the Crow; however, this previous work did not carry out perceptual studies.
Previous work has dubbed the challenges of generating different story tellings from fabula the NLG gap: an architectural disconnect between narrative generation (fabula) and natural language generation (sujet) [13, 4]. To our knowledge, there are only two previous lines of research that address the NLG gap. The storybook generator is an end-to-end narrative prose generation system that utilizes a primitive narrative planner along with a generation engine to produce stories in the Little Red Riding Hood fairy tale domain. This work manipulates NLG parameters such as lexical choice and syntactic structure, as well as narratological parameters such as person and focalization and the choice of whether to realize dialogue as direct or indirect speech. Similarly, the IF system can generate multiple variations of text in an interactive fiction (IF) environment. The IF system (and its successor Curveship) uses a world simulator as the fabula, and renders narrative variations, such as different focalizations or temporal orders. However, storybook can only generate stories in the domain of Little Red Riding Hood, and IF can only generate stories in its interactive fiction world. Other work implements narratological variations in the story planner and does not attempt to bridge the NLG gap.
In future work, we aim to further develop Fabula Tales and to test in more detail the perceptual effects of narratological variations on user interpretations of a story. Furthermore, we hope to learn when certain styles are preferred given the context in the sig.
Acknowledgments This research was supported by NSF Creative IT program grant #IIS-1002921, and a grant from the Nuance Foundation.
References
- [1] B.C. Bae, Y.G. Cheong, and R.M. Young. Toward a computational model of focalization in narrative. In Proc. of the 6th Int. Conf. on Foundations of Digital Games, pages 313–315. ACM, 2011.
- [2] M. Bal and E. Tavor. Notes on narrative embedding. Poetics Today, pages 41–59, 1981.
- [3] T.W. Bickmore. Relational agents: Effecting change through human-computer relationships. PhD thesis, MIT Media Lab, 2003.
- [4] C.B. Callaway and J.C. Lester. Narrative prose generation. Artificial Intelligence, 139(2):213–252, 2002.
- [5] D. Elson. Modeling Narrative Discourse. PhD thesis, 2012.
- [6] C. Fellbaum. Wordnet: An electronic lexical database. 1998. WordNet is available from http://www.cogsci.princeton.edu/wn, 2010.
- [7] G. Genette. Nouveau discours du récit. Éd. du Seuil, 1983.
- [8] A. Gordon and R. Swanson. Identifying personal stories in millions of weblog entries. In Third Int. Conf. on Weblogs and Social Media, Data Challenge Workshop, San Jose, CA, 2009.
- [9] J. Gratch, L.P. Morency, S. Scherer, G. Stratou, J. Boberg, S. Koenig, T. Adamson, A. Rizzo. User-state sensing for virtual health agents and telehealth applications. Studies in health technology and informatics, 184:151–157, 2012.
- [10] A. Mel’čuk. Dependency syntax: theory and practice. SUNY Press, 1988.
- [11] K. Kipper, A. Korhonen, N. Ryant, and M. Palmer. Extensive classifications of english verbs. In Proc. of the 12th EURALEX Int. Congress, pages 1–15, 2006.
- [12] B. Lavoie and O. Rambow. A fast and portable realizer for text generation systems. In Procs of the 5th conference on Applied natural language processing, pages 265–268. ACL, 1997.
- [13] B. Lönneker. Narratological knowledge for natural language generation. In Proc. of the 10th European Workshop on Natural Language Generation (ENLG-05), pages 91–100. Citeseer, 2005.
- [14] S.M. Lukin, J.O. Ryan, and M.A. Walker. Automating direct speech variations in stories and games. 2014.
- [15] F. Mairesse and M.A. Walker. Controlling user perceptions of linguistic style: Trainable generation of personality traits. Computational Linguistics, 2011.
- [16] N. Montfort. Generating narrative variation in interactive fiction. University of Pennsylvania, 2007.
- [17] J.W. Pennebaker, M.E. Francis, and R.J. Booth. Linguistic inquiry and word count: Liwc 2001. Mahway: Lawrence Erlbaum Associates, 71:2001, 2001.
- [18] J.W. Pennebaker and J.D. Seagal. Forming a story: The health benefits of narrative. Journal of clinical psychology, 55(10):1243–1254, 1999.
- [19] G. Prince. A Grammar of Stories: An Introduction. Number 13. Walter de Gruyter, 1973.
- [20] V.I. Propp. Morphology of the Folktale, volume 9. University of Texas Press, 1968.
- [21] E. Rishes, S. Lukin, D.K. Elson, and M.A. Walker. Generating different story tellings from semantic representations of narrative. In Int. Conf. on Interactive Digital Storytelling, ICIDS’13, 2013.
- [22] V. Shklovsky. Theory of prose. Dalkey Archive Press, 1991.
- [23] M.D. Slater and D. Rouner. Entertainment education and elaboration likelihood: Understanding the processing of narrative persuasion. Communication Theory, 12(2):173–191, 2002.
- [24] D. Traum, A. Roque, A. L. P. Georgiou, J. Gerten, B. M. S. Narayanan, S. Robinson, and A. Vaswani. Hassan: A virtual human for tactical questioning. In Proc. of SIGDial, 2007.
- [25] M.A. Walker, R. Grant, J. Sawyer, G.I. Lin, N. Wardrip-Fruin, and M. Buell. Perceived or not perceived: Film character models for expressive nlg. In Int. Conf. on Interactive Digital Storytelling, ICIDS’11, 2011.