1 Introduction
From automated teller machines (ATMs), airport and train kiosks, and apps for smartphones and similar devices to wizards (e.g., Microsoft Word) and intelligent tutoring/training systems (e.g., SAT, Rosetta Stone, military), human-computer dialogs are woven into the fabric of our daily interactions with computer systems. (Footnote 1: A dialog in this context refers to any series of interactions between a user and a computer system, not necessarily through a verbal modality. For instance, a user completing an online mortgage application participates in a human-computer dialog.) While supporting flexibility in dialog is essential to deliver a natural experience to the user, it makes the implementation challenging due to the numerous directions in which a user might desire to steer a dialog, all of which must be captured in an implementation. This problem is difficult since dialogs range in complexity from those modeled after a simple predefined series of questions and answers to those which give the user a great deal of control over the direction in which to steer the dialog. In this article, we discuss a computational model, based on concepts from programming languages, and especially partial evaluation, for specifying and staging dialogs. The objective of this work is to enrich and demonstrate the feasibility of an alternate method of modeling human-computer dialogs.
This article is organized as follows. To introduce the reader to the wide range of dialogs possible, and provide a better feel for this problem and its difficulty, we first present some illustrative examples of dialogs. In this section we simply showcase a variety of dialogs and describe how they can be specified using formal notation, rather than discuss their implementation and related issues which are covered later. In Section 2 we describe a programming languages notation for specifying dialogs. Section 3 demonstrates how dialogs can be staged with partial evaluation, while Section 4 outlines how to mine dialog specifications in a programming languages notation from dialog requirements, and how we automatically generate stagers from those specifications in programming languages notation. Section 5 summarizes our contributions and discusses future work.
1.1 Fixed and Mixed-initiative Dialogs
Consider a dialog to purchase gasoline using a credit card. The customer must first swipe the card, then choose a grade (of octane), and finally indicate whether they desire a receipt. Such a dialog is said to be a fixed dialog due to the fixed order of the questions, from which the user is not permitted to deviate in their responses [1]. An enumerated specification of this dialog is {credit-card grade receipt}. An enumerated specification is a set of episodes, and an episode is an ordered list of questions to be posed and answered from the start of the dialog through dialog completion. Intuitively, a specification is a complete set of all possible ways to complete a dialog. We can think of a dialog specification as a set of totally ordered sets or chains. We use a Hasse diagram, a graphical and concise depiction of a partially ordered set, to represent a dialog specification. A relation $R$ together with the set $S$ over whose Cartesian product $R$ is defined is a partially ordered set (or poset) if $R$ is a reflexive, antisymmetric, and transitive relation. This means that some of the elements of $S$ may be unordered based on the relation $R$. On the other hand, a set $S$ is a totally ordered set according to a relation $R$ iff for every two elements $x, y \in S$, $x R y$ or $y R x$. Every totally ordered set is also a partially ordered set, but the reverse is not necessarily true.
Fig. 1a illustrates the Hasse diagram for this gas dialog specification. A Hasse diagram is read bottom-up. Here, the set $S$ of the poset is the set of the questions posed in the dialog and the relation $R$ of the poset is the ‘must be answered before’ relation, denoted with an upward arrow between the source and target of the arrow.
Flexible dialogs typically support multiple completion paths. For instance, consider a dialog for ordering coffee. The participant must select a size and blend, and indicate whether room for cream is desired. Since possible responses to these questions are completely independent of each other, the dialog designer may wish to permit the participant to communicate the answers in any combination and in any order. For example, some customers may prefer to use a size blend cream episode:
System: Please select a size.
User: Small.
System: Please select a blend.
User: Dark.
System: Please specify whether you want room for cream.
User: No.
while others may use a blend cream size episode:
System: Please select a size.
User: Mild.
System: Okay mild blend. Please select a size.
User: With room for cream.
System: Okay with room for cream. Please select a size.
User: Large.
while still others might like to use a (size blend) cream episode, where answers to the questions enclosed in parentheses must be communicated in a single utterance:
System: Please select a size.
User: Small, french roast.
System: Please specify whether you want room for cream.
User: No.
Therefore, to accommodate all possibilities we specify this dialog as:
{(size blend cream),  (size blend) cream,  cream (size blend), 
(blend cream) size,  size (blend cream),  (size cream) blend, 
blend (size cream),  size blend cream,  size cream blend, 
blend size cream,  blend cream size,  cream blend size, 
cream size blend}. 
Notice that this specification indicates that answers, to the set of questions in the dialog, may be communicated in utterances corresponding to all possible set partitions of the set of questions. Moreover, all possible permutations of those partitions are specified as well. The Hasse diagram for this dialog is given in Fig. 1e. The absence of arrows between the size, blend, cream, (size blend), (size cream), (blend cream), and (size blend cream) elements indicates that the times at which each of those utterances may be communicated are unordered. Notice that the Hasse diagram is a compressed (and, thus, optimal) representation capturing the requirements in the specification. Moreover, the compression is lossless (i.e., the episodes in the enumerated specification may be reconstructed from the diagram).
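The enumeration above can be reproduced mechanically. The following is a minimal sketch in Scheme, not code from the article: the procedure names (insert-everywhere, add-to-each, complete-mixed-initiative) are hypothetical, an episode is modeled as a list of utterances, and an utterance as a list of questions. It builds every ordered set partition of the questions by adding one question at a time, either as a new utterance at any position or into an existing utterance.

```scheme
;; Hypothetical helpers; names and representation are assumptions.
;; Episode = list of utterances; utterance = list of questions.

;; Insert utterance u at every position of episode e.
(define (insert-everywhere u e)
  (if (null? e)
      (list (list u))
      (cons (cons u e)
            (map (lambda (rest) (cons (car e) rest))
                 (insert-everywhere u (cdr e))))))

;; Add question q to each existing utterance of episode e, one at a time.
(define (add-to-each q e)
  (if (null? e)
      '()
      (cons (cons (cons q (car e)) (cdr e))
            (map (lambda (rest) (cons (car e) rest))
                 (add-to-each q (cdr e))))))

;; All permutations of all partitions of the questions.
(define (complete-mixed-initiative questions)
  (if (null? questions)
      '(())
      (let ((q (car questions)))
        (apply append
               (map (lambda (e)
                      (append (insert-everywhere (list q) e)
                              (add-to-each q e)))
                    (complete-mixed-initiative (cdr questions)))))))

(define coffee (complete-mixed-initiative '(size blend cream)))
;; (length coffee) evaluates to 13, matching the specification above.
```

Each episode is generated exactly once because removing the first question from an episode leaves a unique smaller episode, so the construction does not need a duplicate check.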
Giving the user more flexibility in how to proceed through a dialog increases the number of episodes in its enumerated specification. This coffee ordering dialog is a mixed-initiative dialog [1]. There are multiple tiers of mixed-initiative interaction. The tier considered in this article is called unsolicited reporting, an interaction strategy where, in response to a question, at any point in the dialog, the user may provide an unsolicited response to a forthcoming question.
Table 1: Practical, everyday dialogs along the permutations and partitions axes.
 | Only a single response per utterance | Multiple responses per utterance
Only one utterance | Confirmation dialog boxes common in application software | Online forms with multiple fields
Totally ordered | Purchasing gasoline with a credit card; buying beverages from a vending machine | Providing a telephone, credit card, or PIN number through voice
Partially ordered | ATMs, and airport or train kiosks | Ordering a coffee or pizza
When all possible permutations (i.e., orders) of all possible partitions (i.e., combinations) of responses to questions are supported, we call the dialog a complete, mixed-initiative dialog. Fig. 1 represents a space from fixed dialogs to complete, mixed-initiative dialogs, encompassing a wide variety of possible unsolicited reporting, mixed-initiative dialogs. Table 1 identifies some practical, everyday dialogs which fall into the cross product of permutations and partitions of responses to questions.
Fig. 2 provides an overview of this research project. We start with an enumerated dialog specification (i.e., a set of episodes) and mine it for a compressed representation of the dialog in a programming languages notation which captures the requirements of the dialog (transition from the left to the center of Fig. 2), a process we call dialog mining. From that intermediate, implementation-neutral representation we automatically generate a dialog stager capable of realizing the dialog or, in other words, staging the interaction (transition from the center to the right of Fig. 2).
1.2 Spectrum of Dialogs
Fixed and complete, mixed-initiative dialogs each represent an opposite end of this spectrum of unsolicited reporting dialogs as shown in Fig. 1. There are several dialogs between those two ends. For instance, consider a specification for an ATM dialog where PIN and amount must be entered first and last, respectively, but the transaction type (deposit or withdrawal) and account type (checking or savings) can be communicated in any order (see Fig. 1b):
{PIN transaction account amount, PIN account transaction amount}.
This dialog contains an embedded, mixed-initiative subdialog [1].
Alternatively, consider a dialog for ordering lunch where requesting a receipt or indicating whether you are dining in or taking out can be communicated either first or last, but specification of sandwich and beverage must occur in that order:
{receipt sandwich beverage dine-in/take-out, dine-in/take-out sandwich beverage receipt}.
This dialog contains an embedded, fixed subdialog and, unlike the prior examples, cannot be captured by a single poset (see Fig. 1c).
Lastly, consider a dialog containing two embedded, complete, mixed-initiative subdialogs [16] (see Fig. 1d):
{cream sugar eggs toast,  cream sugar toast eggs,  (cream sugar) toast eggs, 
(cream sugar) eggs toast,  sugar cream eggs toast,  sugar cream toast eggs, 
eggs toast cream sugar,  eggs toast sugar cream,  toast eggs cream sugar, 
toast eggs sugar cream,  sugar cream (eggs toast),  cream sugar (eggs toast), 
(eggs toast) (cream sugar),  (cream sugar) (eggs toast)}. 
Here, the user can specify coffee and breakfast choices in any order, and can specify the subparts of coffee and breakfast in any order, but cannot mix the atomic responses of the two (i.e., episodes such as cream eggs sugar toast are not permitted).
There are two assumptions we make on this spectrum of unsolicited reporting, mixed-initiative dialogs in this article: i) each episode in a specification has a consistent length (i.e., number of questions) and ii) the permissible responses for each question are completely independent of each other (i.e., no response to a question ever precludes a particular response to another question).
2 Specifying Dialogs in a Programming Languages Notation
There is a combinatorial explosion in the number of possible dialogs between the fixed and complete, mixed-initiative ends of the spectrum in Fig. 1. Specifically, the number of dialogs possible in this space is $2^{|\mathcal{E}_q|} - 1$ (i.e., all possible subsets, save for the empty set, of all episodes in a complete, mixed-initiative dialog), where $\mathcal{E}_q$ represents the enumerated specification of a complete, mixed-initiative dialog given $q$, the number of questions posed in the dialog. In this section we bring structure to this space by viewing these dialogs through a programming languages lens for insight into staging them. We start by describing how to specify these dialogs using a programming languages notation which involves a variety of concepts from programming languages.
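As a worked instance of this count, consider the three-question coffee dialog of Section 1.1, whose complete, mixed-initiative specification lists 13 episodes. The symbol $\mathcal{E}_3$ below is an assumed shorthand for that set of episodes and may differ from the authors' notation.

```latex
\[
  2^{|\mathcal{E}_3|} - 1 \;=\; 2^{13} - 1 \;=\; 8191 .
\]
```

Even with only three questions, then, the space contains 8191 distinct dialogs, which is why a structured notation is preferable to enumeration.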
2.1 Dialog Types
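Table 2: Type signatures of the functions in this model.
Concept | Function | Type signature | λ-calculus
interpretation | apply | … | …
currying | curry | … | …
partial function application | papply | … | …
partial function application n | papplyn | … | …
single-argument partial evaluation | smix | … | …
partial evaluation | mix | … | …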



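Table 4: Dialogs for ordering coffee specified in programming languages notation (left) and as enumerated specifications (right).
$\frac{I}{\text{size blend cream}}$ | {(size blend cream)}
$\frac{C}{\text{size blend cream}}$ | {size blend cream}
$\frac{PFA_1}{\text{size blend cream}}$ | {size (blend cream)}
$\frac{PFA_n}{\text{size blend cream}}$ | {(size blend cream), size (blend cream), (size blend) cream}
$\frac{PFA_n^\star}{\text{size blend cream}}$ | {(size blend cream), size (blend cream), (size blend) cream, size blend cream}
$\frac{SPE}{\text{size blend cream}}$ | {size (blend cream), blend (size cream), cream (size blend)}
$\frac{SPE'}{\text{size blend cream}}$ | {size blend cream, size cream blend, blend size cream, blend cream size, cream blend size, cream size blend}
$\frac{PE}{\text{size blend cream}}$ | {(size blend cream), size (blend cream), blend (size cream), cream (size blend), (size blend) cream, (size cream) blend, (blend cream) size}
$\frac{PE^\star}{\text{size blend cream}}$ | {(size blend cream), (size blend) cream, cream (size blend), (blend cream) size, size (blend cream), (size cream) blend, blend (size cream), size blend cream, size cream blend, blend size cream, blend cream size, cream blend size, cream size blend}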
In this notation a dialog is specified by an expression of the form $\frac{X}{T}$, where $X$ represents a concept from programming languages and $T$ is a list representing the questions in the dialog (being specified). The main idea in this notation is that the set of episodes specified by an expression of this form corresponds to all possible ways that a function parameterized by the questions in the denominator can be partially applied, and re-partially applied, and so on, according to the concept in the numerator. This notation was introduced in [2] and revisited in [11]. Here, we enrich it with additional language concepts and modify its semantics.
The concepts from programming languages in this model are interpretation ($I$), currying ($C$), partial function application ($PFA_1$), partial function application $n$ ($PFA_n$), single-argument partial evaluation ($SPE$), and partial evaluation ($PE$). These concepts correspond to higher-order functions which each take a function and some subset of its parameters as arguments. The type signatures for the functions of this model are given in Table 2. We assume readers are familiar with interpretation [3], currying [6], and partial evaluation [7]. Partial function application, papply, takes a function and its first argument and returns a function accepting the remainder of the parameters. The function papplyn, on the other hand, takes a function $f$ and all of the first $n$ of the $m$ arguments to $f$, where $n \le m$, and returns a function accepting the remainder of the parameters. Notice that with single-argument partial evaluation, the function may be partially evaluated with only one argument at a time. All of these functions except apply return a function. Table 3 provides definitions of papply, papplyn, and smix in Scheme.
These functions are general in that they accept a function of any arity as input. The functions curry, papply, papplyn, smix, and mix are closed (i.e., they take a function as input and return a function as output). Here, we are interested in a progressive series of applications of each of these functions which terminates at a fixpoint. Therefore, we superscript a type $X$ with a $\star$, where applicable, to indicate a progressive series of applications of the corresponding function ending in a fixpoint. For instance, repeatedly applying papplyn to a ternary function (e.g., (apply (papplyn (papplyn f small) mild) no)) realizes the episode size blend cream in addition to the size (blend cream), (size blend) cream, and (size blend cream) episodes which are realized with only a single application of papplyn. Note that $C^\star = C$ (i.e., the function returned from the partial application of a curried function is in curried form; there is no need to recurry it) and $I^\star = I$ since apply does not return a function. Therefore, we can superscript $PFA_1$, $PFA_n$, $SPE$, and $PE$ with a $\star$ symbol.
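The behavior described above can be sketched with closures. The definitions below are an illustrative approximation and are not the article's Table 3: the bodies, the zero-based position argument of smix, and the example function order are all assumptions. In particular, this smix only simulates single-argument partial evaluation by capturing the argument in a closure; it does not specialize any code.

```scheme
;; Illustrative closure-based sketches; not the article's definitions.

;; papply: fix the first argument of f.
(define (papply f x)
  (lambda rest (apply f (cons x rest))))

;; papplyn: fix the first n arguments of f (however many are supplied).
(define (papplyn f . args)
  (lambda rest (apply f (append args rest))))

;; smix (simulated): fix the single argument at zero-based position i.
(define (smix f i v)
  (lambda rest
    (let loop ((k 0) (rest rest) (acc '()))
      (if (= k i)
          (apply f (append (reverse acc) (cons v rest)))
          (loop (+ k 1) (cdr rest) (cons (car rest) acc))))))

;; Hypothetical ternary function standing in for a dialog script.
(define (order size blend cream)
  (list size blend cream))

;; size (blend cream): one application of papply, then the remainder.
(define after-size (papply order 'small))
;; size blend cream: progressive applications of papplyn.
(define after-blend (papplyn (papplyn order 'small) 'mild))
;; blend (size cream): smix may fix any one parameter, here the second.
(define after-blend-first (smix order 1 'mild))
```

For example, (after-size 'mild 'no), (after-blend 'no), and (after-blend-first 'small 'no) all evaluate to the list (small mild no), each through a different order of supplying the arguments.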
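Table 5: Types of dialogs, along the permutations and partitions axes, associated with concepts from programming languages.
 | Only a single response per utterance | Multiple responses per utterance
Only one utterance | interpretation ($I$) | 
Totally ordered | currying ($C$) | partial function application ($PFA_n^\star$)
Partially ordered | single-arg. partial eval. ($SPE'$) | partial eval. ($PE^\star$)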
The right side of Table 4 shows enumerated specifications of dialogs for ordering coffee. Each dialog also can be specified using one of the dialog types presented here. The left side of Table 4 shows how those dialogs are specified using this programming languages notation. We associate a fixed dialog with currying ($C$) (second row of Table 4) and a complete, mixed-initiative dialog with partial evaluation ($PE^\star$) (bottommost row of Table 4). The types (and combinations of them) in Table 4 help specify dialogs between the fixed and complete, mixed-initiative ends of this unsolicited reporting spectrum. Note that the order of the terms in the denominator matters (i.e., $\frac{C}{a\ b} \neq \frac{C}{b\ a}$). Also, note that when the number of questions posed in a dialog is less than three, the $C$, $PFA_1$, and $PFA_1^\star$ types specify the same episodes (e.g., {a b}). Table 5 associates types of dialogs, along permutations and partitions (of questions) axes, to some of the concepts from programming languages in this model, and helps connect the dialogs in Table 1 with the concepts used to specify them. The $SPE'$ type is introduced below.
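Note that there is always only one episode possible in any dialog specified using only one of the $I$, $C$, or $PFA_1$ types. There are always $q$ episodes in any dialog specified using only one of the $PFA_n$ and $SPE$ types, where $q$ is the number of questions posed in a dialog. The number of episodes in any dialog conforming to any of the $PFA_n^\star$, $SPE'$, $PE$, and $PE^\star$ dialog types as a function of $q$ is $2^{q-1}$, $q!$, $2^q - 1$, and $\sum_{p=1}^{q} p!\,S(q,p)$, respectively, where $S(q,p)$ is a Stirling number of the second kind. The episodes in any dialog specified using only one of the $I$, $C$, $PFA_1$, $PFA_n$, $PFA_n^\star$, $SPE$, $SPE'$, $PE$, or $PE^\star$ types are related to each other in multiple ways. For instance, by definition of the $\star$ symbol here, $X \subseteq X^\star$, where $X$ is any dialog type (e.g., $PFA_n$ or $PE$). Other relationships, for types applied to the same denominator, include
$I \subseteq PFA_n$,
$PFA_1 \subseteq PFA_n$,
$C \subseteq PFA_n^\star$,
$PFA_1 \subseteq SPE \subseteq PE$,
$PFA_n \cap SPE$ = $PFA_1$, and
$PFA_n \subseteq PE$.
Lastly, $X \subseteq PE^\star$ for every dialog type $X$, meaning that the $PE^\star$ type subsumes all others. The implication of this, as we see in the following section, is that any dialog conforming to one of these types can be supported through partial evaluation.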
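The size of the largest specification in Table 4 can be checked by hand. The sketch below assumes the count is expressed with Stirling numbers of the second kind $S(q,p)$ (the number of ways to partition $q$ questions into $p$ utterances), which is one standard closed form and not necessarily the one the authors wrote; each partition into $p$ utterances can then be ordered in $p!$ ways.

```latex
\[
  \sum_{p=1}^{3} p!\,S(3,p)
  \;=\; 1!\cdot 1 \;+\; 2!\cdot 3 \;+\; 3!\cdot 1
  \;=\; 1 + 6 + 6 \;=\; 13 .
\]
```

The three terms correspond to the one single-utterance episode, the six two-utterance episodes, and the six three-utterance episodes in the bottommost row of Table 4.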
2.2 Subdialogs
We denote the space of unsolicited reporting, mixed-initiative dialogs shown in Fig. 1 with the symbol $\mathcal{U}$. Here $X$ denotes a particular type of dialog (e.g., $C$ or $PE^\star$) which specifies a set of episodes, while $\mathcal{X}$ denotes a class of dialogs (e.g., $\mathcal{C}$ or $\mathcal{PE}^\star$), where a class is defined as a set of dialog specifications of type $X$ based on $q$, the number of questions posed in a dialog. The number of dialogs possible in this space is $|\mathcal{U}| = 2^{|\mathcal{E}_q|} - 1$, where $\mathcal{E}_q$ is the enumerated specification of the complete, mixed-initiative dialog on $q$ questions, and there are $|\mathcal{U}| - (3q! + q + 5)$ dialogs in the spectrum shown in Fig. 1 which cannot be specified with a single type (e.g., dialogs b, c, and d in Fig. 1 and Table 6). (Footnote 2: This count holds for $q > 2$. The cases where $q = 1$ and $q = 2$ are the only cases where the classes overlap. This is because when $q = 1$, the one specification in each of the individual classes is the same in each class. Similarly, when $q = 2$ some specifications are multi-classified.) We call the class containing these dialogs $\Delta$. There are notable observations on the space $\mathcal{U}$: a) its classes are totally disjoint, b) the $\mathcal{I}$, $\mathcal{SPE}$, $\mathcal{SPE}'$, $\mathcal{PE}$, and $\mathcal{PE}^\star$ classes always contain only one specification independent of $q$, c) the $\mathcal{PFA}_1$ class always contains $q$ specifications because there exists one specification per question, where the response to that question is supplied first and the responses to all remaining questions arrive next in one utterance, d) the $\mathcal{C}$, $\mathcal{PFA}_n$, and $\mathcal{PFA}_n^\star$ classes always contain $q!$ specifications as each contains one specification per each episode in a $SPE'$ dialog type (in Table 4 and introduced below), and e) therefore, the number of dialogs specifiable with only a single type is $3q! + q + 5$. However, this programming languages notation for dialog specification is expressive enough to specify the dialogs in the $\Delta$ class because those dialogs can be expressed as a union of types (e.g., dialog c in Table 6, or $\frac{I}{x\ y\ z} \cup \frac{PFA_1}{x\ y\ z}$ = {(x y z), x (y z)}) or expressed as dialogs involving subdialogs through the use of nesting [2] (e.g., dialogs b and d in Table 6), or both (e.g., dialog f in Table 6).
[Table 6: specifications in programming languages notation of dialogs (a) through (f).]
In dialogs containing more than one subdialog in the denominator, the $I$, $PFA_n$, $PFA_n^\star$, $PE$, and $PE^\star$ types are not candidates for the numerator because these types imply multiple responses per utterance and it is not possible to complete more than one subdialog in a single utterance. The $PFA_1$ and $SPE$ types suffice for dialogs with no more than two subdialogs (e.g., $\frac{PFA_1}{\frac{PE^\star}{cream\ sugar}\ \frac{PE^\star}{eggs\ toast}}$ and $\frac{SPE}{\frac{PE^\star}{cream\ sugar}\ \frac{PE^\star}{eggs\ toast}}$) because when used as the numerator in an expression whose denominator contains more than two terms they also imply multiple responses per utterance. Hence, $C$ is the only type which can always contain any number of subdialogs in the denominator. However, $C$ cannot be used in the numerator of a dialog specification where there are two or more subdialogs in the denominator which can be completed in any order. Thus, we need a type which restricts utterances to one response but also permits all possible completion orders. We call this type $SPE'$, and $\frac{SPE'}{\text{size blend cream}}$ =
{size blend cream, size cream blend, blend size cream,
blend cream size, cream blend size, cream size blend}.
Note that $\frac{SPE'}{x\ y} = \frac{SPE}{x\ y}$, and $SPE' \subseteq PE^\star$.
2.3 Rewrite Rules
Types $I$ and $C$ are primitive in that any dialog in this space can be specified using only $I$ and $C$. In particular, to specify any dialog in this notation we can simply translate each episode in an enumerated specification as a $C$ expression and the entire specification as a union of those expressions. For instance, $\frac{C}{x\ y\ z} \cup \frac{C}{y\ z\ x} \cup \frac{C}{z\ x\ y}$ = {x y z, y z x, z x y}. Furthermore, all dialogs specified using this notation can be reduced to a dialog using only the $I$ and $C$ types. For example, $\frac{PFA_1}{x\ y\ z} = \frac{C}{x\ \frac{I}{y\ z}}$ = {x (y z)}. Therefore, we can define rewrite rules akin in spirit to those in [11].
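To make the reduction to two primitive types concrete, the following minimal sketch expands an expression built only from interpretation, currying, and union into its enumerated specification. The S-expression encoding (tags I, C, and U) and the procedure names are assumptions made for illustration; they are not the article's rewrite rules. An episode is a list of utterances and an utterance is a list of questions.

```scheme
;; Assumed encoding (hypothetical, for illustration only):
;;   (I q ...)  all questions answered in a single utterance
;;   (C t ...)  terms completed in the given order; a term is a
;;              question (a symbol) or a nested expression
;;   (U e ...)  union of the specifications of the expressions

;; Concatenate, in order, one episode from each set of episodes.
(define (sequence-all sets)
  (if (null? sets)
      '(())
      (apply append
             (map (lambda (prefix)
                    (map (lambda (suffix) (append prefix suffix))
                         (sequence-all (cdr sets))))
                  (car sets)))))

;; Expand an expression into its list of episodes.
(define (episodes expr)
  (cond ((symbol? expr) (list (list (list expr))))
        ((eq? (car expr) 'I) (list (list (cdr expr))))
        ((eq? (car expr) 'C) (sequence-all (map episodes (cdr expr))))
        ((eq? (car expr) 'U) (apply append (map episodes (cdr expr))))
        (else '())))  ; unknown tag: no episodes

(define fixed-x-then-yz (episodes '(C x (I y z))))
(define three-rotations (episodes '(U (C x y z) (C y z x) (C z x y))))
```

Here fixed-x-then-yz holds the single episode x (y z), and three-rotations holds the three episodes x y z, y z x, and z x y. The union case does not remove duplicates, which is a simplification of set union.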
Specifying dialogs in this spectrum (shown in Fig. 1) with a programming languages notation has multiple effects: a) it helps bring structure to the space between the two ends of the spectrum, b) it helps us losslessly compress the episodes in an enumerated specification of a dialog without enumerating all of the episodes (to capture the possible orders and combinations of responses) therein and, therefore, provides a shorthand notation for dialog specification, akin to the Hasse diagram method, and c) a dialog specified in this notation provides a design for implementing the dialog, as we see below.
Use of concepts from programming languages, such as interpretation, currying, and partial evaluation, to specify dialogs has been established in [16, 2, 11]. What we have presented here is a modification of the notation from [11]. Specifically, we have added types to enrich the notation and redefined types to more accurately reflect the concepts to which they are associated.
3 Staging Dialogs by Partial Evaluation
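Table 7: Staging dialog episodes by partial evaluation ($[\![\mathit{mix}]\!]$), showing the script after each step.
Staging the dialog episode size blend cream by partial evaluation:
$[\![\mathit{mix}]\!]$[f, size=small] = (lambda (blend cream) (if (member? blend <blends>) (if (member? cream <cream>) (retrieve item))))
$[\![\mathit{mix}]\!]$[(lambda (blend cream) (if (member? blend <blends>) (if (member? cream <cream>) (retrieve item)))), blend=mild] = (lambda (cream) (if (member? cream <cream>) (retrieve item)))
$[\![\mathit{mix}]\!]$[(lambda (cream) (if (member? cream <cream>) (retrieve item))), cream=no] = …
Staging the dialog episode blend size cream by partial evaluation:
$[\![\mathit{mix}]\!]$[f, blend=…] = (lambda (size cream) (if (member? size <sizes>) (if (member? cream <cream>) (retrieve item))))
$[\![\mathit{mix}]\!]$[(lambda (size cream) (if (member? size <sizes>) (if (member? cream <cream>) (retrieve item)))), size=large] = (lambda (cream) (if (member? cream <cream>) (retrieve item)))
$[\![\mathit{mix}]\!]$[(lambda (cream) (if (member? cream <cream>) (retrieve item))), cream=yes] = …

Table 8: Staging dialog episodes by partial evaluation ($[\![\mathit{mix}]\!]$), omitting intermediate scripts.
Staging the dialog episode size blend cream by partial evaluation:
$[\![\mathit{mix}]\!]$[$[\![\mathit{mix}]\!]$[$[\![\mathit{mix}]\!]$[f, size=small], blend=mild], cream=no]
Staging the dialog episode blend size cream by partial evaluation:
$[\![\mathit{mix}]\!]$[$[\![\mathit{mix}]\!]$[$[\![\mathit{mix}]\!]$[f, blend=…], size=large], cream=yes]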
[Tables 9 and 10: staging dialogs of each dialog type by partial evaluation.]
Since partial evaluation can be used to partially apply a function with respect to any subset of its parameters (i.e., it supports the partial application of a function with all possible orders and combinations of its arguments), we can stage any unsolicited reporting, mixed-initiative dialog in this space using only partial evaluation. In other words, partial evaluation is a generalization of any dialog type or, alternatively, each dialog type, except $PE^\star$, represents a particular type of restriction on partial evaluation.
We use an example to illustrate how dialogs can be staged with partial evaluation. Consider the ternary Scheme function f shown in Listing 1. We call a function such as this, which we partially evaluate to stage a dialog, a script. Assume that we wrote this function without the intent of ever invoking it, and rather only with the intent of automatically transforming it for effect. This effect is staging the interaction of a human-computer dialog. While f is a function, here we think of it only as a malleable and disposable data object. When that data is exhausted, the dialog is complete. In this model, f is only data.
The top half of Table 7 demonstrates how the size blend cream episode is staged by this process. We use the symbol $\mathit{mix}$ from [7] to denote the partial evaluation operation because partial evaluation involves a mixture of interpretation and code generation. The $\mathit{mix}$ operator accepts two arguments: a function (to be partially evaluated) and a static assignment of values to a subset of its parameters. The semantics of the expression $[\![f]\!]\,3$ in the notation from [7] are invoke f on 3 or f(3). Consider a function pow which accepts a base and an exponent (in that order) as arguments and returns the base raised to the exponent. The semantics of the expression $[\![\mathit{mix}]\!][pow, exponent = 2]$ are partially evaluate pow with respect to exponent equal to two. This returns $pow_2$ (i.e., a squaring function) which accepts only a base. Therefore, $[\![pow_2]\!]\,3 = [\![pow]\!][3, 2] = 9$. Given a ternary function f with integer parameters x, y, and z:
$[\![\,[\![\mathit{mix}]\!][f, x = 1]\,]\!][2, 3] = [\![f]\!][1, 2, 3]$.
In general,
$[\![\,[\![\mathit{mix}]\!][f, \mathit{input}_{static}]\,]\!]\,\mathit{input}_{dynamic} = [\![f]\!][\mathit{input}_{static}, \mathit{input}_{dynamic}]$.
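The pow example can be made concrete with a small sketch. The procedure specialize-pow below is a hand-written specializer for this one function, assumed here purely for illustration; it is not a general partial evaluator and not the mix of [7]. Given a static, non-negative integer exponent, it unrolls the recursion of pow and returns the residual program as an S-expression whose only parameter is the base.

```scheme
;; General power function: base raised to a non-negative integer exponent.
(define (pow base exponent)
  (if (= exponent 0)
      1
      (* base (pow base (- exponent 1)))))

;; Hypothetical hand-written specializer for pow (an assumption, not mix):
;; unrolls the recursion for a static exponent and returns the residual
;; program as data.
(define (specialize-pow exponent)
  (let loop ((n exponent) (body 1))
    (if (= n 0)
        (list 'lambda '(base) body)
        (loop (- n 1) (list '* 'base body)))))

(define pow-2 (specialize-pow 2))
;; pow-2 is the expression (lambda (base) (* base (* base 1)))
```

The residual expression no longer mentions the exponent, mirroring how a script loses a parameter each time a question is answered; applying it to 3 would give the same result as (pow 3 2), namely 9.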
This same function f can be used to realize a completely different episode than the one after which it is modeled. The bottom half of Table 7 demonstrates how the blend size cream episode can be staged by this process, with the same function f. While f reflects only one episode (in this case, size blend cream), by partially evaluating f we can stage the interaction required by 13 distinct episodes. In general, by partially evaluating a script representing only one episode with $q$ questions, we can realize $\sum_{p=1}^{q} p!\,S(q,p)$ distinct episodes, where $S(q,p)$ is a Stirling number of the second kind. This ‘model one episode, stage multiple’ property is a significant aspect of this research.
While Table 7 shows how f is transformed after each progressive partial evaluation in the process, Table 8 omits these intermediary outputs and, thus, provides an alternate view of Table 7. The scripts being partially evaluated in Tables 7 and 8 omit else (exceptional) branches (e.g., (invalidresponse) in Listing 1) for purposes of succinct exposition and conservation of space. Notice from Table 7 that, at any point in the interaction, a script always explicitly models the questions which remain unanswered and, therefore, implicitly models the questions which have been answered. As a result, it is always clear what information to prompt for next. In the mixed-initiative dialog community, keeping track of what has and has not been communicated is called dialog management.
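The dialog management property described above can be illustrated with a small simulation. This sketch does not perform partial evaluation: it models a script as plain data (an association list from each unanswered question to its permissible responses) and removes a question once a valid response arrives, which mimics the shrinking parameter list in Table 7. The names coffee-script, stage, and remove-entry, and the menu values, are assumptions made for illustration.

```scheme
;; Hypothetical script-as-data: each entry is (question response ...).
(define coffee-script
  '((size small medium large)
    (blend mild dark french-roast)
    (cream yes no)))

;; Remove the entry for a question from the script.
(define (remove-entry question script)
  (cond ((null? script) '())
        ((eq? question (car (car script))) (cdr script))
        (else (cons (car script) (remove-entry question (cdr script))))))

;; Simulate one staging step. An utterance is a list of
;; (question . response) pairs, so it may carry several responses.
;; Returns the residual script, or #f if any response is invalid.
(define (stage script utterance)
  (if (null? utterance)
      script
      (let* ((question (car (car utterance)))
             (response (cdr (car utterance)))
             (entry (assq question script)))
        (if (and entry (memq response (cdr entry)))
            (stage (remove-entry question script) (cdr utterance))
            #f))))

;; The blend size cream episode, one response per utterance.
(define after-blend (stage coffee-script '((blend . mild))))
(define after-size (stage after-blend '((size . large))))
(define finished (stage after-size '((cream . yes))))
```

After the first step, after-blend still lists size and cream, so the system knows what to prompt for next; finished is the empty list, which signals that the dialog is complete. Passing a two-response utterance such as ((size . small) (blend . dark)) stages a (size blend) utterance in a single step.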
The right sides of Tables 9 and 10 detail how a dialog specified using only one of each dialog type is staged by partial evaluation, which subsumes all of the other types based on the arguments with which f is partially evaluated. For instance, $\frac{PFA_n^\star}{x\ y\ z}$ is achieved by progressively partially evaluating f with any prefix of its arguments. Similarly, $\frac{PFA_1}{x\ y\ z}$ = {x (y z)} can be staged with partial evaluation as $[\![\,[\![\mathit{mix}]\!][f, x = \ldots]\,]\!][y = \ldots, z = \ldots]$.
Given Table 7, we see why the enumerated specifications of dialogs on the right side of Table 4 are associated with the types on the left side of Table 4, and how dialogs conforming to those specifications can be staged (i.e., realized) in Tables 9 and 10. Tables 9 and 10 together naturally mirror Table 4.
4 Implementing Dialogs with Partial Evaluation
A specification of a dialog in this programming languages notation provides a plan for the implementation of the dialog. In this section we discuss the implementation details of automatically generating a dialog system from an enumerated specification of a dialog to be implemented (see Fig. 2). While the details of dialog mining (i.e., extracting a minimal specification in programming languages notation from an enumerated dialog specification; see transition from the left to the center of Fig. 2) are beyond the scope of this paper, and more appropriate for a data mining audience, we make some cursory remarks.