Introduction
Preferences are an essential component of decision making and have been extensively studied in research communities such as decision theory, computational social choice, recommender systems, and knowledge representation. Various preference models have been proposed in the literature to represent preferences of different types, falling into two major classes: quantitative models and qualitative models. Quantitative preference models use numeric values to define the preference relation over objects. These models include fuzzy constraint satisfaction models [18], penalty logic [7], and possibilistic logic [8]. Qualitative preference models, on the other hand, describe, either directly or indirectly, the relative ordinal relation of objects. Such models include lexicographic preference trees (LPtrees) [4, 14, 16], lexicographic preference forests (LPforests) [13, 17], conditional preference networks (CPnets) [5], and answer set optimization [6]. Qualitative models draw our focus because they have been shown to be intuitive, cognitively plausible, and highly accurate in prediction [2, 17].
In this paper, we focus on the problem of learning qualitative preference models, in particular graphical models that are intuitive and often compact in size, such as LPtrees, LPforests, and CPnets. Recently, active and passive learning of these graphical models has been studied, both theoretically and empirically [15, 17, 10, 1, 3]. However, these traditional preference learning works do not leverage the intuitiveness and explainability of the models to interact with the decision maker during the learning process. Models explainable to human users are desirable when decision makers in various applications need to understand, or even trust, the models produced by intelligent machine partners [9].
To this end, we propose a novel framework that learns qualitative preference models. The framework is interactive and iterative: it obtains behavioral data from the decision-making user, preprocesses the data, and sends it to a preference learner that computes a model (i.e., an LPtree, LPforest, or CPnet). The model is then visualized and presented back to the user for feedback, which is used to improve the model in the following iterations. The learning process terminates when the user is satisfied with the learned model. In this paper, we present the design and implementation of the framework, so far for learning LPtrees and LPforests, as a web application built with Django, using Python and C++ as the programming languages on the server.
In the next section, we define and exemplify the two models: LPtrees and LPforests. Then, we present our human-in-the-loop preference learning framework and demonstrate it with our prototype that learns LP models. Finally, we conclude by pointing to possible future research directions.
Lexicographic Preference Trees and Forests
The preference models we consider in this work are over alternatives from combinatorial domains of multivalued attributes. We now define combinatorial domains and the preference models we focus on in this paper, namely lexicographic preference trees and forests. Let $\mathcal{A} = \{X_1, \ldots, X_p\}$ be a set of categorical attributes, each with a finite domain $D_i = \{x_{i,1}, \ldots, x_{i,m_i}\}$, where $m_i$ is bounded by a constant. The combinatorial domain $CD(\mathcal{A})$ over $\mathcal{A}$ is the Cartesian product $D_1 \times \cdots \times D_p$. Elements of combinatorial domains are called alternatives.
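As a minimal sketch of this definition, the combinatorial domain can be enumerated as a Cartesian product of the attribute domains. The attribute names and values below are borrowed from the car domain discussed later in this section; the dictionary layout and the function name `combinatorial_domain` are assumptions made only for illustration.

```python
from itertools import product

# Attribute domains, borrowed from the car example (illustrative layout).
domains = {
    "BodyType": ["minivan", "sedan", "sport"],
    "Make": ["Honda", "Ford"],
    "Price": ["low", "medium", "high"],
    "Transmission": ["automatic", "manual"],
}


def combinatorial_domain(domains):
    """Return every alternative as a dict mapping attribute -> value."""
    names = list(domains)
    return [dict(zip(names, values)) for values in product(*domains.values())]


alternatives = combinatorial_domain(domains)
print(len(alternatives))  # 3 * 2 * 3 * 2 = 36
```

The number of alternatives is the product of the domain sizes, so it grows exponentially with the number of attributes. This is why compact preference models matter.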
A lexicographic preference tree (LPtree) over $\mathcal{A}$ is an ordered labeled tree, where (1) every nonleaf node is labeled by an attribute $X_i \in \mathcal{A}$ and by a local preference $>_i$, which is a total order over $D_i$; (2) every nonleaf node labeled by an attribute $X_i$ has $|D_i|$ outgoing edges, ordered from left to right according to $>_i$; (3) every leaf node is denoted by $\Box$; and (4) on every path from the root to a leaf, each attribute appears at most once as a label. Each tree induces a total preorder over the alternatives that is defined precisely by the left-to-right order of the leaves.
To illustrate, let us consider the domain of cars described by four attributes: BodyType with values minivan, sedan, and sport; Make with values Honda and Ford; Price with values low, medium, and high; and Transmission with values automatic and manual. A user's preference order on cars from this space could be expressed by the tree in Figure 1.
The tree informs us that the most important attribute is BodyType, with the user preferring minivans the most, then sedans, and sport cars the least. Among minivans, the most important attribute is Price, with medium preferred to low, and low preferred to high. Other nonleaf nodes in the tree are interpreted similarly. Each leaf node represents the set of cars that agree with the attribute values instantiated along the path to that leaf.
Given an alternative, we can traverse the tree and find its leaf. To compare two alternatives, we say that they are equivalent if they reach the same leaf. If they reach different leaves, the alternative in the preceding leaf is the preferred one. For instance, a Honda sedan is better than a Ford sedan, because the former ends up in leaf 3, which precedes leaf 4, the leaf of the latter.
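The traversal and comparison just described can be sketched as follows. This is an illustrative data structure, not the prototype's implementation: the class `Node`, the functions `leaf_path` and `compare`, and the small tree built at the end are all hypothetical, and the tree is only loosely inspired by the car example rather than a copy of Figure 1. A missing child stands for a leaf.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """Nonleaf node: an attribute, its local preference, one child per value."""
    attribute: str
    order: List[str]  # most preferred value first
    children: Dict[str, "Node"] = field(default_factory=dict)  # absent = leaf


def leaf_path(tree: Node, alternative: Dict[str, str]) -> Tuple[int, ...]:
    """Edge positions (0 = leftmost) followed from the root to the leaf."""
    path = []
    node: Optional[Node] = tree
    while node is not None:
        value = alternative[node.attribute]
        path.append(node.order.index(value))
        node = node.children.get(value)
    return tuple(path)


def compare(tree: Node, a: Dict[str, str], b: Dict[str, str]) -> int:
    """Return -1 if a is preferred, 1 if b is preferred, 0 if equivalent."""
    pa, pb = leaf_path(tree, a), leaf_path(tree, b)
    if pa == pb:
        return 0
    # Leaves are ordered left to right, which is the lexicographic order
    # of their edge-position paths.
    return -1 if pa < pb else 1


# Hypothetical tree: BodyType first, then Price for minivans, Make for sedans.
tree = Node(
    "BodyType", ["minivan", "sedan", "sport"],
    children={
        "minivan": Node("Price", ["medium", "low", "high"]),
        "sedan": Node("Make", ["Honda", "Ford"]),
    },
)
honda_sedan = {"BodyType": "sedan", "Make": "Honda", "Price": "low"}
ford_sedan = {"BodyType": "sedan", "Make": "Ford", "Price": "low"}
print(compare(tree, honda_sedan, ford_sedan))  # -1: the Honda sedan is preferred
```

Two sport cars reach the same leaf in this hypothetical tree, so `compare` returns 0 for them regardless of their other attributes.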
Types of LPtrees
LPtrees, in general, can be of size exponential in the number of attributes. However, trees with special structures can be collapsed to achieve a compact representation. When the labeling attributes on all paths of the tree appear in exactly the same order and the local preference orderings are the same on the same attributes, the tree can be collapsed to a list of nodes labeled by attributes and unconditional preference orders. We call LPtrees of this type unconditional importance and unconditional preference LPtrees (UIUP LPtrees). If the tree keeps this structure but the local preferences differ on the same attributes, it can also be collapsed, this time to a list of nodes labeled by attributes and tables of conditional preference orders. We call LPtrees of this type unconditional importance and conditional preference LPtrees (UICP LPtrees). All other LPtrees, which cannot be collapsed, are called conditional importance and conditional preference LPtrees (CICP LPtrees), because the importance order of nodes depends on how their ancestors are instantiated in the tree. We show examples of these compact representations in Figure 1(a) and Figure 1(b). One example of a CICP tree is the one in Figure 1.
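To make the collapsed UIUP representation concrete, here is a minimal sketch in which a UIUP LPtree is a plain list of (attribute, preference order) pairs, most important attribute first. The particular list `uiup` and the helper names `rank_key` and `prefers` are hypothetical and chosen only for illustration.

```python
# A UIUP LPtree as a list: attributes in importance order, each with one
# unconditional preference order (most preferred value first).
uiup = [
    ("BodyType", ["minivan", "sedan", "sport"]),
    ("Price", ["medium", "low", "high"]),
    ("Make", ["Honda", "Ford"]),
]


def rank_key(uiup_tree, alternative):
    """Position of the alternative's value on each attribute, in importance order."""
    return tuple(order.index(alternative[attr]) for attr, order in uiup_tree)


def prefers(uiup_tree, a, b):
    """True if alternative a is strictly preferred to alternative b."""
    return rank_key(uiup_tree, a) < rank_key(uiup_tree, b)


cars = [
    {"BodyType": "sedan", "Price": "low", "Make": "Ford"},
    {"BodyType": "minivan", "Price": "high", "Make": "Honda"},
    {"BodyType": "sedan", "Price": "medium", "Make": "Ford"},
]
ranked = sorted(cars, key=lambda car: rank_key(uiup, car))
print([car["BodyType"] + "/" + car["Price"] for car in ranked])
```

The list above has three nodes, whereas the equivalent uncollapsed tree over the same three attributes would have 1 + 3 + 9 = 13 nonleaf nodes. A UICP tree could be sketched the same way by replacing each fixed order with a table keyed by the values of more important attributes.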
LPforests
An LPforest is a finite ensemble of LPtrees over a combinatorial domain. To compare alternatives using an LPforest, researchers have proposed applying a variety of voting rules (such as Borda's and Copeland's rules) to aggregate the decisions of the member trees [11, 17]. We study the problem of visualizing large forests and of presenting a forest effectively to the user. Visualizing and presenting the whole forest is infeasible when it contains many trees. In this paper, our approach is to present only representative trees in the forest according to a distance measure, for which we consider Kendall's τ distance. This measure counts the number of pairwise disagreements between two orderings. It applies directly to computing distances between LPtrees, because an LPtree represents a total order once ties among equivalent alternatives are broken alphabetically. However, if we compute the distance by first enumerating the total orders, the process may take time exponential in the size of the two trees when they are compactly represented as UIUP or UICP trees. To avoid this, we resort to the polynomial-time algorithms proposed by Li and Kazimipour [12]. We implemented these algorithms to compute the distances for all pairs of trees in the forest. The trees are then clustered based on these distance values.
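For reference, the definition of the distance as a count of pairwise disagreements can be sketched directly on two explicit total orders. This is the naive computation over enumerated orders, which is exactly the approach that becomes exponential for compactly represented trees; it is not the polynomial algorithm of [12]. The function name `kendall_tau_distance` is an illustrative choice.

```python
from itertools import combinations


def kendall_tau_distance(order_a, order_b):
    """Count pairs of items that the two total orders rank in opposite ways.

    Both arguments list the same items, most preferred first.
    """
    pos_a = {item: i for i, item in enumerate(order_a)}
    pos_b = {item: i for i, item in enumerate(order_b)}
    if pos_a.keys() != pos_b.keys():
        raise ValueError("both orders must rank the same set of items")
    return sum(
        1
        for x, y in combinations(order_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


print(kendall_tau_distance(["a", "b", "c"], ["b", "a", "c"]))  # 1
print(kendall_tau_distance(["a", "b", "c"], ["c", "b", "a"]))  # 3
```

The loop visits every pair of alternatives, so its cost is quadratic in the number of alternatives, which is itself exponential in the number of attributes.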
Framework
We now introduce our framework, shown in Figure 3, for interactive learning of qualitative preference models. We call our framework ILPref for short. The goal of ILPref is to learn the user's preferential decision making process over complex domains of options, and to help the user understand it.
User
The user is the central decision maker, whom the framework tries to help understand her own decision making process. The user provides behavioral data that can be either explicit or implicit. Explicit data include query answers and scaled ratings, whereas implicit data can be the distribution of time or clicks over a set of options she reviews. These are the source data from which our framework learns the decision model. Our current implementation, shown in Figure 4, elicits behavioral data via binary queries that ask the user to select which of two cars she prefers.
Another type of data provided by the user is feedback data, which are critiques of the visually explained model presented to her. Clearly, feedback data are not provided in the initial iteration, for no model has been learned yet. In general, feedback data allow the learning algorithm to adjust the learned model accordingly. In our current implementation for UIUP LPtree models, the user can give feedback on the order of some attributes and on the order of some attributes' values. For instance, Figure 5 shows a learned UIUP tree. Based on it, the user provides the feedback that BuyingPrice should be more important than Persons. She also states that she actually prefers medium to low on BuyingPrice, and big to medium on Luggage.
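One way to picture this kind of feedback is as two lists of ordering constraints that a UIUP model either satisfies or violates. The sketch below is an assumption about a possible encoding, not the prototype's actual data format: the function `violations`, the tuple layouts, and the two example models (which are not the tree of Figure 5) are all hypothetical. Only the three feedback statements come from the example above.

```python
def violations(model, importance_fb, value_fb):
    """List the feedback constraints that a UIUP model fails to satisfy.

    model:         list of (attribute, order), most important attribute first
    importance_fb: list of (more_important_attr, less_important_attr)
    value_fb:      list of (attribute, better_value, worse_value)
    """
    attr_rank = {attr: i for i, (attr, _) in enumerate(model)}
    orders = dict(model)
    broken = []
    for higher, lower in importance_fb:
        if attr_rank[higher] > attr_rank[lower]:
            broken.append(("importance", higher, lower))
    for attr, better, worse in value_fb:
        order = orders[attr]
        if order.index(better) > order.index(worse):
            broken.append(("value", attr, better, worse))
    return broken


# Feedback from the example in the text.
importance_fb = [("BuyingPrice", "Persons")]
value_fb = [("BuyingPrice", "medium", "low"), ("Luggage", "big", "medium")]

# Hypothetical learned model that contradicts all three statements.
learned = [
    ("Persons", ["4", "2"]),
    ("BuyingPrice", ["low", "medium", "high"]),
    ("Luggage", ["medium", "big", "small"]),
]
# Hypothetical revised model that respects the feedback.
revised = [
    ("BuyingPrice", ["medium", "low", "high"]),
    ("Persons", ["4", "2"]),
    ("Luggage", ["big", "medium", "small"]),
]
print(len(violations(learned, importance_fb, value_fb)))  # 3
print(violations(revised, importance_fb, value_fb))       # []
```

A learner that treats feedback as hard constraints would only return models for which this list is empty.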
Preprocessor and Learner
The preprocessor takes the behavioral and feedback data and applies text mining techniques to put the data into a form ready for the learning module. The learner then takes the domain description and the examples and learns a model. Currently, we have implemented the greedy heuristic for learning UIUP, UICP, and CICP trees and forests of these trees [13]. This implementation is augmented to handle feedback data from the user by treating these data as hard constraints.
Visualizer
To present the learned model to the user, the visualizer draws the model and provides an annotated description of it. Our prototype implements this module for visualizing LPtrees and LPforests (cf. Figure 5 for a UIUP tree model). An LPtree is drawn with expandable nodes to show or hide subtrees. The tree is drawn only up to a few levels, as deeper attributes are less important in the model. When a forest of trees is learned, the framework visualizes only a very small number of representatives, selected by a clustering algorithm. Our prototype applies the single-link clustering algorithm SLINK, by Sibson [19], based on pairwise Kendall's τ distances between individual trees. Taking as input the distances between all pairs of trees, SLINK starts with clusters of single trees. It repeatedly merges the two clusters with the minimum distance between them, where the distance between two clusters is defined, as in single linkage, as the minimum of the distances between pairs of trees drawn from the two clusters. For example, Figure 6 shows the dendrogram of 13 UIUP trees, where the y-axis shows the threshold values, which are the numbers of examples on which clusters of trees disagree. To select the representatives using this dendrogram, we simply use a cutoff threshold value to partition the trees into buckets and select one model from each bucket.
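The cutoff step can be sketched without building the full dendrogram, assuming the standard single-link definition in which the distance between two clusters is the smallest distance between their members. Under that definition, cutting the dendrogram at a threshold yields the connected components of the graph that links every pair of trees whose distance is at most the threshold. The code below is a simple union-find sketch of that idea, not Sibson's SLINK algorithm itself. The function names, the choice of the medoid as representative, and the toy distance matrix are assumptions made for illustration.

```python
def single_link_buckets(dist, threshold):
    """Clusters obtained by cutting a single-link dendrogram at `threshold`.

    dist is a symmetric n x n matrix of pairwise distances between trees.
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    buckets = {}
    for i in range(n):
        buckets.setdefault(find(i), []).append(i)
    return sorted(buckets.values())


def representatives(dist, buckets):
    """From each bucket, pick the member closest in total to the others."""
    return [min(b, key=lambda i: sum(dist[i][j] for j in b)) for b in buckets]


# Toy distances between four trees (hypothetical values).
dist = [
    [0, 1, 5, 6],
    [1, 0, 4, 5],
    [5, 4, 0, 2],
    [6, 5, 2, 0],
]
buckets = single_link_buckets(dist, threshold=2)
print(buckets)                         # [[0, 1], [2, 3]]
print(representatives(dist, buckets))  # [0, 2]
```

Raising the threshold to 4 merges everything into one bucket here, because trees 1 and 2 are at distance 4. This chaining behavior is characteristic of single linkage.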
Conclusion and Future Work
In this paper, we presented a novel human-in-the-loop framework to help the human user understand the decision making process of choosing from alternatives. We focused on alternatives from combinatorial domains and on qualitative preference models that are intuitive and explainable, such as LPtrees, LPforests, and CPnets. The framework, which we call ILPref, is an interactive and iterative system. It visualizes the learned model for the user and collects feedback, which is used to improve the model in the following iterations. To this end, we discussed the key aspects of our prototype system for learning the LP models.
For future work, we are interested in extending our prototype to cover more preference models, e.g., CPnets. We also plan to perform a thorough user study with human subjects to evaluate our system and our selected decision models.
References
[1] (2016) The complexity of learning acyclic CP-nets. In IJCAI, pp. 1361–1367.
[2] (2015) Beyond theory and data in preference modeling: bringing humans into the loop. In International Conference on Algorithmic Decision Theory, pp. 3–18.
[3] (2017) Learning tree-structured CP-nets with local search. In International Florida Artificial Intelligence Research Society Conference.
[4] (2010) Learning conditionally lexicographic preference relations. In ECAI, pp. 269–274.
[5] (2004) CP-nets: a tool for representing and reasoning with conditional ceteris paribus preference statements. Journal of Artificial Intelligence Research 21, pp. 135–191.
[6] (2003) Answer set optimization. In IJCAI, Vol. 3, pp. 867–872.
[7] (1994) Penalty logic and its link with Dempster-Shafer theory. In Uncertainty Proceedings, pp. 204–211.
[8] (1994) Possibilistic logic 1.
[9] (2017) Explainable artificial intelligence (XAI). Defense Advanced Research Projects Agency (DARPA), n.d. Web.
[10] (2010) Learning conditional preference networks. Artificial Intelligence, pp. 685–703.
[11] (2018) Voting on multi-issue domains with conditionally lexicographic preferences. Artificial Intelligence.
[12] (2018) An efficient algorithm to compute distance between lexicographic preference trees. In IJCAI, pp. 1898–1904.
[13] (2016) Learning partial lexicographic preference trees and forests over multivalued attributes. In Global Conference on Artificial Intelligence (GCAI), EPiC Series in Computing, Vol. 41, pp. 314–328.
[14] (2013) Aggregating conditionally lexicographic preferences using answer set programming solvers. In International Conference on Algorithmic Decision Theory, pp. 244–258.
[15] (2015) Learning partial lexicographic preference trees over combinatorial domains. In AAAI Conference on Artificial Intelligence (AAAI), pp. 1539–1545.
[16] (2015) Reasoning with preference trees over combinatorial domains. In International Conference on Algorithmic Decision Theory (ADT), pp. 19–34.
[17] (2018) Preference learning and optimization for partial lexicographic preference forests over combinatorial domains. In International Symposium on Foundations of Information and Knowledge Systems (FoIKS).
[18] (1992) Possibilistic constraint satisfaction problems or "how to handle soft constraints?". In Uncertainty in Artificial Intelligence, 1992, pp. 268–275.
[19] (1973) SLINK: an optimally efficient algorithm for the single-link cluster method. The Computer Journal 16 (1), pp. 30–34.