## 1. Introduction

Recommender systems have enabled people and technology to interact seamlessly, and have improved social, economic, and environmental well-being in a wide range of domains such as transportation, disaster management, markets, and e-health. These systems have become feasible due to recent advances in the artificial intelligence (AI) and machine learning communities and their application to various practical domains. However, many concerns (e.g. fairness/privacy (ProPublica2016), lack of accountability (Doshi2017), lack of interpretability (Doshi2017b)) have been looming over the credibility of such intelligent technologies for two reasons: (i) the presence of inherent biases within the data collected from people, and (ii) our inability to comprehend the complexity of human decisions. Given that many researchers have been actively developing solutions to mitigate the first concern, we focus our attention on the second concern, particularly in the context of strategic persuasion in personalized recommender systems.

The biggest challenge in modeling human decisions is that people exhibit a wide range of deviations (e.g. loss aversion, intransitive preferences, selective attention, and anchoring) from prescriptive models such as expected utility maximization, and these deviations are hard to characterize using a single mathematical framework. This results in a mismatched *mental model* within the artificial intelligence of personalized recommender systems, which leads people to mistrust their recommendations and/or interventions. Furthermore, the actions of a few selfish organizations have recently tainted the notion of persuasion (Lawler2018), since they influenced human decisions in a manner that served their own interests (e.g. Cambridge Analytica manipulated a specific subset of voters to help swing election results). In spite of this turmoil, persuasion is not necessarily evil, and has been pervasive in our society (Antioch2013) for several centuries (e.g. refer to Aristotle’s Rhetoric on the art of persuasion and its applications in education and politics (Higgins2012)). In fact, persuasive socio-technical interaction manifests whenever complex issues arise that cannot be solved by mono-teams (where all the agents are of the same type), as in several real-world applications such as transportation, disaster management systems, markets, and e-health.

Strategic information transmission has been actively pursued in the economics literature since the early 1980s. Some notable examples include strategic information transmission (SIT) games (Crawford1982), Bayesian persuasion (Kamenica2011), information disclosure (Rayo2010), and informational nudges (Coffman2015). In the classical SIT framework (Crawford1982), the sender-receiver mismatch was modeled using non-identical utilities, where a single parameter (e.g. private information) influences only the sender’s utility and is independent of the receiver’s utility. Crawford and Sobel studied Nash equilibria within the SIT framework, and found that the sender employs quantization rules to encode the signal before sending it to the receiver. In contrast to Crawford and Sobel, *information disclosure* (Rayo2010) and *persuasion* (Kamenica2011) mechanisms were proposed more recently to analyze the SIT framework in a Stackelberg setting (with the sender as the leader, and the receiver as the follower). This topic has also been studied actively by the information-theoretic community (Akyol2017b; Saritas2016; Nadendla2018) as well as by researchers in computer science (Fogg2002; Babichenko2016; Xu2016; Dughmi2017; Das2017; Oudah2018).

In this paper, we develop a mathematical framework to model the strategic interaction between a personalized recommender system and a human decision maker (user) as a Stackelberg signaling game, when the two agents have non-identical (prior) belief distributions about the choice rewards. We assume that the personalized recommender system acts as the leader and reveals a signal to the human decision maker, who then makes a decision based on both the system’s shared belief and his/her own prior belief. For the sake of tractability, we assume that the human decision maker is rational in the sense of expected utility maximization. In such a setting, our goal is to compute the equilibrium strategies of both the recommender system and the user, and to investigate conditions under which (i) the recommender system reveals manipulated information, and (ii) the user’s trust in the recommender system deteriorates after the true rewards are realized.

Note that the strategic interaction investigated in this paper is fundamentally different from the strategic interactions studied in the past literature. While aspects of persuasion between a sender and a receiver are analyzed in (Kamenica2011), that work uses Bayesian updating to construct the posterior beliefs of the sender and receiver, whereas our approach constructs the posterior belief as a convex combination of prior and signaled beliefs. Further, (Kamenica2011) assumes a symmetric information setting between the sender and receiver, while our approach assumes information asymmetry: the sender’s prior belief is private information, which forces the receiver to rely on the signal revealed by the sender. Such a framework raises trust issues at the receiver, which have not been investigated in the past literature. Our findings regarding the evolution of trust between the sender and receiver are also consistent with people’s behavior in the real world.

## 2. Problem Setup

Consider a human-AI interaction setting as shown in Figure 1, where a person (Bob) is presented with a set of choices which is also known to the AI (Alice). Although neither Alice nor Bob is aware of the true nature of the choice rewards, we assume that both agents can construct probabilistic beliefs regarding the choice rewards based on their private information. If $\boldsymbol{r} \in \mathcal{R}$ represents the state of the world, let $p_A$ and $p_B$ denote the prior beliefs at Alice and Bob respectively, defined on the probability simplex $\Delta(\mathcal{R})$ on the reward space $\mathcal{R}$. In order to make this interaction sensible, we assume that Alice has access to extrinsic private information, typically acquired through some sensing infrastructure (e.g. sensor network, social sensing), which she uses to compute her posterior belief $q_A$. This introduces information asymmetry in our problem setting, which motivates Bob to rely on Alice’s messages.

Assuming that Alice has perfect knowledge about Bob’s belief $p_B$, Alice constructs a new belief signal $s$ over the simplex $\Delta(\mathcal{R})$ based on her belief $q_A$ and shares it with Bob. Then, Bob combines his prior belief $p_B$ with the received information $s$ and constructs a posterior belief

$$q_B = \alpha\, s + (1 - \alpha)\, p_B, \tag{1}$$

where $\alpha \in [0, 1]$ is a parameter that captures Bob’s trust in Alice’s message. For example, if $\alpha = 1$, then Bob trusts Alice blindly and disregards his own prior belief regarding the choice rewards. At the other extreme, if $\alpha = 0$, then Bob distrusts Alice and makes decisions based entirely on his own prior belief.
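As a rough illustration of the belief update in Equation (1), the sketch below blends two discrete beliefs with a trust weight. It assumes that beliefs are probability vectors over a finite set of reward outcomes; the function name `posterior_belief` and the numeric values are hypothetical and are not taken from the paper.

```python
def posterior_belief(signal, prior, trust):
    """Blend Alice's signaled belief with Bob's prior belief.

    Assumption for this sketch: both beliefs are probability vectors over
    the same finite set of reward outcomes, and `trust` is the weight that
    Bob places on Alice's signal.
    """
    if not 0.0 <= trust <= 1.0:
        raise ValueError("trust must lie in [0, 1]")
    if len(signal) != len(prior):
        raise ValueError("beliefs must share the same support")
    # Convex combination of the two beliefs, entry by entry.
    return [trust * s + (1.0 - trust) * p for s, p in zip(signal, prior)]


signal = [0.7, 0.2, 0.1]  # hypothetical belief shared by Alice
prior = [0.2, 0.3, 0.5]   # hypothetical prior belief at Bob

blind_trust = posterior_belief(signal, prior, 1.0)  # Bob adopts the signal
no_trust = posterior_belief(signal, prior, 0.0)     # Bob keeps his prior
partial = posterior_belief(signal, prior, 0.5)      # something in between
print(blind_trust, no_trust, partial)
```

Because the update is a convex combination, the result remains a valid probability vector for any trust value in the unit interval.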

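Let $\pi = (\pi_1, \ldots, \pi_N) \in \Delta(\mathcal{C})$ denote the probabilistic decision rule employed by Bob, where $\Delta(\mathcal{C})$ is the probability simplex on the choice set $\mathcal{C} = \{1, \ldots, N\}$, and $\pi_i$ is the probability of picking the choice $i \in \mathcal{C}$ based on Bob’s ex-post belief $q_B$. In such a case, Alice realizes an average reward

$$U_A(s, \pi) = \sum_{i \in \mathcal{C}} \pi_i \, \mathbb{E}_{q_{A,i}}[r_i], \tag{2}$$

where $q_{A,i}$ denotes the marginal reward distribution of choice $i$ in Alice’s belief $q_A$, and $\mathbb{E}_{q_{A,i}}[r_i]$ is the marginal expectation of the reward $r_i$.

On the other hand, Bob’s ex-post utility is given by

$$U_B(s, \pi) = \sum_{i \in \mathcal{C}} \pi_i \, \mathbb{E}_{q_{B,i}}[r_i], \tag{3}$$

where $q_{B,i}$ denotes the marginal reward distribution of choice $i$ in $q_B$, and $\mathbb{E}_{q_{B,i}}[r_i]$ is the marginal expectation of $r_i$.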

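In this paper, we model the strategic interaction between Alice and Bob as a Stackelberg game with Alice as the leader and Bob as the follower, as shown below:

$$\begin{aligned} \pi^* &= \arg\max_{\pi \in \Delta(\mathcal{C})} U_B(s, \pi), \\ s^* &= \arg\max_{s \in \Delta(\mathcal{R})} U_A(s, \pi^*), \end{aligned} \tag{P1}$$

where $\Delta(\mathcal{R})$ and $\Delta(\mathcal{C})$ are the strategy spaces at Alice and Bob respectively.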

## 3. Equilibrium Analysis

### 3.1. Stage 1: Bob’s Best Response

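Given that Alice chooses a signaling strategy $s$, Bob’s best response is to choose $\pi$ such that the expected utility at Bob

$$U_B(s, \pi) = \sum_{i \in \mathcal{C}} \pi_i \left[ \alpha \, \mathbb{E}_{s_i}[r_i] + (1 - \alpha) \, \mathbb{E}_{p_{B,i}}[r_i] \right] \tag{4}$$

is maximized, where $s_i$ and $p_{B,i}$ denote the marginal reward distributions of choice $i$ under Alice’s signal $s$ and Bob’s prior $p_B$ respectively, and $\alpha$ is Bob’s trust parameter.

For ease of notation, let us denote

$$x_i = \alpha \, \mathbb{E}_{s_i}[r_i] + (1 - \alpha) \, \mathbb{E}_{p_{B,i}}[r_i] \tag{5}$$

for all $i \in \mathcal{C}$. Then, the expected utility at Bob can be rewritten as

$$U_B(s, \pi) = \sum_{i \in \mathcal{C}} \pi_i x_i. \tag{6}$$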

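In other words, the first optimization problem in (P1), which Bob wishes to solve, reduces to

$$\max_{\pi} \ \sum_{i \in \mathcal{C}} \pi_i x_i \quad \text{subject to} \quad \sum_{i \in \mathcal{C}} \pi_i = 1, \quad \pi_i \geq 0 \ \text{ for all } i \in \mathcal{C}, \tag{P2}$$

where $\boldsymbol{x} = (x_1, \ldots, x_N)$ is a vector of $N$ variables defined in Equation (5).

###### Theorem 1 ().

For a given trust parameter $\alpha$, signaling strategy $s$ and prior $p_B$, Bob’s best response (i.e. solution to Problem P2) is given by

$$\pi_i^* = \begin{cases} 1, & \text{if } i = i^*, \\ 0, & \text{otherwise}, \end{cases} \tag{7}$$

where $i^* = \arg\max_{i \in \mathcal{C}} x_i$.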
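The best response in Theorem 1 can be sketched numerically as follows. This is a minimal sketch, assuming that the quantity in Equation (5) is the trust-weighted average of the expected choice rewards under Alice's signal and under Bob's prior; the function names and reward values are hypothetical.

```python
def combined_rewards(signal_means, prior_means, trust):
    """Trust-weighted expected reward of each choice (assumed form of Eq. (5))."""
    return [trust * m + (1.0 - trust) * b
            for m, b in zip(signal_means, prior_means)]


def expected_utility(rule, x):
    """Bob's expected utility for a probabilistic decision rule."""
    return sum(p * xi for p, xi in zip(rule, x))


def best_response(signal_means, prior_means, trust):
    """Put all probability mass on the choice with the largest combined reward."""
    x = combined_rewards(signal_means, prior_means, trust)
    best = max(range(len(x)), key=lambda i: x[i])  # ties go to the lowest index
    rule = [1.0 if i == best else 0.0 for i in range(len(x))]
    return rule, x[best]


signal_means = [1.0, 3.0, 2.0]  # hypothetical mean rewards under Alice's signal
prior_means = [4.0, 1.0, 2.0]   # hypothetical mean rewards under Bob's prior

rule_half, value_half = best_response(signal_means, prior_means, 0.5)
rule_full, value_full = best_response(signal_means, prior_means, 1.0)

# Any mixed rule, such as the uniform one, cannot beat the pure best response.
x_half = combined_rewards(signal_means, prior_means, 0.5)
uniform_value = expected_utility([1.0 / 3.0] * 3, x_half)
print(rule_half, value_half, rule_full, value_full, uniform_value)
```

The comparison against the uniform rule reflects why a linear objective over the simplex is maximized at a vertex, which is the intuition behind the deterministic form of the best response.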

###### Proof.

The Lagrangian function for Problem (P2) is given by

(8) |

The dual function for Problem (P2) is given by

(9) |

Note that, for all , the above dual function acts as a lower bound to the Lagrangian function in Equation (8), which itself acts as a lower bound to the objective function .

Therefore, the dual problem to Problem P2 is given as follows:

(P3) |

Without any loss of generality, Constraint 1 in Problem (P3) can be equivalently replaced with the statement

(10) |

Since the objective of Problem (P3) is equivalent to minimizing , the optimal choice of reduces to

(11) |

Since the duality gap in a linear program is zero, the optimal value of the primal problem in (P2) is also equal to the optimal dual value, which can be obtained with Bob’s best response shown in Equation (7). ∎

### 3.2. Stage 2: Optimal Signaling at Alice

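Alice’s optimal signaling strategy is to steer Bob toward the maximum entry in the vector of her expected choice rewards, so that her expected utility

$$U_A(s, \pi^*) = \sum_{i \in \mathcal{C}} \pi_i^* \, \mathbb{E}_{q_{A,i}}[r_i] \tag{12}$$

is maximized, where $\pi^*$ is Bob’s best response to the signal $s$. For ease of notation, let us denote

$$y_i = \mathbb{E}_{q_{A,i}}[r_i] \tag{13}$$

for all $i \in \mathcal{C}$. Then, the expected utility at Alice can be rewritten as

$$U_A(s, \pi^*) = \sum_{i \in \mathcal{C}} \pi_i^* y_i. \tag{14}$$

If we denote $\boldsymbol{y} = (y_1, \ldots, y_N)$ as a vector of $N$ variables defined in Equation (13), the second optimization problem in (P1), which Alice wishes to solve, reduces to

$$\max_{s \in \Delta(\mathcal{R})} \ \sum_{i \in \mathcal{C}} \pi_i^*(s) \, y_i. \tag{P4}$$

###### Theorem 2 ().

The optimal signaling strategy at Alice is to choose a distribution $s$ such that

$$\alpha \, \mathbb{E}_{s_{i_A^*}}[r_{i_A^*}] + (1 - \alpha) \, \mathbb{E}_{p_{B,i_A^*}}[r_{i_A^*}] \ \geq \ \alpha \, \mathbb{E}_{s_i}[r_i] + (1 - \alpha) \, \mathbb{E}_{p_{B,i}}[r_i] \tag{15}$$

holds true for all $i \in \mathcal{C}$, where $i_A^* = \arg\max_{i \in \mathcal{C}} y_i$.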

###### Proof.

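Note that $\pi^*$ depends on $\boldsymbol{x}$ as shown in Equation (7), which in turn depends only on the expectation of $s$, and not on the distribution itself. In other words, it is sufficient for Alice to share the average choice rewards with Bob, instead of sharing the distribution $s$. Therefore, we henceforth assume that Alice only shares average rewards with Bob.

If we denote the average rewards as $\boldsymbol{m} = (m_1, \ldots, m_N)$ with $m_i = \mathbb{E}_{s_i}[r_i]$, then Problem (P4) reduces to the following:

$$\max_{\boldsymbol{m}} \ \sum_{i \in \mathcal{C}} \pi_i^*(\boldsymbol{m}) \, y_i. \tag{P5}$$

Since Bob chooses $\pi^*$ according to Equation (7), it is natural to identify $i_A^* = \arg\max_{i \in \mathcal{C}} y_i$ and maximize $U_A$ via choosing $\boldsymbol{m}$ such that

$$\alpha \, m_{i_A^*} + (1 - \alpha) \, \mathbb{E}_{p_{B,i_A^*}}[r_{i_A^*}] \ \geq \ \alpha \, m_i + (1 - \alpha) \, \mathbb{E}_{p_{B,i}}[r_i] \tag{16}$$

holds true for all $i \in \mathcal{C}$.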

∎

###### Corollary 1 (to Theorem 2).

Revealing the average rewards as opposed to signaling the distribution does not have any effect on the utilities at both Alice and Bob.
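To make the signaling step concrete, the following sketch searches for a vector of average rewards that makes Bob pick Alice's preferred choice. It rests on assumptions that are not stated in the paper: rewards are bounded in a known interval `[low, high]`, Bob's combined reward is the trust-weighted average of signaled and prior means, and Alice tries the truthful signal first before falling back to the most extreme signal. All names and numbers are hypothetical.

```python
def steers_bob(signal_means, bob_means, trust, target):
    """True if `target` has the largest trust-weighted reward for Bob."""
    x = [trust * m + (1.0 - trust) * b
         for m, b in zip(signal_means, bob_means)]
    return all(x[target] >= xi for xi in x)


def alice_signal(alice_means, bob_means, trust, low, high):
    """Return average rewards Alice could share, or None if none can work.

    Assumption: rewards lie in [low, high], so the most persuasive signal
    reports `high` for Alice's preferred choice and `low` for the others.
    """
    n = len(alice_means)
    target = max(range(n), key=lambda i: alice_means[i])
    if steers_bob(alice_means, bob_means, trust, target):
        return list(alice_means)  # the truthful signal already works
    extreme = [float(low)] * n
    extreme[target] = float(high)
    if steers_bob(extreme, bob_means, trust, target):
        return extreme  # a manipulated signal is needed
    return None  # Bob's trust is too low for any signal to steer him


alice_means = [1.0, 3.0]  # hypothetical mean rewards under Alice's belief
bob_means = [4.0, 1.0]    # hypothetical mean rewards under Bob's prior

full_trust = alice_signal(alice_means, bob_means, 1.0, 0.0, 5.0)
half_trust = alice_signal(alice_means, bob_means, 0.5, 0.0, 5.0)
low_trust = alice_signal(alice_means, bob_means, 0.1, 0.0, 5.0)
print(full_trust, half_trust, low_trust)
```

In this toy instance the truthful signal suffices under full trust, a distorted signal is needed at moderate trust, and no bounded signal can change Bob's choice when trust is very low.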

## 4. Trust Analysis

Before we analyze the effects of the trust parameter $\alpha$ on Alice’s signaling, we first define strategic manipulation formally in the following definition.

###### Definition 1 ().

Alice employs strategic manipulation if she chooses $s \neq q_A$.

In other words, if we replace $s$ with $q_A$ in Alice’s optimal signaling condition in Equation (15), we can find settings in which Alice employs strategic manipulation. We state this condition formally in the following corollary.

###### Corollary 2 ().

(to Theorem 2) Alice adopts strategic manipulation if there exists at least one $i \in \mathcal{C}$ such that the following condition holds true:

$$\alpha \, y_{i_A^*} + (1 - \alpha) \, \mathbb{E}_{p_{B,i_A^*}}[r_{i_A^*}] \ < \ \alpha \, y_i + (1 - \alpha) \, \mathbb{E}_{p_{B,i}}[r_i], \tag{17}$$

where $y_i = \mathbb{E}_{q_{A,i}}[r_i]$ and $i_A^* = \arg\max_{i \in \mathcal{C}} y_i$.

Note that, when $\alpha = 1$, Condition (17) does not hold true, since $y_{i_A^*} \geq y_i$ for all $i \in \mathcal{C}$. In other words, when Bob fully trusts Alice, Alice naturally has the incentive to reveal truthful information to Bob. We state this result formally in the following corollary.

###### Corollary 3 ().

(to Theorem 2) When $\alpha = 1$, Alice has no incentive to share manipulated information with Bob.
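The manipulation condition can be checked numerically, as in the sketch below. It assumes that Condition (17) compares trust-weighted averages of Alice's expected rewards and Bob's prior expected rewards; the threshold function is a derived quantity introduced here for illustration and does not appear in the paper. All names and values are hypothetical.

```python
def has_manipulation_incentive(alice_means, bob_means, trust):
    """True if a truthful signal fails to make Bob pick Alice's preferred choice."""
    n = len(alice_means)
    target = max(range(n), key=lambda i: alice_means[i])
    x = [trust * a + (1.0 - trust) * b
         for a, b in zip(alice_means, bob_means)]
    return any(xi > x[target] for xi in x)


def truthful_trust_threshold(alice_means, bob_means):
    """Smallest trust level at which the truthful signal already steers Bob."""
    n = len(alice_means)
    target = max(range(n), key=lambda i: alice_means[i])
    threshold = 0.0
    for i in range(n):
        gain = alice_means[target] - alice_means[i]  # non-negative by choice of target
        pull = bob_means[i] - bob_means[target]      # how much Bob's prior favors i
        if pull > 0:
            threshold = max(threshold, pull / (gain + pull))
    return threshold


alice_means = [1.0, 3.0]  # hypothetical mean rewards under Alice's belief
bob_means = [4.0, 1.0]    # hypothetical mean rewards under Bob's prior

incentive_mid = has_manipulation_incentive(alice_means, bob_means, 0.5)
incentive_high = has_manipulation_incentive(alice_means, bob_means, 0.7)
incentive_full = has_manipulation_incentive(alice_means, bob_means, 1.0)
threshold = truthful_trust_threshold(alice_means, bob_means)
print(incentive_mid, incentive_high, incentive_full, threshold)
```

In this example the incentive to manipulate disappears once trust exceeds roughly 0.6, and it is absent at full trust, in line with Corollary 3.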

Although Alice may reveal truthful information to Bob, he cannot observe if Alice’s signaling strategy is congruent with her prior belief . This lack of information regarding Alice’s prior belief can lead to distrust at Bob regarding Alice, especially when Bob does not obtain the desired outcomes. Therefore, in this paper, we model Bob’s trust dynamics in the following manner:

(18) |

where is the default step size, and is Bob’s regret for not obtaining the utility , which he would have obtained if he did not interact with Alice. In other words, can be computed by substituting in Equation (7).

Given Alice’s strategy , let

(19) |

denote the choice picked by Bob after interacting with Alice, as stated in Theorem 1. In such a case, Bob’s regret can be computed as

(20) |

where is the choice that Bob would have picked if he did not interact with Alice. Obviously, if Bob does not participate in this interaction, then and his regret will continue to remain zero. Therefore, we henceforth ignore this trivial case, and always assume that Bob interacts with Alice with .

As our final result, we show that the dynamical model defined in Equations (18) and (20) can potentially deteriorate Bob’s trust even though Alice is revealing information truthfully.

###### Claim 1 ().

Bob’s trust deteriorates even though Alice reveals truthful signals, whenever

(21) |

###### Proof.

If Alice is truthful, then . Therefore, we replace with in the definition of , and obtain Bob’s regret under the condition when Alice reveals its prior information truthfully regarding the choice rewards. Note that Bob’s trust deteriorates whenever . Rearranging the terms, we obtain the claimed result. ∎
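The effect described in Claim 1 can be illustrated with a small simulation. The sketch below is built on assumptions rather than on the exact form of Equations (18) and (20): trust is lowered by the step size times the regret and clipped to the unit interval, regret is the realized reward of the choice Bob would have made alone minus the realized reward of the choice he actually made, and the realized rewards stay fixed across rounds. All names and numbers are hypothetical.

```python
def bob_choice(signal_means, prior_means, trust):
    """Choice with the largest trust-weighted expected reward."""
    x = [trust * m + (1.0 - trust) * b
         for m, b in zip(signal_means, prior_means)]
    return max(range(len(x)), key=lambda i: x[i])


def update_trust(trust, regret, step):
    """Assumed additive update: positive regret lowers trust, clipped to [0, 1]."""
    return min(1.0, max(0.0, trust - step * regret))


def simulate(alice_means, bob_means, realized, trust, step, rounds):
    """Track Bob's trust when Alice always reveals her means truthfully."""
    default = bob_choice(alice_means, bob_means, 0.0)  # Bob acting alone
    history = [trust]
    for _ in range(rounds):
        picked = bob_choice(alice_means, bob_means, trust)
        regret = realized[default] - realized[picked]
        trust = update_trust(trust, regret, step)
        history.append(trust)
    return history


alice_means = [1.0, 3.0]  # Alice's (truthfully revealed) mean rewards
bob_means = [2.5, 1.0]    # Bob's prior mean rewards
realized = [2.0, 0.5]     # realized rewards, unfavorable to Alice's advice

history = simulate(alice_means, bob_means, realized, 0.8, 0.1, 4)
print(history)
```

Under these assumptions trust declines for as long as Alice's truthful advice changes Bob's choice for the worse, and it stops changing once Bob reverts to the choice he would have made on his own.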

## 5. Conclusion and Future Work

In this paper, we modeled the strategic interaction between a recommender system, Alice, and a human agent, Bob, as a Stackelberg game with Alice as the leader and Bob as the follower. We computed the equilibrium strategy at Bob, and presented sufficient conditions for Alice’s equilibrium strategy. We defined strategic manipulation as Alice choosing a signaling strategy that does not match her posterior belief, and found that Alice has no incentive to employ strategic manipulation if Bob does not distrust her. We also showed, by analyzing the dynamics of Bob’s trust, that his trust may deteriorate even when Alice reveals information truthfully, if he does not obtain a desired outcome after interacting with Alice. In our future work, we will extend these results by considering the evolution of Bob’s trust towards Alice when Bob’s rationality is characterized by human decision models as opposed to expected utility maximization.

## References

- (1) Akyol, E., Langbort, C., and Başar, T. Information-Theoretic Approach to Strategic Communication as a Hierarchical Game. Proceedings of the IEEE 105, 2 (Feb 2017), 205–218.
- (2) Antioch, G. Persuasion is Now 30 Percent of US GDP: Revisiting McCloskey and Klamer after a Quarter of a Century. Economic Round-up, 1 (2013), 1.
- (3) Babichenko, Y., and Barman, S. Computational Aspects of Private Bayesian Persuasion. ArXiv Preprint: 1603.01444 (2016).
- (4) Coffman, L., Featherstone, C. R., and Kessler, J. B. A Model of Information Nudges. Working Paper, 2015.
- (5) Crawford, V. P., and Sobel, J. Strategic Information Transmission. Econometrica (1982), 1431–1451.
- (6) Das, S., Kamenica, E., and Mirka, R. Reducing Congestion through Information Design. In 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton) (2017), IEEE, pp. 1279–1284.
- (7) Doshi-Velez, F., and Kim, B. Towards a Rigorous Science of Interpretable Machine Learning. ArXiv Preprint:1702.08608 (2017).
- (8) Doshi-Velez, F., Kortz, M., Budish, R., Bavitz, C., Gershman, S., O’Brien, D., Schieber, S., Waldo, J., Weinberger, D., and Wood, A. Accountability of AI under the Law: The Role of Explanation. ArXiv Preprint:1711.01134 (2017).
- (9) Dughmi, S. Algorithmic Information Structure Design: A Survey. ACM SIGecom Exchanges 15, 2 (2017), 2–24.
- (10) Fogg, B. J. Persuasive Technology: Using Computers to Change What We Think and Do. Ubiquity 2002, December (2002), 5.
- (11) Higgins, C., and Walker, R. Ethos, Logos, Pathos: Strategies of Persuasion in Social/Environmental Reports. Accounting Forum 36, 3 (2012), 194–208.
- (12) Angwin, J., Larson, J., Mattu, S., and Kirchner, L. Machine Bias. ProPublica (May 2016).
- (13) Kamenica, E., and Gentzkow, M. Bayesian Persuasion. American Economic Review 101, 6 (October 2011), 2590–2615.
- (14) Lawler, M., Morris, A. D., Sullivan, R., Birney, E., Middleton, A., Makaroff, L., Knoppers, B. M., Horgan, D., and Eggermont, A. A Roadmap for Restoring Trust in Big Data. The Lancet Oncology 19, 8 (2018), 1014–1015.
- (15) Nadendla, V. S. S., Langbort, C., and Başar, T. Effects of Subjective Biases on Strategic Information Transmission. IEEE Transactions on Communications 66, 12 (Dec 2018), 6040–6049.
- (16) Oudah, M., Rahwan, T., Crandall, T., and Crandall, J. W. How AI Wins Friends and Influences People in Repeated Games with Cheap Talk? In Thirty-Second AAAI Conference on Artificial Intelligence (2018).
- (17) Rayo, L., and Segal, I. Optimal Information Disclosure. Journal of Political Economy 118, 5 (2010), 949–987.
- (18) Sarıtaş, S., Yüksel, S., and Gezici, S. Quadratic Multi-Dimensional Signaling Games and Affine Equilibria. IEEE Transactions on Automatic Control 62, 2 (2016), 605–619.
- (19) Xu, H., Freeman, R., Conitzer, V., Dughmi, S., and Tambe, M. Signaling in Bayesian Stackelberg Games. In Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems (2016), International Foundation for Autonomous Agents and Multiagent Systems, pp. 150–158.
