The power of moral words: Understanding framing effects in extreme Dictator games using sentiment analysis and moral judgments

01/08/2019 ∙ Valerio Capraro, et al. ∙ Middlesex University London; Heriot-Watt University

Recent work shows that people are not motivated solely by the economic consequences of the available actions; they also have moral preferences for `doing the right thing', independently of its economic consequences. Here we add to this literature with two experiments. In Study 1 (N=567) we implement an extreme Dictator game in which the dictator either gets $0.50 and another person gets nothing, or the other way around (i.e., the other person gets $0.50 and the dictator gets nothing). We experimentally manipulate the words describing the available actions, using six words ranging from very negative (e.g., stealing) to very positive (e.g., donating) connotations. Our hypothesis is that people are reluctant to take actions described by words with a negative connotation, and eager to take actions described by words with a positive connotation, independently of their economic consequences. As predicted, we find that the connotation of the word has a U-shaped effect on pro-sociality. Moreover, we show that the overall U-shaped pattern, but not its details, can be explained using a technique from computational linguistics known as Sentiment Analysis. In Study 2 (N=413, pre-registered) we take a step forward and collect the self-reported moral judgment and feeling associated with each of the six words used in Study 1. We show that the rate of pro-sociality in Study 1 can be predicted from the moral judgments and feelings in Study 2 via Krupka & Weber's utility function. In sum, our findings provide additional evidence for the existence and relevance of moral preferences, confirm the usefulness of Krupka & Weber's utility function, and suggest that building bridges from computational linguistics to behavioral science can contribute to our understanding of human decision making.


1 Introduction

Decades of experimental research have shown that people are not solely motivated by a desire to maximize their own material payoff: they often pay costs to help others, even in one-shot anonymous interactions where no direct or indirect rewards seem to be at play camerer2011behavioral; fehr2006economics. This “pro-social” attitude has been fundamental to the evolution of our societies, and is now more important than ever, as we face challenges that require large-scale cooperation, such as resource depletion, climate change, and the spread of misinformation ariely2009doing; brady2017emotion; capraro2018grand; crockett2017moral; gintis2003explaining; gneezy2014avoiding; karlan2007does; kraft2015promoting; milinski2006stabilizing; pennycook2018lazy; perc2017statistical; rand2013human; tomasello2005search.

A large amount of theoretical work has tried to explain pro-social behavior in one-shot economic interactions. Most notably, theories of social preferences assume that people care not only about their own payoff, but also, to some extent, about the payoff of others. For example, inequity aversion models presume that people also care about minimizing payoff differences fehr1999theory; bolton2000erc; welfare maximization models argue that people have a tendency to maximize the sum of all payoffs charness2002understanding; engelmann2004inequality; and reciprocity models, developed for sequential games, affirm that people have a propensity to reciprocate good actions with good actions, and bad actions with bad actions dufwenberg2004theory; falk2006theory. Another stream of research proposes different solution concepts to explain deviations from self-interest, such as iterated regret minimization halpern2012iterated, the cooperative equilibrium capraro2013model, and several others capraro2015translucent; halpern2010cooperative; renou2010minimax. These models assume that people apply a decision rule other than maximizing utility. Although these decision rules differ from one another in many respects, they share one common property: under the assumption that utility is a function of the monetary payoffs, their predictions depend only on the economic consequences of the available actions.

A different perspective has been emerging in the last decade, as behavioral scientists have come to realize that people are motivated not only by the economic consequences of the available actions, but also by doing the morally right thing. For example, empirical evidence has been presented to support the hypotheses that (i) altruistic behavior in the Dictator game and cooperative behavior in the Prisoner’s dilemma are partly driven by a desire to do the “nice” thing capraro2018right; tappin2018doing, (ii) differences in behavior in the Dictator game in the “give” frame vs the “take” frame are driven by a change in people’s perception of what is the appropriate thing to do krupka2013identifying, (iii) a change in the perception of what is the morally wrong thing to do can generate framing effects among Ultimatum game responders eriksson2017costly, and (iv) moral reminders increase pro-social behavior in both the Dictator game and the Prisoner’s dilemma branas2007promoting; capraro2017right; dal2014right. A number of theoretical models have also been introduced to formalize people’s tendency towards doing the right thing alger2013homo; brekke2003economic; dellavigna2012testing; huck2012social; kessler2012norms; kimbrough2016norms; krupka2013identifying; lazear2012sorting; levitt2007laboratory.

However, it remains unclear in which situations these moral preferences can be activated by simply changing the framing of a decision problem. Capraro and Rand capraro2018right and Tappin and Capraro tappin2018doing found that moral preferences can drive framing effects in the Trade-Off game, a decision problem in which decision-makers have to decide between an equitable and an efficient allocation of resources. Eriksson et al. eriksson2017costly found that moral preferences can drive framing effects among Ultimatum game responders. Particularly interesting is the situation regarding the Dictator game, where previous research has led to mixed results. Earlier works mainly focused on whether the Dictator game in the Take frame gives rise to greater pro-sociality than the same game in the Give frame. Krupka and Weber krupka2013identifying found that participants do tend to be more pro-social in the Take frame than in the Give frame, and that this framing effect is driven by a change in the perception of what is the “socially appropriate thing to do”. However, the framing effect between the Take frame and the Give frame was not replicated by two other works dreber2013people; goerg2017framing, casting doubt on the very existence of framing effects in the Dictator game.

Here we shed light on this point with two experiments. In Study 1, we show that it suffices to change only one word in the instructions of an (extreme) Dictator game to significantly impact people’s decisions. The intuition is that people tend to be reluctant to take actions with a negative connotation (e.g., stealing), and eager to take actions with a positive connotation (e.g., donating), independently of the economic consequences that these actions bring about. In Study 2 (pre-registered), we recruit a new sample and collect self-reported moral judgments and feelings associated with each of the words used in Study 1; we show that these measures can be used to explain the framing effects observed in Study 1.

2 Study 1

The goal of Study 1 is to show that, at least in some situations, changing only one word in the instructions of a decision problem is enough to bias people’s behavior. To show this, we consider a variant of the classical Dictator game. In the standard version of the Dictator game, one player (the dictator) has to allocate an amount of money (e.g., $0.50) between her/himself and another, typically anonymous, participant. The recipient has no choice and only gets what the dictator decides to give. Conversely, in our extreme version, the dictator can opt only for the two extreme options: either s/he gets the whole $0.50 (and the other person gets nothing), or the other way around, i.e., the other person gets the $0.50 and the decision maker gets nothing. The reason we choose this extreme variant, rather than the classical one, is to put ourselves in the condition of being able to write the instructions of the decision problem in several different treatments that change only one word. This would have been hard, if not impossible, with the classical “continuous” variant, as will become clear later. In this extreme Dictator game, we test the hypothesized dependency between language and economic decisions by manipulating the words used to describe the two available actions. The intuition is that, under the same economic conditions, naming the self-regarding action with an extremely “negative” expression such as Steal from the other participant, rather than the more “neutral” expression Take from the other participant, will impact the final decision. Even though the decision makers are not actually stealing the money, they will be reluctant to take the corresponding action, because of the latent connection between the word steal and the social norms they usually obey.

We formalize this idea by relying on Sentiment Analysis (SA; pang2004sentimental; pang2002thumbs) models and resources. SA aims at studying the latent sentiment expressed through language. It has been applied to predict sales performance liu2007low, rank products and merchants mcglohon2010star, and extract the sentiment from short texts vanzo2014context. Another task of SA research concerns the automatic creation of linguistic resources providing the extent of the sentiment evoked by words in a language, through the analysis of corpora of documents. In these lexicons, sentiment is provided for each word as a real-valued polarity degree, discretized into different classes depending on the model adopted (e.g., positive, negative, neutral, conflicting). For example, the negative dimension of the verb steal will assume a consistently higher value than its positive counterpart. Such resources may thus represent a valuable source for an initial guess about the wording to adopt for our purpose.

To obtain a fine-grained representation of the sentiment polarity distribution over the words to be used, we select six words for our extreme Dictator game, ranging over different connotations, from very negative to very positive. The words are extracted from the lexical database SentiWordNet baccianella2010sentiwordnet; esuli2006sentiwordnet, a resource built upon the well-known WordNet miller1995wordnet. For each synset (i.e., a set of words sharing the same meaning: synonyms), this resource provides the corresponding sentiment, distributed over three scores, which measure the extent of positivity, negativity, and neutrality evoked by the synset. Due to the inherent ambiguity of language, the sentiment polarity of a given word can assume different scores depending on the synset it refers to. For example, according to SentiWordNet, the word estimable, when interpreted as “computed or estimated”, is represented with a sentiment polarity where the positive and negative scores are 0, while the neutral score is 1. Conversely, when the very same word is interpreted as “deserving of respect or high regard”, the corresponding positive score is 0.75, while the negative and neutral scores are 0 and 0.25, respectively. The sum of the positivity, negativity, and neutrality scores is always equal to 1.
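This sense-dependence can be sketched with a toy data structure; the sense labels below are illustrative glosses taken from the text, not SentiWordNet identifiers:

```python
# Toy SentiWordNet-style entries for "estimable", transcribed from the text.
# Scores are (positivity, negativity, neutrality), one triple per sense.
ESTIMABLE = {
    "computed or estimated":               (0.0, 0.0, 1.0),
    "deserving of respect or high regard": (0.75, 0.0, 0.25),
}

# The three scores of any synset always sum to 1.
for scores in ESTIMABLE.values():
    assert abs(sum(scores) - 1.0) < 1e-9

print(ESTIMABLE["deserving of respect or high regard"])  # (0.75, 0.0, 0.25)
```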

For our purpose, we draw six words from SentiWordNet, belonging to the following synsets: steal#1 = “take without the owner’s consent”; take#8 = “take into one’s possession”; demand#1 = “request urgently and forcefully”; give#3 = “transfer possession of something concrete or abstract to somebody”; donate#1 = “give to a charity or good cause”; and boost#2 = “be beneficial to”. In Table 1, we report the positivity, negativity, and neutrality scores associated with these synsets.

Word Synset Positivity Negativity Neutrality
steal #1 0 0.500 0.500
take #8 0 0 1.000
demand #1 0 0.250 0.750
give #3 0 0 1.000
donate #1 0.625 0 0.375
boost #2 0.250 0 0.750
Table 1: The words used in Study 1, together with the synsets in which we consider them (described in the text) and their sentiment polarities. Scores have been drawn from SentiWordNet.
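One simple way to collapse the three scores of Table 1 into a single signed connotation value (an illustrative choice on our part, not a step the paper prescribes) is positivity minus negativity:

```python
# Sentiment scores (positivity, negativity, neutrality) for the six
# synsets used in Study 1, transcribed from Table 1 (SentiWordNet).
SCORES = {
    "steal":  (0.0, 0.500, 0.500),
    "take":   (0.0, 0.0, 1.000),
    "demand": (0.0, 0.250, 0.750),
    "give":   (0.0, 0.0, 1.000),
    "donate": (0.625, 0.0, 0.375),
    "boost":  (0.250, 0.0, 0.750),
}

def net_polarity(word: str) -> float:
    """Collapse the three scores into one signed value: positivity - negativity."""
    pos, neg, _ = SCORES[word]
    return pos - neg

# Each synset's three scores sum to 1, as SentiWordNet guarantees.
assert all(abs(sum(s) - 1.0) < 1e-9 for s in SCORES.values())

# Rank the words from most negative to most positive connotation.
# (take and give tie at 0; Python's stable sort keeps their insertion order.)
ranked = sorted(SCORES, key=net_polarity)
print(ranked)  # ['steal', 'demand', 'take', 'give', 'boost', 'donate']
```

This ordering is exactly the negative-to-positive spectrum the study manipulates.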

Of course, we use SentiWordNet as a starting point to select words that promise to generate measurable framing effects. One might in fact argue that the word Take, in the context of the (extreme) Dictator Game, will be perceived to be negative and not neutral, as reported by SentiWordNet, while the word Give will be perceived to be positive; and that perhaps the word Boost would actually be perceived closer to full neutrality, since it is very rarely used in the English language in this specific context. This is indeed partly true. We will come back to this point in Study 2.

2.1 Experimental Design

Participants are randomly assigned to one of twelve conditions. In the Steal vs Don’t steal condition, they are told that there are $0.50 available and that they have to choose between two possible actions: Steal from the other participant, so that they would get the $0.50 and the other participant would get nothing; or Don’t steal from the other participant, so that the other participant would get the $0.50, while they would get nothing. It is made explicit that the other participant has no choice and will really be paid according to the decision made. Participants are also asked two comprehension questions, to make sure they understand the decision problem: one asks which choice would maximize their own payoff; the other asks which choice would maximize the other participant’s payoff. Participants failing either or both comprehension questions are automatically excluded from the survey. Those who pass the comprehension questions are asked to make the real choice. The Don’t steal vs Steal condition is identical, except that we switch the order of presentation of the options, in order to avoid order effects. The conditions Take vs Don’t take, Don’t take vs Take, Demand vs Don’t demand, Don’t demand vs Demand, Give vs Don’t give, Don’t give vs Give, Donate vs Don’t donate, Don’t donate vs Donate, Boost vs Don’t boost, and Don’t boost vs Boost are analogous. The instructions of one frame differ from those of another frame in only one word (see Appendix for full instructions). This justifies why we chose the extreme variant of the Dictator game rather than the classical “continuous” variant: a moment of reflection shows that it is very hard, if not impossible, to write, for example, the Take frame and the Give frame in a “continuous” form using instructions that differ in only one word.

A standard demographic questionnaire concludes the survey. At the end of the questionnaire, participants are given the completion code with which they can submit the survey to Amazon Mechanical Turk (AMT) and claim their payment. Payoffs are computed and paid on top of the participation fee ($0.50). The other participant was selected at random from the same sample. Therefore, participants received a payment both from their choice as decision makers and from being in the role of the “other participant” for a different participant (when making their choice, participants were not informed that they would also be in the role of the “other participant”, to avoid having this affect their decision). We refer to the Appendix for verbatim experimental instructions.

2.2 Results

2.2.1 Participants

We recruited US-based participants on the online platform Amazon Mechanical Turk (AMT). AMT studies are cheap and easy to implement, as subjects can participate remotely in an online survey taking no more than a few minutes. This allows researchers to significantly decrease the costs of the participation fee and the stakes of the experiment without invalidating the data; indeed, several works have shown that data gathered using AMT are of no lower quality than data collected in the standard laboratory arechar2018conducting; branas2018gender; horton2011online; paolacci2014inside; paolacci10runningexperiments; rand2012promise. We cleaned the collected dataset through the following two operations. First, we looked for multiple observations by checking for repeated IP addresses and repeated Turk IDs; whenever we found multiple observations, we kept only the first one (as determined by the starting date) and deleted the rest. Second, we discarded all participants who failed either or both comprehension questions. After these operations, we remained with a sample of 567 participants. The exclusion rate is in line with previously published studies using AMT horton2011online.

2.2.2 Pro-Sociality

To analyze this sample, we first build a binary variable, named Pro-Sociality, which takes value 1 whenever a participant allocates the $0.50 to the other person, and 0 whenever a participant allocates the $0.50 to him/herself. Pro-Sociality within a frame turns out not to depend on the order in which the options are presented: for example, average Pro-Sociality in the condition Steal vs Don’t steal is not statistically different from average Pro-Sociality in the condition Don’t steal vs Steal, and similarly within all other frames. Thus, in what follows, we collapse across order and refer to the union of the conditions w vs Don’t w and Don’t w vs w as the w frame.
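The coding just described can be sketched as follows; the record structure (`frame`, `allocated_to_other`) is hypothetical, invented here for illustration:

```python
# Hypothetical per-participant records for illustration only.
records = [
    {"frame": "Steal vs Don't steal", "allocated_to_other": True},
    {"frame": "Don't steal vs Steal", "allocated_to_other": False},
    {"frame": "Boost vs Don't boost", "allocated_to_other": False},
]

def pro_sociality(record) -> int:
    """1 if the participant allocated the $0.50 to the other person, else 0."""
    return 1 if record["allocated_to_other"] else 0

def frame_word(record) -> str:
    """Collapse across order: 'w vs Don't w' and 'Don't w vs w' both map to frame w."""
    first, _, second = record["frame"].partition(" vs ")
    return (second if first.startswith("Don't") else first).lower()

print([(frame_word(r), pro_sociality(r)) for r in records])
# [('steal', 1), ('steal', 0), ('boost', 0)]
```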

2.2.3 Framing effect

We then look at the effect of each frame on Pro-Sociality. Figure 1 provides visual evidence of significant variability across conditions: the minimum of Pro-Sociality is attained in the Boost frame, while its maximum is attained in the Steal frame. Table 2 reports coefficients and standard errors of pairwise logit regressions. In short, the Steal frame produces an amount of pro-sociality that is statistically significantly higher than that of any other frame. Conversely, the Boost frame gives rise to an amount of pro-sociality that is numerically lower than that of any other frame; however, the difference is statistically significant only versus the Demand, Take, and Steal frames, and marginally significant versus the Donate frame. This is reflected also in the motivations left by participants at the end of the survey to explain their decision. In the Steal frame, explanations such as “Stealing is wrong” and “I chose not to steal because I would feel like a terrible person if I had. I think it is morally wrong, even if it is 50 cents” are common; conversely, in the Boost frame, examples of explanations are “I wanted to earn the money and I felt that not boosting the other person wasn’t that bad since I wasn’t really stealing, just not boosting their earning” and “Since there are no consequences, why not keep the money. Why would I choose to ‘BOOST’ a stranger, the stranger could be not a nice person. Even though it would be essentially the same thing, it would be harder to ‘steal’ the money than to ‘not boost’”. The other frames numerically lie between the Boost frame and the Steal frame, but pairwise differences are not statistically significant.

Figure 1: Pro-Sociality across conditions in Study 1. Error bars represent standard errors of the means.
Give Donate Demand Take Steal
Boost -0.607 -0.937* -1.181** -1.234** -2.065***
(0.590) (0.560) (0.548) (0.529) (0.515)
Give -0.331 -0.575 -0.628 -1.458***
(0.490) (0.477) (0.455) (0.438)
Donate -0.244 -0.297 -1.128***
(0.440) (0.416) (0.397)
Demand -0.053 -0.884**
(0.400) (0.381)
Take -0.830**
(0.352)
Table 2: Pairwise logit regression predicting the effect on Pro-Sociality of passing from one frame of Study 1 to the other one. We report coefficients and, in brackets, standard errors. Significance thresholds: *: p < 0.1, **: p < 0.05, ***: p < 0.01.
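Because each pairwise regression has a single binary frame dummy, its coefficient is simply the log odds ratio of the two frames' pro-sociality rates. A minimal sketch, with made-up rates (the per-frame rates are not reproduced here):

```python
import math

def pairwise_logit_coef(p_a: float, p_b: float) -> float:
    """Coefficient of a logit regression of Pro-Sociality on a dummy for
    frame B vs frame A: the log odds ratio of the two pro-sociality rates."""
    odds = lambda p: p / (1.0 - p)
    return math.log(odds(p_b) / odds(p_a))

# Illustrative rates: a frame with 20% pro-sociality vs one with 50%.
coef = pairwise_logit_coef(0.20, 0.50)
print(round(coef, 3))  # log(1.0 / 0.25) = log 4 ≈ 1.386
```

A positive coefficient thus means frame B elicits more pro-sociality than frame A, matching the sign convention of Table 2.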

2.2.4 The framing effect is not driven by a potential endowment effect

We conclude the analysis of Study 1 with an observation regarding a potential endowment effect. Classical studies have shown that people ascribe more value to things they own than to the same things possessed by someone else, the so-called endowment effect kahneman1990experimental. This might have affected our results along the following channel: although our experimental instructions make it clear that the money is not initially given to either of the players, it is possible that the very use of words such as give, donate, and boost generates, among dictators, a feeling of endowment that is not present when words such as demand, take, and steal are used.

However, Table 2 shows that the framing effect is not driven by a potential endowment effect, because the framing effect remains present even if we restrict the analysis only to negative words (the Steal frame gives rise to significantly more pro-sociality than the Take and Demand frames) and, to a lesser extent, to positive words (the Donate frame gives rise to marginally significantly more pro-sociality than the Boost frame).

3 Study 2

Study 1 shows that changing only one word in the instructions of an extreme Dictator game can significantly alter people’s decisions. The intuition is that people are reluctant to take actions described using negative words and eager to take actions described using positive words. Positivity and negativity of a word were defined using the SentiWordNet database. However, although SentiWordNet turns out to be a useful starting point for selecting words that promise to generate framing effects, one soon notices that the details of the framing effect are not well explained. For example, according to SentiWordNet, the word Take is neutral, while the word Boost is positive. Consequently, every utility function that is monotone in the SentiWordNet polarity predicts a higher rate of pro-sociality in the Boost condition than in the Take condition, which is exactly the opposite of what Study 1 finds.

In retrospect, this is not surprising, because SentiWordNet is based on the notion of synset (i.e., set of synonyms), whose words share the same semantics, while the very same lexical surface can evoke different semantics (i.e., a word can belong to different synsets). A straightforward example, as mentioned earlier, is the word “estimable”, which can assume different meanings depending on the context in which it is uttered. It might thus be the case that the positivity and negativity scores of a word considered within a SentiWordNet synset do not exactly correspond to those perceived by the people participating in our economic experiment, where the corresponding words are used in a given, specific context.

To clarify this point, in Study 2 we collect context-dependent sentiment polarities using a questionnaire. The goal is to obtain a better description of the results of Study 1, from both a quantitative and a conceptual viewpoint. Collecting context-dependent sentiment polarities raises a non-trivial question: how can we measure these polarities? In this work, we decided to focus on two dimensions that we believed were likely to be important. One dimension, motivated by the aforementioned literature on moral preferences, is the moral judgment associated with each of the six words used in Study 1. The other, motivated by the literature on sentiment analysis, is the emotional reaction associated with each of the same six words. (Note that we are neither stating that these dimensions are orthogonal, nor that they are complete.)

3.1 Experimental design

Participants are randomly divided between two conditions: the Moral judgment condition and the Feeling condition. Participants in the Moral judgment condition are presented, in random order, with the instructions of all six frames of the extreme Dictator game of Study 1. For each frame word w (e.g., steal), after reading the instructions of the corresponding extreme Dictator game, participants are asked the following two questions, in random order:

From a moral point of view, how would you judge the choice: to steal?

From a moral point of view, how would you judge the choice: not to steal?

Answers are collected using a 5-point Likert scale: 1 = “extremely wrong”, 2 = “somewhat wrong”, 3 = “neutral”, 4 = “somewhat right”, and 5 = “extremely right”. After answering all twelve questions (two for each frame), participants complete the demographic questionnaire that ends the survey. Note that, in this study, participants do not make any decision.

The Feeling condition is similar to the Moral judgment condition, apart from the fact that participants, instead of the moral judgment questions reported above, are asked the following questions:

How would you describe your feeling if you choose: to steal?

How would you describe your feeling if you choose: not to steal?

Answers are collected using a 5-point Likert scale: 1 = “extremely negative”, 2 = “somewhat negative”, 3 = “neutral”, 4 = “somewhat positive”, and 5 = “extremely positive”.

This design was pre-registered, as part of a larger study, at https://aspredicted.org/3nf3b.pdf.

3.2 Results

3.2.1 Participants

We recruited 506 US-based participants on AMT. None of them had participated in Study 1. After eliminating multiple IP addresses and multiple Turk IDs, we remained with 413 observations (female = 40.8%, mean age = 34.6). Note that we pre-registered a sample size of 300 participants. However, since this was a non-incentivized survey in which all participants were paid the same amount, at the end of the survey we did not give participants a completion code with which to claim their payment. This probably confused some participants, who finished the survey (so their data were automatically saved on Qualtrics) but did not submit it to AMT for fear of being rejected, and thus were not counted by the AMT participant counter. This implies that we had more participants than planned. It does not invalidate the data, because the completion code is at the end of the survey.

3.2.2 Comparing context-dependent moral judgments and feelings with SentiWordNet polarities

Before testing our main hypotheses, as a first step of our analysis, we compare the self-reported context-dependent moral judgments and feelings with the sentiment polarities from SentiWordNet. Table 3 reports the values of Judgm and Feel for each of the six frames and compares them with the SentiWordNet sentiment polarities. We recall that we collected moral judgments and feelings using a 5-point Likert scale, where 3 = “Neutral”. Thus, to favor interpretability (and comparability with SentiWordNet polarities), in Table 3 we normalize these measures to lie between -1 and 1, where 0 corresponds to neutral.
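Assuming the normalization is the natural linear map from the 5-point scale onto [-1, 1] (our reading of the text; the paper does not spell out the formula), it can be sketched as:

```python
def normalize_likert(x: int) -> float:
    """Map a 5-point Likert response (1..5, with 3 = neutral) onto [-1, 1]."""
    return (x - 3) / 2.0

assert normalize_likert(1) == -1.0  # "extremely wrong" / "extremely negative"
assert normalize_likert(3) == 0.0   # "neutral"
assert normalize_likert(5) == 1.0   # "extremely right" / "extremely positive"
```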

Word SentiWordNet Judgm Feel
Steal -0.5 -0.43 -0.18
Demand -0.25 -0.17 -0.05
Take 0 -0.25 0.02
Give 0 0.36 0.11
Boost 0.25 0.28 0.07
Donate 0.625 0.41 0.15
Table 3: Self-reported context-dependent moral judgments (Judgm) and feelings (Feel) collected in Study 2, compared with SentiWordNet sentiment polarities.

We can draw some observations, starting with Judgm. Steal is the most negative word according to both SentiWordNet and Judgm, and the magnitudes are very similar (-0.5 according to SentiWordNet; -0.43 according to Judgm). Conversely, Donate is the most positive word according to both measures, although quantitatively different (0.625 according to SentiWordNet; 0.41 according to Judgm). The two measures also give similar results for Demand (-0.25 according to SentiWordNet; -0.17 according to Judgm) and Boost (0.25 according to SentiWordNet; 0.28 according to Judgm). The difference between Judgm and SentiWordNet is more evident for the words Give and Take, which appear to be neutral according to SentiWordNet, but not according to Judgm: the former turns out to be positive (Judgm(Give) = 0.36), the latter negative (Judgm(Take) = -0.25). It is thus clear that the context has a non-trivial effect on the polarity of the word. The variable Feel behaves similarly to Judgm (the magnitudes are ordered in the same way, except for Demand and Take, which are switched), although its magnitudes tend to be closer to zero.

3.2.3 Predicting pro-sociality from self-reported moral judgments

Here we explore whether the moral judgments collected in Study 2 can be used to explain pro-sociality in Study 1.

To do so, we define the following sets. Let W+ denote the set of wordings corresponding to the pro-social action (e.g., Don’t steal, Give). Similarly, let W- denote the set of wordings corresponding to the pro-self action (e.g., Steal, Don’t give). For each w in W+, let w' be the corresponding word in W- (for example, if w = Don’t steal, then w' = Steal).

We test whether pro-sociality in Study 1 can be predicted by the difference Judgm(w) - Judgm(w').

To this end, we first compute, from the dataset of Study 2, the value Judgm(w) - Judgm(w') for each of the six frames.

First of all, we note that a logit regression finds an overall significant effect of the variable Judgm(w) - Judgm(w') on Pro-Sociality (coeff = 2.218, z = 4.867, p < 0.001).

However, an overall effect does not automatically imply that the details of the effect are well explained. To shed light on this point, we next explore the effect of Judgm(w) - Judgm(w') on Pro-Sociality for every pair of frames. Table 4 reports coefficients and standard errors of pairwise logit regressions predicting Pro-Sociality as a function of Judgm(w) - Judgm(w') for every pair of frames. Comparing Table 4 with Table 2, we observe that all the significance levels are exactly the same, suggesting that Judgm(w) - Judgm(w') is a good predictor of Pro-Sociality, even at a point-wise level. To strengthen this conclusion, we observe that the coefficients in Table 4 are all similar in magnitude, as shown by a meta-analysis over the coefficients and standard errors, which finds a statistically significant overall effect size of 2.238 and, crucially, no heterogeneity in the coefficients. Note that this coefficient obtained via meta-analysis of the logit coefficients (2.238) is very similar to the overall coefficient obtained via logit regression (2.218). This suggests that, in our experiment, the same coefficient for the variable Judgm(w) - Judgm(w') can explain the rate of pro-sociality in all the frames of Study 1.

Give Donate Demand Take Steal
Boost 5.514 5.858* 4.219** 2.805** 2.681***
(5.362) (3.502) (1.960) (1.203) (0.669)
Give 6.614 3.381 1.902 2.209***
(9.807) (2.805) (1.378) (0.663)
Donate 2.033 1.061 1.848***
(3.665) (1.484) (0.651)
Demand 0.332 1.803**
(2.498) (0.777)
Take 2.516**
(1.068)
Table 4: Pairwise logit regressions predicting the effect of Judgm(w) - Judgm(w') on Pro-Sociality, for every pair of frames. We report coefficients and, in brackets, standard errors. Significance thresholds: *: p < 0.1, **: p < 0.05, ***: p < 0.01.
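The pooled coefficient can be reproduced, up to rounding, by standard inverse-variance (fixed-effect) pooling of the Table 4 estimates; we assume this is the weighting scheme the meta-analysis uses:

```python
# (coefficient, standard error) pairs transcribed from Table 4.
COEFS_SES = [
    (5.514, 5.362), (5.858, 3.502), (4.219, 1.960), (2.805, 1.203), (2.681, 0.669),
    (6.614, 9.807), (3.381, 2.805), (1.902, 1.378), (2.209, 0.663),
    (2.033, 3.665), (1.061, 1.484), (1.848, 0.651),
    (0.332, 2.498), (1.803, 0.777),
    (2.516, 1.068),
]

def fixed_effect(pairs):
    """Pooled estimate and standard error, weighting each coefficient by 1/se^2."""
    weights = [1.0 / se**2 for _, se in pairs]
    pooled = sum(w * b for w, (b, _) in zip(weights, pairs)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5
    return pooled, pooled_se

pooled, pooled_se = fixed_effect(COEFS_SES)
print(round(pooled, 3))  # ≈ 2.238, matching the overall effect size reported in the text
```

Note how the precise estimates (small standard errors, mostly the comparisons against Steal) dominate the pooled value.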

3.2.4 Predicting pro-sociality from self-reported feelings

Here we explore whether the feelings collected in Study 2 can be used to explain pro-sociality in Study 1.

Following a procedure similar to that of Section 3.2.3, we first note that a logit regression finds an overall significant effect of the variable Feel(w) - Feel(w') on Pro-Sociality (coeff = 4.253, z = 4.659, p < 0.001).

As before, an overall effect does not automatically imply that the details of the effect are well explained. Therefore, to shed light on this point, we next explore the effect of Feel(w)-Feel(don't w) on Pro-Sociality for every pair of frames. Table 5 reports coefficients and standard errors of pairwise logit regressions predicting Pro-Sociality as a function of Feel(w)-Feel(don't w) for every pair of frames. Again, comparing Table 5 with Table 2, we find that all the significance levels are the same. Furthermore, the coefficients are all similar in magnitude, as shown by a meta-analysis over the coefficients and standard errors, which reveals a statistically significant overall effect of 4.312 and, crucially, no evidence of heterogeneity in the coefficients. Note that also in this case the coefficient obtained via meta-analysis of the logit coefficients (4.312) is very similar to the overall coefficient obtained via logit regression (4.253). This suggests that, in our experiment, the same coefficient for this variable can explain the rate of pro-sociality in all the frames of Study 1.

           Give      Donate    Demand     Take       Steal
Boost      7.582     6.695*    7.383**    17.635**   5.433***
           (7.372)   (4.002)   (3.428)    (7.562)    (1.355)
Give                 5.512     7.184      -62.789    4.861***
                     (8.172)   (5.961)    (45.472)   (1.460)
Donate                         12.201     -4.245     4.698***
                               (21.995)   (5.939)    (1.655)
Demand                                    0.591      4.016**
                                          (4.442)    (1.723)
Take                                                 2.679**
                                                     (1.137)
Table 5: Pairwise logit regressions predicting Pro-Sociality as a function of Feel(w)-Feel(don't w), for every pair of frames. We report coefficients and, in brackets, standard errors. Significance thresholds: *: p < 0.1, **: p < 0.05, ***: p < 0.01.
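As an illustration, the inverse-variance (fixed-effect) pooling underlying both meta-analyses can be reproduced from the coefficients and standard errors in Tables 4 and 5. The sketch below implements a standard fixed-effect estimator; it is not the authors' actual analysis script, and the variable names are ours.

```python
import math

def pooled(coeffs, ses):
    """Inverse-variance-weighted pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    est = sum(w * b for w, b in zip(weights, coeffs)) / sum(weights)
    return est, 1.0 / math.sqrt(sum(weights))

# Table 4: pairwise coefficients for Judgm-based regressions, row by row,
# with the corresponding standard errors.
judgm_b  = [5.514, 5.858, 4.219, 2.805, 2.681,
            6.614, 3.381, 1.902, 2.209,
            2.033, 1.061, 1.848,
            0.332, 1.803,
            2.516]
judgm_se = [5.362, 3.502, 1.960, 1.203, 0.669,
            9.807, 2.805, 1.378, 0.663,
            3.665, 1.484, 0.651,
            2.498, 0.777,
            1.068]

# Table 5: pairwise coefficients for Feel-based regressions.
feel_b  = [7.582, 6.695, 7.383, 17.635, 5.433,
           5.512, 7.184, -62.789, 4.861,
           12.201, -4.245, 4.698,
           0.591, 4.016,
           2.679]
feel_se = [7.372, 4.002, 3.428, 7.562, 1.355,
           8.172, 5.961, 45.472, 1.460,
           21.995, 5.939, 1.655,
           4.442, 1.723,
           1.137]

est_j, _ = pooled(judgm_b, judgm_se)
est_f, _ = pooled(feel_b, feel_se)
print(round(est_j, 3))  # pooled Judgm coefficient, ~2.238
print(round(est_f, 3))  # pooled Feel coefficient, ~4.312
```

Both pooled estimates agree, up to rounding, with the values reported in the text (2.238 and 4.312).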

4 Discussion

In Study 1 we have shown that it suffices to change only one word in the instructions of an extreme Dictator game to significantly alter people’s behavior. The intuition behind this finding is that some people are reluctant to take actions labelled with words that have a negative connotation (e.g., stealing) and are eager to take actions labelled with words that have a positive connotation (e.g., donating), independently of the economic consequences that these actions bring about. In Study 1 we defined positive and negative connotations using SentiWordNet polarities. The results, however, highlighted that SentiWordNet polarities do not accurately predict the details of the effect of changing one word on people’s pro-sociality. For example, SentiWordNet polarities predict that the rate of pro-sociality in the Take condition should be lower than the rate of pro-sociality in the Boost condition, which is exactly the opposite of what we found in Study 1. In retrospect, this is not surprising: while SentiWordNet measures the polarities of synsets (i.e., sets of synonyms with the same meaning) that are still relatively general, we use the corresponding words in a particular context. Hence, it is possible that people perceive the words used to describe the available actions in this given, precise context with polarities different from those of the corresponding synset in SentiWordNet. For this reason, in Study 2 we went beyond SentiWordNet and collected self-reported, context-dependent perceptions of all the words used in Study 1, using two measures: the moral judgment and the feeling associated with each of these words. We have shown that both measures predict pro-sociality in Study 1.

Related to our work is the literature about framing effects in the Dictator game. A number of papers have explored whether the Dictator game in the Take frame gives rise to greater pro-sociality than the same game in the Give frame.[3] These papers do not manipulate only the words used to describe the available actions, but also the status quo, that is, who owns the initial endowment: in the Give frame, the endowment is initially given to the dictator, who has to decide how much of the endowment, if any, to give to the recipient; in the Take frame, the endowment is initially given to the recipient, and the dictator has to decide how much of the endowment, if any, to take from the recipient. We have shown in Section 2.2.4 that our framing effect is not driven by a potential endowment effect. Krupka and Weber krupka2013identifying found that, indeed, participants tend to be more pro-social in the Take frame than in the Give frame. Moreover, they showed that the rate of pro-sociality can be predicted by what they called the “social appropriateness” of an action, via the utility function u_i(a) = V(π(a)) + γ_i N(a), where V(π(a)) is the utility associated with the material payoff corresponding to action a, γ_i is an individual parameter representing the extent to which individual i cares about doing what he or she thinks is the appropriate thing to do, and N(a) is the degree of appropriateness of action a (which is assumed to be independent of i). However, the framing effect when passing from the Take frame to the Give frame was not replicated by two other works dreber2013people; goerg2017framing. These mixed results thus left open the question of whether framing effects actually exist in the Dictator game and whether they can actually be explained in terms of “social appropriateness” or similar constructs. Our work sheds light on this topic.
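Krupka and Weber's utility function, u_i(a) = V(π(a)) + γ_i N(a), can be sketched numerically. In the minimal example below, the payoffs match the extreme Dictator game ($0.50 vs $0.00), while the appropriateness ratings N(a) and the values of γ are hypothetical, chosen only to illustrate how a sufficiently large concern for appropriateness flips the utility-maximizing choice.

```python
def utility(payoff, appropriateness, gamma, V=lambda x: x):
    """u_i(a) = V(pi(a)) + gamma_i * N(a); V is taken as the identity here."""
    return V(payoff) + gamma * appropriateness

# Hypothetical appropriateness ratings on a -1..1 scale.
actions = {
    "steal":       {"payoff": 0.50, "N": -0.8},  # selfish action, judged wrong
    "don't steal": {"payoff": 0.00, "N": 0.6},   # pro-social action, judged right
}

for gamma in (0.1, 1.0):  # low vs high concern for doing the appropriate thing
    best = max(actions,
               key=lambda a: utility(actions[a]["payoff"], actions[a]["N"], gamma))
    print(gamma, "->", best)  # low gamma -> steal; high gamma -> don't steal
```

With γ = 0.1 the material payoff dominates and the selfish action maximizes utility; with γ = 1.0 the appropriateness term dominates and the pro-social action does, mirroring how frames that make an action feel more wrong can shift behavior.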
We indeed found that it is enough to change only one word in the instructions of an extreme Dictator game to alter people’s behavior, and that, somewhat in line with Krupka and Weber krupka2013identifying, the rate of pro-sociality can be predicted by Judgm(w)-Judgm(don't w) and by Feel(w)-Feel(don't w). However, in line with dreber2013people; goerg2017framing, we found that the Take frame does not give rise to a rate of pro-sociality significantly higher than the Give frame.

Related to our work is also the literature on language-based games. Indeed, the idea that the language used to describe the available strategies can affect people’s behavior is not new in the literature, even among theorists. Bjorndahl, Halpern & Pass bjorndahl2013language argue that outcome-based preferences do not suffice to explain some human interactions, which are instead best understood by defining the utility function on the underlying language used to describe the game. Motivated by this observation, they define the class of language-based games and study a generalization of Nash equilibrium and rationalizable strategies in these games. Although theoretically important, that paper does not provide any empirical evidence that the words used to describe the actions actually impact behavior. Our empirically oriented approach contributes to addressing this question, by showing that the language used to describe an extreme Dictator game can indeed impact people’s behavior.

Among experimentalists, a handful of papers have recently shown that minor changes in the instructions of a decision problem can significantly impact people’s behavior capraro2018right; eriksson2017costly; krupka2013identifying; tappin2018doing. With the exception of Tappin and Capraro tappin2018doing, none of these studies changed only one word in the instructions. Tappin and Capraro tappin2018doing showed that it suffices to change only one word in the instructions of the Trade-Off game (a decision problem in which there is a tension between equity and efficiency) to change people’s behavior. In this light, the current work suggests that the results of Tappin and Capraro tappin2018doing are not specific to situations in which there is a tension between equity and efficiency, but extend to situations in which there is a tension between self-interest and other-interest.

Of course, our results should be interpreted within the natural limitations of our experiment: an extreme variant of the Dictator game with $0.50 at stake. In this context, our results make the conceptual point that classical outcome-based behavioral models fail to predict people’s behavior, and one should turn to language-based models. It may well be that the move to language-based models would be unnecessary with other games or at higher stakes. For example, although previous research has found very little evidence of stake effects in several games involving pro-sociality when stakes are not too high forsythe1994fairness; carpenter2005effect; johansson2005does; branas2018gender; larney2019stake, other studies have found evidence that pro-sociality decreases at very high stakes carpenter2005effect; andersen2011stakes. Along these lines, it might be that, when facing very high stakes, people’s behavior becomes less influenced by the words used to describe the available actions. Understanding the boundary conditions of the effect of these words on people’s behavior is a primary direction for future research.

Nonetheless, we can safely conclude that language-based models should be considered as a serious alternative to outcome-based models at least in some contexts. In such settings, we believe that building bridges from behavioral science to computational linguistics can greatly improve our understanding of human decision making.

References

Appendix A Experimental Instructions of Study 1

A.1 Steal vs Don’t steal condition

There are 50 cents available. You have to choose between two possible actions:

  • STEAL FROM THE OTHER PARTICIPANT: In which case, you get the 50 cents and the other participant gets 0 cents;

  • DON’T STEAL FROM THE OTHER PARTICIPANT: In which case, you get 0 cents and the other participant gets 50 cents.

The other participant has no choice and will be paid according to your decision. No deception is used. You and the other participant will be paid according to your decision.

Here are some questions to ascertain that you understand the rules. Remember that you have to answer all of these questions correctly in order to get the completion code. If you fail any of them, the survey will automatically end and you will not get any payment.[4] A skip logic automatically eliminated from the survey all participants providing a wrong answer.

  • What choice should YOU make in order to maximize YOUR gain? [Available answers: Steal from the other participant – Don’t steal from the other participant.]

  • What choice should YOU make in order to maximize the OTHER PARTICIPANT’s gain? [Available answers: Steal from the other participant – Don’t steal from the other participant.]

Congratulations, you passed all comprehension questions. It is now time to make your choice.

  • What is your choice? [Available options: Steal from the other participant (50 cents for you, 0 cents for the other participant) / Don’t steal from the other participant (0 cents for you, 50 cents for the other participant)]

A.2 Don’t steal vs Steal condition

Identical to the previous one, with the only difference that the word “Steal” was replaced by “Don’t steal”, and the words “Don’t steal” were replaced by “Steal”. Payoffs were changed accordingly.[5] An identical scheme was adopted for the Take vs Don’t take, Don’t take vs Take, Demand vs Don’t demand, Don’t demand vs Demand, Give vs Don’t give, Don’t give vs Give, Donate vs Don’t donate, Don’t donate vs Donate, Boost vs Don’t boost, and Don’t boost vs Boost conditions.

Appendix B Experimental Instructions of Study 2

B.1 Moral judgment condition

[The following questions were asked in random order]

Steal screen

Imagine that there are 50 cents available and that you have to choose between two possible actions:

Steal from the other participant: In which case, you get the 50 cents and the other participant gets 0 cents

Don’t steal from the other participant: In which case, you get 0 cents and the other participant gets 50 cents.

Having this situation in mind, please answer the following questions [presented in random order]:

From a moral point of view, how would you judge the choice: to steal? [Available answers: Extremely wrong / Somewhat wrong / Neutral / Somewhat right / Extremely right]

From a moral point of view, how would you judge the choice: not to steal? [Available answers: Extremely wrong / Somewhat wrong / Neutral / Somewhat right / Extremely right]


Take screen

Imagine that there are 50 cents available and that you have to choose between two possible actions:

Take from the other participant: In which case, you get the 50 cents and the other participant gets 0 cents

Don’t take from the other participant: In which case, you get 0 cents and the other participant gets 50 cents.

Having this situation in mind, please answer the following questions [presented in random order]:

From a moral point of view, how would you judge the choice: to take? [Available answers: Extremely wrong / Somewhat wrong / Neutral / Somewhat right / Extremely right]

From a moral point of view, how would you judge the choice: not to take? [Available answers: Extremely wrong / Somewhat wrong / Neutral / Somewhat right / Extremely right]


Demand screen

Imagine that there are 50 cents available and that you have to choose between two possible actions:

Demand from the other participant: In which case, you get the 50 cents and the other participant gets 0 cents

Don’t demand from the other participant: In which case, you get 0 cents and the other participant gets 50 cents.

Having this situation in mind, please answer the following questions [presented in random order]:

From a moral point of view, how would you judge the choice: to demand? [Available answers: Extremely wrong / Somewhat wrong / Neutral / Somewhat right / Extremely right]

From a moral point of view, how would you judge the choice: not to demand? [Available answers: Extremely wrong / Somewhat wrong / Neutral / Somewhat right / Extremely right]


Give screen

Imagine that there are 50 cents available and that you have to choose between two possible actions:

Give to the other participant: In which case, you get 0 cents and the other participant gets 50 cents

Don’t give to the other participant: In which case, you get 50 cents and the other participant gets 0 cents.

Having this situation in mind, please answer the following questions [presented in random order]:

From a moral point of view, how would you judge the choice: to give? [Available answers: Extremely wrong / Somewhat wrong / Neutral / Somewhat right / Extremely right]

From a moral point of view, how would you judge the choice: not to give? [Available answers: Extremely wrong / Somewhat wrong / Neutral / Somewhat right / Extremely right]


Donate screen

Imagine that there are 50 cents available and that you have to choose between two possible actions:

Donate to the other participant: In which case, you get 0 cents and the other participant gets 50 cents

Don’t donate to the other participant: In which case, you get 50 cents and the other participant gets 0 cents.

Having this situation in mind, please answer the following questions [presented in random order]:

From a moral point of view, how would you judge the choice: to donate? [Available answers: Extremely wrong / Somewhat wrong / Neutral / Somewhat right / Extremely right]

From a moral point of view, how would you judge the choice: not to donate? [Available answers: Extremely wrong / Somewhat wrong / Neutral / Somewhat right / Extremely right]


Boost screen

Imagine that there are 50 cents available and that you have to choose between two possible actions:

Boost the other participant: In which case, you get 0 cents and the other participant gets 50 cents

Don’t boost the other participant: In which case, you get 50 cents and the other participant gets 0 cents.

Having this situation in mind, please answer the following questions [presented in random order]:

From a moral point of view, how would you judge the choice: to boost? [Available answers: Extremely wrong / Somewhat wrong / Neutral / Somewhat right / Extremely right]

From a moral point of view, how would you judge the choice: not to boost? [Available answers: Extremely wrong / Somewhat wrong / Neutral / Somewhat right / Extremely right]

B.2 Sentiment condition

This condition was identical to the Moral judgment condition, with the only difference that the questions were, e.g.,

How would you describe your feeling if you choose: to steal? [Available answers: Extremely negative / Somewhat negative / Neutral / Somewhat positive / Extremely positive]

How would you describe your feeling if you choose: not to steal? [Available answers: Extremely negative / Somewhat negative / Neutral / Somewhat positive / Extremely positive]