Morality is a set of principles to distinguish between right and wrong. Shared moral values form the social and cultural norms that unite social groups dehghani2016purity. Moral Foundations Theory (MFT) haidt2004intuitive; haidt2007morality provides a theoretical framework for analyzing different expressions of moral values. The theory suggests that there are at least five basic moral values, emerging from evolutionary, social, and cultural origins. These are referred to as Moral Foundations (MFs), each with a positive and a negative polarity, and include Care/Harm, Fairness/Cheating, Loyalty/Betrayal, Authority/Subversion, and Purity/Degradation (Table 1 provides details). Identifying MF in text is a relatively new challenge and past work has relied on lexical resources such as the Moral Foundation Dictionary DVN/SJTRBI_2009; fulgoni-etal-2016-empirical; xie2020text and annotated data johnson-goldwasser-2018-classification; lin2018acquiring; hoover2020moral.
Social and political science studies have repeatedly shown the correlation between ideological and political stances and moral foundation preferences DVN/SJTRBI_2009; wolsko2016red; amin2017association. For example, DVN/SJTRBI_2009 captures the correlation between political ideology and moral foundation usage, showing that Liberals have a preference for Care/Harm and Fairness/Cheating while Conservatives use all five. Our main intuition in this paper is that even when different groups use the same MF, the moral sentiment is directed towards different targets. To clarify, consider the following tweets discussing the Affordable Care Act (ACA, Obamacare).
“@SenThadCochran and I are working to protect MS small businesses from more expensive #Obamacare mandates.”
“The ACA was a life saver for the more than 130 million Americans with a preexisting condition – including covid now. Republicans want to take us back to coverage denials.”
While both tweets use the Care/Harm MF, in the top tweet (Conservative) the ACA is described as causing Harm, while in the bottom (Liberal), the ACA is described as providing the needed Care.
Our main contribution in this paper is to introduce morality frames, a representation framework for organizing moral attitudes directed at different targets, by decomposing the moral foundations into structured frames, each associated with a predicate (a specific MF) and typed roles. For example, the morality frame for Care/Harm is associated with three typed roles: entity providing care, entity needing the care, and entity causing harm. We focus on analyzing political tweets, each describing an eliciting situation that evokes the moral sentiment, and map the text to an MF, and the entities appearing in it to typed roles. Given tweets by different ideological groups discussing the same real-world situation, morality frames can provide the means to explain and compare the attitudes of the two groups. We build on the MF dataset by johnson-goldwasser-2018-classification, consisting of political tweets, and annotate each tweet with MF roles for its entities.
Identifying moral roles from text in our setting requires inferences based on political knowledge, mapping between the author’s perspectives and the judgements appearing in the text. For example, Donald Trump is likely to elicit a negative moral judgement from most Liberals and a positive one from most Conservatives, regardless of the specific moral foundation that is evoked. From a technical perspective, our goal is to model these kinds of dependencies in a probabilistic framework, connecting MF and role assignments, entity-specific sentiment polarity and repeating patterns within ideological groups (while our focus is U.S. politics, these settings could be easily adapted to capture patterns based on other criteria). We formulate these dependencies as a structured learning task and compare two relational learning frameworks, PSL bach:jmlr17 and DRaiL pacheco-goldwasser-2021-modeling. Our experiments demonstrate that modeling these dependencies, which capture political and social knowledge, results in improved performance. In addition, we conduct a thorough ablation study and error analysis to explain their impact on performance.
Finally, we demonstrate how entity-based MF analysis can help capture perspective differences based on ideological lines. We apply our model to tweets by members of Congress on the issue of Abortion and the 2021 storming of the US Capitol. Our analysis shows that while Conservative and Liberal tweets target the same entities, their attitudes are often conflicting.
2 Related Work
The use of sociological theories such as Moral Foundations Theory (MFT) (haidt2004intuitive; haidt2007morality) and Framing (entman1993framing; chong2007framing; boydstun2014tracking) in Natural Language Processing tasks has gained significant interest. MFT has been widely used to study the effect of moral values on human behavioral patterns, such as charitable donations hoover2018moral, violent protests mooijman2018moralization and social homophily dehghani2016purity. Framing is a strategy used to bias the discussion on an issue towards a specific stance by emphasizing certain aspects that prime the reader to support the stance. Framing is used to study political bias and polarization in social and news media (tsur-etal-2015-frame; baumer-etal-2015-testing; card-etal-2015-media; field-etal-2018-framing; demszky2019analyzing; fan-etal-2019-plain; roy-goldwasser-2020-weakly). MFT is frequently used to analyze political framing and agenda setting. For example, fulgoni-etal-2016-empirical analyzed framing in partisan news sources using MFT, and dehghani2014analyzing studied the difference in moral sentiment usage between liberals and conservatives. brady2017emotion found that moral/emotional political messages are diffused at higher rates on social media.
Previous works have also contributed to the detection of moral sentiments. johnson-goldwasser-2018-classification showed that policy frames boydstun2014tracking help in moral foundation prediction, hoover2020moral proposed a dataset of k tweets annotated for moral foundations, lin2018acquiring used background knowledge for moral sentiment prediction, xie2020text proposed a text based framework to account for moral sentiment change, and garten2016morality used pretrained distributed representations of words to extend the Moral Foundations Dictionary DVN/SJTRBI_2009 for detecting moral rhetoric.
While existing works study MFT at the issue and sentence level, roy-goldwasser-2021-analysis showed that there is a correlation between entity mentions and the sentence-level moral foundation in tweets by U.S. politicians. We extend this work by studying MFT directly at the entity level. Hence, our work is broadly related to work on entity-centric affect analysis (deng-wiebe-2015-joint; field2019entity; park2020multilingual).
Combining neural networks and structured inference has been explored for traditional NLP tasks such as dependency parsing chen-manning-2014-fast; weiss-etal-2015-structured; andor-etal-2016-globally; lample-etal-2016-neural and sequence labeling systems ma-hovy-2016-end; zhang-etal-2017-semi. Recently, these efforts have expanded to discourse-level tasks such as argumentation mining niculae-etal-2017-argument; widmoser-etal-2021-randomized, event/temporal relation extraction han-etal-2019-joint and discourse representation parsing liu-etal-2019-discourse. Following this trend, pacheco-goldwasser-2021-modeling introduced DRaiL, a general declarative framework for deep structured prediction, designed specifically for NLP tasks. In this paper, we use DRaiL to model moral foundations and morally-targeted entities in tweets, and find an improvement over other non-neural probabilistic graphical modeling frameworks bach:jmlr17.
|Moral Foundation||Description|
|Care/Harm||Care for others, generosity, compassion, ability to feel pain of others, sensitivity to suffering of others, prohibiting actions that harm others.|
|Fairness/Cheating||Fairness, justice, reciprocity, reciprocal altruism, rights, autonomy, equality, proportionality, prohibiting cheating.|
|Loyalty/Betrayal||Group affiliation and solidarity, virtues of patriotism, self-sacrifice for the group, prohibiting betrayal of one’s group.|
|Authority/Subversion||Fulfilling social roles, submitting to authority, respect for social hierarchy/traditions, leadership, prohibiting rebellion against authority.|
|Purity/Degradation||Associations with the sacred and holy, disgust, contamination, religious notions which guide how to live, prohibiting violating the sacred.|
3 Identifying Entity-Centric Moral Roles
3.1 Morality Frames
MFT defines a convenient abstraction of the moral sentiment expressed in a given text. Morality Frames build on MFT and provide entity-centric moral sentiment information. Rather than defining negative and positive MF polarities (e.g., CARE or HARM), we use the five MFs as frame predicates, and associate positive and negative entity roles with each frame. As described in Table 1, these roles capture information specific to each MF. For example, entity causing harm is a negative sentiment role associated with the CARE/HARM morality frame. The entities filling these roles can be individuals, collective entities, objects, activities, concepts, or legislative elements.
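To make the representation concrete, the following minimal Python sketch encodes a morality frame as a predicate with typed, polarity-tagged roles. Only the three Care/Harm roles come from the text; the class and method names (`Role`, `MoralityFrame`, `fill`) are hypothetical, and the positive polarity of the target role is an assumption based on the statement that each MF has a single negative role.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Role:
    name: str
    polarity: int  # +1 for a positive sentiment role, -1 for a negative one


@dataclass
class MoralityFrame:
    predicate: str   # the moral foundation, e.g. "Care/Harm"
    roles: tuple     # the typed roles licensed by this frame
    fillers: dict = field(default_factory=dict)  # entity text -> Role

    def fill(self, entity, role_name):
        """Assign an entity to one of this frame's roles."""
        for role in self.roles:
            if role.name == role_name:
                self.fillers[entity] = role
                return
        raise ValueError(f"{role_name!r} is not a role of {self.predicate}")


# Roles named in the text for Care/Harm (polarity of the target is assumed).
CARE_HARM_ROLES = (
    Role("target of care/harm", +1),
    Role("entity providing care", +1),
    Role("entity causing harm", -1),
)

# Frame for the first example tweet from the introduction.
frame = MoralityFrame("Care/Harm", CARE_HARM_ROLES)
frame.fill("MS small businesses", "target of care/harm")
frame.fill("#Obamacare mandates", "entity causing harm")
```

Keeping the polarity on the role rather than on the frame mirrors the design choice described above: the MF is the predicate, and sentiment is carried by the role an entity fills.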
3.2 Data Collection
We build on the dataset proposed by johnson-goldwasser-2018-classification, consisting of tweets by US politicians posted between 2016 and 2017. A subset of it (2K out of 93K) is annotated for Moral Foundations and Policy Frames boydstun2014tracking. The tweets focus on six politically polarized issues: immigration, guns, abortion, ACA, LGBTQ, and terrorism, and the party affiliations of the authors are known. We consider only labeled moral tweets, and choose the most prominent MF annotation for each tweet (some tweets are annotated for a secondary MF). Since the data contains only a few examples of the Purity/Degradation moral foundation, we collected more examples from the unlabeled segment and manually annotated them. Table 2 shows the statistics of the final dataset. The annotation process and per-topic distribution of tweets are outlined in Appendix A.
|Morals||# of Tweets||Ideology|
3.3 Entity Roles Annotation
We annotate each tweet for entities and their associated moral roles.
Annotation Schema: We set up a QA task on Amazon Mechanical Turk. Annotators were given a tweet, the associated MF label and its description. They were then presented with multiple questions, and asked to mark the answers, corresponding to our entity roles, in the tweet. Table 3 shows the questions asked for the Care/Harm case. We asked additional questions to assess the annotators’ understanding of the task. The questions for other moral foundations can be found in Appendix B.1.
|Entity Type||Question Asked to the Annotators|
|Target of care/harm||Which entity needs care, or is being harmed?|
|Entity causing harm||Which entity is causing the harm?|
|Entity provid. care||Which entity is offering/providing the care?|
|Morals||# Tweets (selection)||# Tweets (final)||# Ann/Tw (selection)||# Ann/Tw (final)||Agreement (SD) (selection)||Agreement (SD) (final)|
|Care/Harm||27||589||3||3||0.63 (0.5)||0.70 (0.5)|
|Fairness/Cheating||30||247||5.03||2.92||0.55 (0.4)||0.69 (0.5)|
|Loyalty/Betrayal||40||203||5.67||2.89||0.58 (0.3)||0.63 (0.5)|
|Authority/Subversion||50||466||4.58||2.92||0.55 (0.3)||0.60 (0.5)|
|Purity/Degradation||10||36||6||3||0.51 (0.4)||0.77 (0.6)|
# Ann/Tw refers to the number of annotations per tweet and SD refers to Standard Deviation. Each measure is reported for the annotator selection phase and for the final phase.
Quality Assurance: We provided the annotators with worked examples and hints with each question about the entity type. The interface allowed them to mark a segment of the text with one moral role only. To further improve quality, we did the annotation in two phases. In the annotator selection phase, we released a small subset of tweets for annotation. Based on these annotations, we assigned qualifications to high-performing workers and released the rest of the tweets only to them. We paid the annotators per tweet. We consider annotators to agree if they mark the same segment in the text as having the same entity role. We calculate the agreement among multiple annotators using Krippendorff’s $\alpha$ krippendorff2004measuring, where $\alpha = 1$ means perfect agreement, $\alpha = -1$ means inverse agreement, and $\alpha = 0$ is the level of chance in a tweet. Table 4 shows that the average agreement increased in the final stage. Note that the annotator agreement (Krippendorff’s $\alpha$) is calculated by comparing annotations character by character. For example, if one annotator has marked ‘President Trump’ as an answer in a tweet, and another has marked ‘Trump’, this counts as agreement on the characters ‘Trump’ but disagreement on ‘President’, although the annotators did not really disagree. This makes the agreement measurement very strict. Even so, the average agreement among annotators in the final annotation step is good (0.60 to 0.77 across moral foundations). We further refine the annotations by majority voting, as described below.
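The strictness of character-level agreement can be illustrated with a small sketch. It assumes binary per-character labels (inside or outside the marked span), complete data, and the standard nominal form of Krippendorff’s alpha; the function names and the example tweet are hypothetical, and the exact procedure used for the reported numbers may differ.

```python
from collections import Counter
from itertools import permutations


def char_labels(text, span):
    """Label each character 1 if it lies inside the marked span, else 0."""
    start = text.index(span)
    end = start + len(span)
    return [1 if start <= i < end else 0 for i in range(len(text))]


def krippendorff_alpha_nominal(ratings):
    """Nominal alpha for complete data.

    ratings: equal-length label sequences, one per annotator.
    Each position (here, one character) is a unit.
    """
    coincidences = Counter()
    for unit in zip(*ratings):
        m = len(unit)
        if m < 2:
            continue
        # Ordered pairs of values from different annotators, weighted 1/(m-1).
        for a, b in permutations(unit, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)
    totals = Counter()
    for (a, _), count in coincidences.items():
        totals[a] += count
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one unit rated by two annotators")
    observed = sum(c for (a, b), c in coincidences.items() if a != b) / n
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n * (n - 1))
    return 1.0 if expected == 0 else 1.0 - observed / expected


tweet = "President Trump failed us"  # hypothetical example text
first = char_labels(tweet, "President Trump")
second = char_labels(tweet, "Trump")
alpha = krippendorff_alpha_nominal([first, second])
```

Although both annotators point at the same entity, the ten characters of ‘President ’ count as disagreements, so `alpha` falls well below 1 for this pair.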
Annotation Results: A tweet is annotated by at least three annotators. We define a text span to be an entity E, having a moral role M, in tweet T, if it is annotated as such by at least two annotators. This way, we found (T, E, M) tuples.
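The aggregation rule above can be sketched as a simple vote count. This is an illustrative sketch, not the released pipeline: it assumes annotations arrive as (annotator, tweet, span, role) records and that two annotators agree only when the span text matches exactly; the function name `aggregate_annotations` is hypothetical.

```python
from collections import Counter


def aggregate_annotations(annotations, min_votes=2):
    """Keep (tweet, span, role) tuples marked by at least min_votes annotators.

    annotations: iterable of (annotator_id, tweet_id, span_text, role).
    """
    votes = Counter()
    seen = set()
    for annotator, tweet, span, role in annotations:
        key = (annotator, tweet, span, role)
        if key in seen:  # count each annotator at most once per tuple
            continue
        seen.add(key)
        votes[(tweet, span, role)] += 1
    return {item for item, count in votes.items() if count >= min_votes}


example = [
    ("a1", "t1", "#Obamacare mandates", "entity causing harm"),
    ("a2", "t1", "#Obamacare mandates", "entity causing harm"),
    ("a3", "t1", "#Obamacare mandates", "target of care/harm"),
    ("a1", "t1", "MS small businesses", "target of care/harm"),
]
accepted = aggregate_annotations(example)
```

In this example only the tuple supported by two annotators survives; the span marked by a single annotator is dropped.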
To compare the partisanship of MFs and MF roles, we calculate z-scores for the proportions of MFs and MF roles on the left and right, and treat them as partisan scores (- right, + left). The partisan scores for common MFs and their corresponding most partisan (role: entity) tuples are shown in Table 5. The results of this analysis align with our intuition: moral sentiment towards entities can be more indicative of partisanship than the high-level MFs. In Table 6, we present the top-5 most used entities per role by political party for Care/Harm. The target entities of moral roles vary significantly across parties. Details for other MFs and z-scores are in Appendix B.2.
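One common way to compute such a score is the pooled two-proportion z-statistic. The sketch below assumes that form; the text does not state which variant is used, and the function name and the counts in the example are hypothetical.

```python
import math


def partisan_score(count_left, total_left, count_right, total_right):
    """Pooled two-proportion z-score; positive leans left, negative leans right."""
    p_left = count_left / total_left
    p_right = count_right / total_right
    pooled = (count_left + count_right) / (total_left + total_right)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_left + 1 / total_right))
    if std_err == 0:
        return 0.0  # the item is used by everyone or by no one
    return (p_left - p_right) / std_err


# Hypothetical counts: a (role: entity) pair in 30 of 100 left tweets
# and in 10 of 100 right tweets.
score = partisan_score(30, 100, 10, 100)
```

The sign convention follows the text: a positive score marks an MF or (role: entity) pair as more typical of the left, a negative one as more typical of the right.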
|Topic||Common MF||Most Partisan Role:Entity|
|(Partisan score)||Right (-)||Left (+)|
|Entity Types||Most Frequent Entities in Left||Most Frequent Entities in Right|
|Target of care/harm||20 million Americans; our families; woman; innocent people; #domesticviolence victims||law-abiding Americans; victims and their families; small businesses; patients; Paris|
|Entity causing harm||gun show loopholes; gun violence; terrorist attack; mass-shootings; suspected terrorists||Radical Islamic terrorists; #Obamacare mandates; Brussels attacks; #ISIS; ISIL-Inspired Attacks|
|Entity providing care||gun show loophole bills; Affordable Care Act; #ImmigrationReform; Democrats; commonsense gun legislation||@RepHalRogers: Bill; @HouseGOP; Senate; @WaysandMeansGOP; HR 240|
We propose a relational learning model for identifying morality frames in text. We begin by defining our relational structure (Sec. 4.1) and proceed to describe its implementation using relational learning tools (Sec. 4.2).
4.1 Relational Model for Morality Frames
Statistical Relational Learning (SRL) methods attempt to model a joint distribution over relational data, and have proven useful in tasks where contextualizing information and interdependent decisions can compensate for a low number of annotated examples (deng-wiebe-2015-joint; johnson2016identifying; johnson-goldwasser-2018-classification; subramanian2018hierarchical). By breaking down the problem into interdependent relations, these approaches are easier to interpret than end-to-end deep learning techniques.
We propose a joint prediction framework for morality frames, modeling the dependency between MF labels and moral role instances. Following SRL conventions richardson2006markov; bach:jmlr17, we use first-order logic to describe relational properties. Specifically, a logical rule is used to define a probabilistic scoring function over the relation instances appearing in it; the full description appears in Section 4.2.
In addition, we observe that both moral foundations and entities’ moral roles depend on external factors that go beyond the text, such as author information and party affiliation. Previous work has shown that explicitly modeling party affiliation and the topics discussed is helpful for predicting moral foundations johnson-goldwasser-2018-classification. For this reason, we condition both the moral foundation and moral roles on this additional information, as shown in the rules below.
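A plausible first-order rendering of the four rules is sketched here from their verbal descriptions. The rule labels $r_1$ to $r_4$, the predicate names, and the variable names ($t$ tweet, $e$ entity, $m$ MF label, $r$ role, $i$ ideology, $k$ topic) are assumptions made for illustration and may differ from the original notation.

```latex
\begin{align*}
r_1 &: \mathrm{Tweet}(t) \Rightarrow \mathrm{HasMF}(t, m) \\
r_2 &: \mathrm{Tweet}(t) \wedge \mathrm{HasEntity}(t, e) \Rightarrow \mathrm{HasRole}(t, e, r) \\
r_3 &: \mathrm{Tweet}(t) \wedge \mathrm{HasIdeology}(t, i) \wedge \mathrm{HasTopic}(t, k) \Rightarrow \mathrm{HasMF}(t, m) \\
r_4 &: \mathrm{Tweet}(t) \wedge \mathrm{HasIdeology}(t, i) \wedge \mathrm{HasTopic}(t, k) \wedge \mathrm{HasEntity}(t, e) \Rightarrow \mathrm{HasRole}(t, e, r)
\end{align*}
```

Under this reading, the first two rules score labels from text alone, and the last two add the author's ideology and the topic as observed context.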
Rules $r_1$, $r_2$ condition the moral foundation label ($m$) and moral foundation role label ($r$) on the tweet ($t$) and entity ($e$), while $r_3$, $r_4$ also condition on the ideology of the author ($i$) and the topic of the tweet ($k$). Concretely, $r_4$ can be translated as “if a tweet $t$ has author ideology $i$, topic $k$, and mentions entity $e$, the entity will have moral role $r$”. Other rules can be translated similarly. Then, we explicitly model the dependencies among different decisions using the following three constraints.
($c_1$) Consistency between MF label and roles: While rules $r_1$, $r_3$ predict the MF labels, and $r_2$, $r_4$ predict the role labels, these two predictions are interdependent. Knowing the MF of a tweet limits the space of feasible roles. Likewise, knowing the role of an entity in a tweet will directly give us the MF label. For example, the presence of an entity frequently used as a harming entity indicates a higher probability of the MF label ‘Care/Harm’. We model the dependency between these two decisions using constraint $c_1$, which can be translated as “if an entity $e$, mentioned in tweet $t$, has role $r$, tied to MF $m$, then tweet $t$ will have MF label $m$”.
($c_2$) Different roles for different entities in the same tweet: Our intuition is that if multiple entities are mentioned in the same tweet, they are likely to have different roles. While this may not always hold true, we use this constraint to prevent the model from relying only on textual context, and assigning the same role to all entities.
($c_3$) Consistency in the polarity of sentiment towards an entity within a political party: Intuitively, role types have a positive or negative sentiment associated with them. For example, an entity causing harm and an entity doing betrayal carry negative sentiment. The intuitive polarity for each MF role can be found in Appendix C.1. Given the highly polarized domain that we are dealing with, we assume that regardless of the MF, an entity will likely maintain the same polarity when mentioned by a specific political party across the same topic. Constraint $c_3$ encourages this consistency, and it can be translated as: “if two tweets $t_i$, $t_j$ are written by authors of the same political ideology, on the same topic, and mention the same entity $e$, then the polarity of the roles $r_i$ and $r_j$ of $e$ in both tweets will likely be the same.” We consider two entities to be the same if they are an exact lexical match, and leave entity clustering for future work.
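The three constraints can be read as checks over a set of predictions. The sketch below is a post-hoc checker, not the inference procedure itself; the labels c1, c2, c3 follow the order in which the constraints are listed, the Care/Harm roles come from the text, and the Loyalty/Betrayal role names and all function names are illustrative assumptions.

```python
from collections import defaultdict

# Role -> (moral foundation, polarity). Care/Harm roles follow the text;
# the Loyalty/Betrayal role names are illustrative placeholders.
ROLES = {
    "target of care/harm": ("Care/Harm", +1),
    "entity providing care": ("Care/Harm", +1),
    "entity causing harm": ("Care/Harm", -1),
    "entity being loyal": ("Loyalty/Betrayal", +1),
    "entity doing betrayal": ("Loyalty/Betrayal", -1),
}


def violates_c1(tweet_mf, role):
    """c1: a role may only appear in a tweet labeled with the role's own MF."""
    return ROLES[role][0] != tweet_mf


def violates_c2(entity_roles):
    """c2 (soft): different entities in one tweet should take different roles."""
    roles = list(entity_roles.values())
    return len(set(roles)) < len(roles)


def c3_conflicts(predictions):
    """c3: the same entity keeps one polarity within an ideology and topic.

    predictions: iterable of (ideology, topic, entity, role).
    Returns the (ideology, topic, entity) keys seen with both polarities.
    """
    polarities = defaultdict(set)
    for ideology, topic, entity, role in predictions:
        polarities[(ideology, topic, entity)].add(ROLES[role][1])
    return [key for key, seen in polarities.items() if len(seen) > 1]
```

Entities are matched by exact string here, which mirrors the exact lexical match described above.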
4.2 Frameworks for Relational Learning
In this work, we experiment with two existing frameworks for modeling relational learning problems: (1) Probabilistic Soft Logic (PSL) bach:jmlr17 and (2) Deep Relational Learning (DRaiL) pacheco-goldwasser-2021-modeling. Both PSL and DRaiL are probabilistic frameworks for specifying and learning relational models using weighted logical rules, specifically Horn clauses of the form $w: P_1 \wedge P_2 \wedge \dots \wedge P_n \Rightarrow H$. Weights indicate the importance of each rule in the model, and they can be learned from data. Predicates can be closed, if they are observed, or open, if they are unobserved. Probabilistic inference is used over all rules to find the most probable assignment to open predicates. The main differences between PSL and DRaiL are: (a) In DRaiL, each rule weight is learned using a neural network, which can take arbitrary input representations, while in PSL a single weight is learned for each rule, and expressive classifiers can only be leveraged as priors; (b) DRaiL defines a shared relational embedding space, by specifying entity and relation specific encoders that are reused across all rules. In both frameworks, rules are transformed into linear inequalities corresponding to their disjunctive form, and MAP inference is defined as a linear program.
In PSL, rules are compiled into a Hinge-Loss Markov random field, defined over continuous variables. Weights can be learned using maximum likelihood estimation, maximum-pseudolikelihood estimation, or large-margin estimation. In DRaiL, rule weights are learned using neural networks. Parameters can be learned locally, by training each neural network independently, or globally, by using inference to ensure that the scoring functions for all rules result in a globally consistent decision. To learn global models, DRaiL can also employ maximum likelihood estimation or large-margin estimation. Details regarding both frameworks can be found in Appendix C.2.
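To illustrate how a weighted rule becomes a hinge-loss potential in PSL, the following sketch uses the Lukasiewicz relaxation of conjunction and implication over truth values in [0, 1]. It covers only the linear (non-squared) potential for already grounded rules; the function names are hypothetical and this is not the PSL library API.

```python
def lukasiewicz_and(*truth_values):
    """Soft conjunction: max(0, sum of values - (number of values - 1))."""
    return max(0.0, sum(truth_values) - (len(truth_values) - 1))


def distance_to_satisfaction(body, head):
    """How far the ground rule body => head is from being satisfied."""
    return max(0.0, lukasiewicz_and(*body) - head)


def hinge_loss_energy(ground_rules):
    """Weighted sum of distances; MAP inference seeks to minimize this.

    ground_rules: list of (weight, body_truth_values, head_truth_value).
    """
    return sum(
        weight * distance_to_satisfaction(body, head)
        for weight, body, head in ground_rules
    )


# A strongly supported body with a weakly predicted head is penalized,
# while a rule whose body is barely true costs nothing.
energy = hinge_loss_energy([
    (2.0, [1.0, 0.9], 0.3),
    (1.0, [0.2, 0.5], 0.0),
])
```

Because each potential is a hinge over a linear function of the open variables, minimizing the total energy is a convex problem, which is what makes inference over continuous truth values tractable.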
5 Experimental Evaluation
The goal of our relational learning framework is to identify morality frames in tweets by modeling them jointly, and derive interpretable relations between them and other contextualizing information. In this section, we compare the performance of our model with multiple baselines, and present a detailed error analysis. Then, we collect tweets on one topic (Abortion) and one event (2021 US Capitol Storming) written by US Congress members and analyze the discussion (tweets collected from https://github.com/alexlitel/congresstweets). We identify the morality frames in these tweets using our best model.
5.1 Experimental Settings
We experiment with PSL and DRaiL for modeling the rules presented in Section 4.1. In DRaiL, each rule is associated with a neural architecture, which serves as a scoring function to obtain the rule weight $w$. In the case of rules $r_1$ and $r_2$, which map tweets and entities to labels, we use a BERT encoder devlin-etal-2019-bert with a classifier on top. We use task-adaptive pretraining for BERT gururangan-etal-2020-dont, and fine-tune it on a large number of unlabeled tweets. In the case of rules $r_3$ and $r_4$, which incorporate ideology and topic information, we learn topic and ideology embeddings with one-layer feed-forward nets over their one-hot representations. Then, we concatenate the output of BERT with the topic and ideology embeddings before passing everything through a classifier. PSL, on the other hand, directly learns a single weight for each rule. Given that our rules are defined over complex inputs (tweets), we use the output of the locally trained neural nets as priors for PSL, by introducing additional rules that tie each classifier's output to the corresponding open predicate. This approach has been successfully used in previous work dealing with textual inputs sridhar-etal-2015-joint. Note that while PSL can only leverage these classifiers as priors, DRaiL continues to update the parameters of the neural nets during learning.
We model constraint $c_1$, aligning the MF and role predictions, and $c_3$, aligning role polarity, as unweighted hard constraints in both frameworks. For constraint $c_2$, we learn a weight to encourage different entities in a tweet to have different roles. PSL learns a weight directly over this rule, while in DRaiL we use a feed-forward net over the one-hot vector of the relevant MF. We compare our relational models with the following baselines.
Lexicon Matching: Direct keyword matching using the MF Dictionary (MFD) DVN/SJTRBI_2009 and a PMI-based lexicon extracted from the dataset by johnson-goldwasser-2018-classification.
Sequence Tagging: We set the MF role prediction task as a sequence tagging problem, and map each entity in a tweet to a role label. We use a BiLSTM-CRF huang2015bidirectional over the full tweet, and use the last time-step in each entity span as its emission probability.
End-to-end Classifiers: We map the text and entities, and other contextualizing features (e.g. topic), to a single label. We compare BERT-base with a task-adaptively pretrained variant (BERT-tapt), obtained using a whole-word-masking objective over the large set of unlabeled political tweets.
Multi-task: We define a single BERT encoder, and a single ideology and topic embedding, shared across the two tasks. Task-specific classifiers are used on top of these representations. Then, the loss functions of the two tasks are added into a single training objective.
|MFD + PMI||-||39.78||-||42.12|
|+ Ideo + Issue||54.81||66.13||62.83||68.34|
|+ Ideo + Issue||52.11||63.44||63.61||68.61|
|Models||Weighted F1||# of Errors|
|Swap (E1)||MFs (E2)||Role (E3)|
|+ All constr||64.98||74.39||248||736||138|
We perform 3-fold cross validation over the dataset introduced in Section 3, and show results for MF and role prediction in Table 7. First, we observe that leveraging unlabeled data for task-adaptive pretraining improves performance. Then, we find that relational models that use probabilistic inference outperform all of the other baselines for both tasks. Further, we find that modeling rules using neural nets in DRaiL, and learning their parameters with global learning, performs better than using them as priors and learning a single weight in PSL. We also include results obtained by fixing the gold labels for the MF prediction, and refer to this as a skyline. Unsurprisingly, having perfect MF information improves results for roles considerably. In this case, the candidates for each entity are reduced from 16 possible assignments to 3 or 4, which results in a much easier task. Details regarding all baselines, hyper-parameters, task-adaptive pretraining, and results per class can be found in Appendix D. Code and dataset can be found at https://github.com/ShamikRoy/Moral-Role-Prediction.
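For reference, the weighted F1 metric reported in the tables can be sketched as follows, assuming the usual definition in which per-class F1 scores are averaged with weights proportional to each class's gold support. The function name and the toy labels are hypothetical.

```python
from collections import Counter


def weighted_f1(gold, predicted):
    """Support-weighted average of per-class F1 scores."""
    support = Counter(gold)
    total = 0.0
    for label, n_gold in support.items():
        true_pos = sum(
            1 for g, p in zip(gold, predicted) if g == label and p == label
        )
        n_pred = sum(1 for p in predicted if p == label)
        precision = true_pos / n_pred if n_pred else 0.0
        recall = true_pos / n_gold
        if precision + recall == 0:
            f1 = 0.0
        else:
            f1 = 2 * precision * recall / (precision + recall)
        total += f1 * n_gold  # weight each class by its gold support
    return total / len(gold)


gold = ["Care/Harm", "Care/Harm", "Fairness/Cheating", "Fairness/Cheating"]
pred = ["Care/Harm", "Fairness/Cheating", "Fairness/Cheating", "Fairness/Cheating"]
score = weighted_f1(gold, pred)
```

Weighting by support matters here because the label distribution is skewed; rare classes such as Purity/Degradation contribute less to the aggregate than they would under a macro average.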
5.2 Ablation Study and Error Analysis
We perform an ablation study, evaluating different versions of our model by adding and removing constraints and analyzing corresponding errors. To study the effect of different rules and constraints on role prediction, we define three types of errors:
(E1) Polarity Swap: when the role of an entity with one polarity (positive/negative) is identified as one role of the opposite polarity.
(E2) Mixed MFs: when different entities of the same tweet are identified with roles from a MF other than the gold label of the tweet.
(E3) Same Roles: all of the entities in a tweet are identified to have the same role when the gold labels are different.
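The three error types can be counted with a short routine such as the one below. It is a sketch under stated assumptions: E1 and E2 are counted per entity and E3 per tweet, the Care/Harm roles come from the text, the Loyalty/Betrayal role names are placeholders, and the function name `count_errors` is hypothetical.

```python
# Role -> (moral foundation, polarity); Loyalty/Betrayal names are placeholders.
ROLES = {
    "target of care/harm": ("Care/Harm", +1),
    "entity providing care": ("Care/Harm", +1),
    "entity causing harm": ("Care/Harm", -1),
    "entity being loyal": ("Loyalty/Betrayal", +1),
    "entity doing betrayal": ("Loyalty/Betrayal", -1),
}


def count_errors(tweets):
    """tweets: dicts with 'gold_mf', 'gold' and 'pred' (entity -> role)."""
    e1 = e2 = e3 = 0
    for tweet in tweets:
        gold, pred = tweet["gold"], tweet["pred"]
        for entity, gold_role in gold.items():
            pred_mf, pred_polarity = ROLES[pred[entity]]
            if pred_polarity != ROLES[gold_role][1]:
                e1 += 1  # (E1) role of the opposite polarity
            if pred_mf != tweet["gold_mf"]:
                e2 += 1  # (E2) role belongs to a different MF
        pred_roles = {pred[entity] for entity in gold}
        if len(gold) > 1 and len(pred_roles) == 1 and len(set(gold.values())) > 1:
            e3 += 1  # (E3) one role predicted for every entity
    return {"E1": e1, "E2": e2, "E3": e3}


errors = count_errors([
    {
        "gold_mf": "Care/Harm",
        "gold": {"A": "target of care/harm", "B": "entity causing harm"},
        "pred": {"A": "target of care/harm", "B": "target of care/harm"},
    },
    {
        "gold_mf": "Care/Harm",
        "gold": {"C": "entity providing care"},
        "pred": {"C": "entity being loyal"},
    },
])
```

A single prediction can trigger more than one error type, as in the first tweet, where entity B is both polarity-swapped and part of a same-role assignment.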
The analysis is shown in Table 8. First, we see that constraint $c_1$, aligning the two decisions, does most of the heavy lifting and reduces error (E2) in all cases. Enforcing consistent polarities with $c_3$ further improves performance and reduces error (E1), for which it is designed. $c_3$ also reduces error (E3) in some models. Encouraging entities to have different roles with $c_2$ does not improve the overall performance, but it helps to reduce error (E3) when combined with the other constraints. We use a soft version of $c_2$, so it is not strictly enforced. We find that roles with negative sentiments are easier for the model to identify (Appendix D.4). Note that every MF has only one role with negative sentiment, and the model does not frequently swap role labels with different sentiments (E1). Therefore, determining the correct positive role is more challenging.
5.3 Predicting Morality Frames on New Data
To analyze the political discussion using the moral sentiment towards entities, we collected more tweets from US politicians on the topic of Abortion and around the storming of the US Capitol on Jan. 6, 2021. The Abortion tweets are from 2017 to Feb. 2021. For the US Capitol incident, we collected tweets 7 days before and after the event, with the goal of studying any change in sentiment towards entities. We took noun phrases occurring at least a minimum number of times, manually filtered out non-entities, and grouped different mentions of the same entity (Appendix D.6). We collected tweets that mentioned these entities. Statistics for the resulting data can be found in Table 9. We re-trained our model using all of our labeled data, and predicted the morality frames for each tweet in the new dataset.
We performed human evaluation on the predictions for this new data by randomly sampling tweets from each issue. This resulted in and (tweet, entity) pairs for Abortion and US Capitol, respectively. This procedure resulted in an accuracy of MF prediction of for each issue, and a role prediction accuracy of for Abortion, and for the US Capitol incident. We found that entities that appear less often in the training data have low precision for role prediction (see Table 10). Note that the US Capitol event was not observed during training, which makes it more challenging. For Abortion, we observed that Democrats mention the entity Women most, and of the time the predicted MF role is target of care/harm or fairness/cheating, and it is never assigned a negative role (possibly because of constraint $c_3$). For Republicans, we observed the same pattern for the entity Life (statistics in Appendix D.8). However, in a few cases (2.4%) Life is predicted as the entity ensuring fairness/purity/care, justified authority or being loyal. While these roles carry a positive sentiment, they are intuitively wrong predictions for Life. We found that for of such cases, there are multiple mentions of Life in the same tweet. Given that constraint $c_2$ encourages different roles for different entities in a tweet, this can be the source of this error. Examples for these cases can be found in Appendix D.9.
|Entity Types||Top Entities in Left (Before)||Top Entities in Left (After)||Top Entities in Right (Before)||Top Entities in Right (After)|
|Target of care||Citizens, Democracy, America||Capitol, Democracy, Police||America, Citizens||Capitol, America, Sicknick|
|Causing harm||Trump, Violence||Trump, Violence, Domest. terror.||-||Violence, Trump|
|Provide care||Congress, Biden, Democrats||Congress, Biden, Amendment||Congress, Trump||Police, Congress|
|Justified authority||Congress, Pelosi, Democrats||Congress, Amendment, Pence||Congress||Congress, Pence|
|Justified auth. over||-||Biden, Harris||-||-|
|Failing authority||Trump, GOP||Trump, Impeachment, GOP||Democrats, Trump, GOP||Trump, Dems., Impeachment|
|Failing auth. over||Democracy, Biden, McConnell||Democracy, Capitol, Nation||Pelosi, Citizens, America||Nation, Pelosi, Biden|
6 Analyzing Political Discussions
In this section, we first characterize the political discussion on Abortion using the predicted morality frames. Then, we analyze how an event impacts the moral sentiment towards entities by looking at the usage of MF roles before and after the 2021 US Capitol Storming for the different parties.
6.1 Characterizing Discussion on Abortion
Morality Frame Usage: We found that the left uses Fairness/Cheating the most, while the right uses Purity/Degradation. Care/Harm is the second most frequent for both parties (Appendix E.1). To analyze MF role usage, we list the most frequent entities and their most frequent moral roles in Figure 1(a). The left portrays entities related to Reproduction Freedom as the target of Fairness/Cheating, while on the right, the top target of Purity/Degradation is Life. Both parties mention Planned Parenthood frequently, but their sentiment towards it differs. To examine this further, we plot Planned Parenthood’s polarity graph in Figure 1(b). It shows that the parties express opposite sentiments towards Planned Parenthood. These findings are consistent with the known stances of Democrats and Republicans on this topic.
Entity-Relation Graph: We examine how the political discussion is framed by each party by looking at the sentiments expressed towards different entities, regardless of whether they use the same high-level MF. We look at Care/Harm, which is frequently used by both parties, and take the two most used targets by each party. We then take the top three care providing and harming entities used in the same tweet as the target. We assign the most common role for each entity, and represent it in an entity-relation graph in Figure 0(a)-0(b). Both Democrats and Republicans express care for Women, but the caring and harming entities vary highly across parties. For example, the left portrays Planned Parenthood as the caring entity, while the right portrays it as the harming entity. This analysis shows that, while there is overlap in the MFs used, the moral roles of entities can highlight the differences between parties in politically polarized discussions at an aggregate level.
6.2 Moral Response to US Capitol Storming
To analyze how the moral sentiment towards entities changed after the storming of the US Capitol on January 6, 2021, we look at the sentiment towards entities before and after the event. We found that Authority/Subversion and Care/Harm were the two most used moral foundations after the incident for both parties (Appendix E.1). In Table 11, we present the top three most frequent entities for role types under Care/Harm and Authority/Subversion, before and after the event. Entities appearing fewer than 15 times are omitted from this analysis. Our model predicted that, after the event, the left justified the authority of Mike Pence, and that violence appeared as a harming entity even before the event occurred. On the right, Trump shifted from an entity providing care prior to the event to a harming entity after the event. We show some relevant tweets and their corresponding predictions in Table 12. The entity-relation graph for each party after the event can be found in Appendix E.2.
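The before/after comparison described above can be sketched as a simple counting procedure. The snippet below is a minimal illustration, not the code used for Table 11: the function name `top_entities_by_role`, the `(period, role, entity)` tuple layout, and the choice to apply the frequency threshold to an entity's total count are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def top_entities_by_role(predictions, min_count=15, k=3):
    """Return the k most frequent entities for each (period, role) pair.

    predictions: list of (period, role, entity) tuples, e.g. model outputs
    for tweets posted "before" or "after" an event (hypothetical layout).
    Entities whose total count is below min_count are dropped (assumption:
    the threshold is applied to the overall count, not per role).
    """
    entity_totals = Counter(entity for _, _, entity in predictions)
    counts = defaultdict(Counter)
    for period, role, entity in predictions:
        if entity_totals[entity] >= min_count:
            counts[(period, role)][entity] += 1
    return {
        key: [entity for entity, _ in counter.most_common(k)]
        for key, counter in counts.items()
    }


# Toy usage with placeholder entities and a lowered threshold.
toy = (
    [("before", "Entity providing care", "entity_a")] * 3
    + [("after", "Entity causing harm", "entity_a")] * 2
    + [("after", "Entity causing harm", "entity_b")]
)
print(top_entities_by_role(toy, min_count=2))
```

In the toy call, `entity_b` is filtered out because it appears only once, which mirrors how rare entities are excluded from the analysis.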
|[Ideology-Period] (Predicted MF) Tweet Text|
In this paper, we present the first study on Moral Foundations Theory at the entity level, by assigning moral roles to entities, and present a novel dataset of political tweets that is annotated for this purpose. We propose a relational model that predicts moral foundations and the moral roles of entities jointly, and show the effectiveness of modeling dependencies and contextualizing information for this task. Finally, we analyze political discussions in the US using these predictions, and show the usefulness of our proposed schema. In the future, we intend to study how morality frames and our relational framework can be applied in other settings, where contextualizing information is not observed.
We thank Nikhil Mehta, Rajkumar Pujari, and the anonymous reviewers for their insightful comments. This work was partially supported by an NSF CAREER award IIS-2048001.
8 Ethics Statement
To the best of our knowledge, no code of ethics was violated in the annotations and experiments carried out for this paper. We used human annotation to add new labels to an existing dataset. We acknowledge the dataset appropriately, and its properties are explained thoroughly in the paper. While annotating, we respected the privacy rights of the crowd annotators and did not ask the anonymous annotators for any personal details. They were informed that the task contains potentially sensitive political content. The crowd annotators were fairly compensated with a reward per annotation. We determined a fair amount of compensation by taking the annotators’ feedback into consideration and by comparing our reward with that of other annotation tasks on the crowd-sourcing platform.
Appendix A Data Collection
Identification and Annotation of Tweets with ‘Purity/Degradation’:
To collect more tweets on Purity/Degradation, we took more examples from the unlabeled segment of the dataset (93K tweets). We then filtered these tweets by lexicon matching against the Moral Foundation Dictionary entries for Purity/Degradation. Two of the authors of this paper individually went over the filtered tweets and selected those having Purity/Degradation as the primary moral foundation. The two authors had agreement on of the cases. We then combined the two authors’ lists and resolved any disagreement by discussion. In this manner we found tweets on Purity/Degradation. We then annotated these Purity/Degradation tweets with the Policy Frames present in them in the same manner. Two authors of this paper annotated the tweets for Policy Frames individually. They had an agreement on of the cases about the primary policy frame in a tweet. Most disagreements arose in cases where more than one policy frame is present in a tweet. The authors resolved any disagreement by discussion.
Full Dataset Statistics:
The statistics of the full dataset can be found in Table 13.
|Morals||# of Tweets||Ideology||Topic|
Appendix B Data Annotation for Moral Roles
B.1 Questionnaire asked to the annotators for annotation of entity roles
The questionnaire asked to the annotators for all moral foundations can be found in Table 14.
|Moral||Entity Type||Question Asked to The Annotators|
|Care/Harm||Target of care/harm||Which entity needs CARE, or is being HARMED?|
|Care/Harm||Entity causing harm||Which entity is causing the HARM?|
|Care/Harm||Entity providing care||Which entity is offering/providing the CARE?|
|Fairness/Cheating||N/A (additional question)||Fairness or cheating on what?|
|Fairness/Cheating||Target of fairness/cheating||Fairness for whom or who is being cheated?|
|Fairness/Cheating||Entity ensuring fairness||Who or What is ensuring fairness or in charge of ensuring fairness?|
|Fairness/Cheating||Entity doing cheating||Who or What is cheating or violating the fairness?|
|Loyalty/Betrayal||N/A (additional question)||What are the phrases invoking LOYALTY?|
|Loyalty/Betrayal||N/A (additional question)||What are the phrases invoking BETRAYAL?|
|Loyalty/Betrayal||Target of loyalty/betrayal||LOYALTY or BETRAYAL to whom or what?|
|Loyalty/Betrayal||Entity being loyal||Who or what is expressing LOYALTY?|
|Loyalty/Betrayal||Entity doing betrayal||Who or what is doing BETRAYAL?|
|Authority/Subversion||N/A (additional question)||LEADERSHIP or AUTHORITY on what issue or activity?|
|Authority/Subversion||Justified authority||Which LEADERSHIP or AUTHORITY is obeyed/praised/justified?|
|Authority/Subversion||Justified authority over||If the LEADERSHIP or AUTHORITY is obeyed/praised/justified, then praised/obeyed by whom or justified over whom?|
|Authority/Subversion||Failing authority||Which LEADERSHIP or AUTHORITY is disobeyed or failing or criticized?|
|Authority/Subversion||Failing authority over||If the LEADERSHIP or AUTHORITY is disobeyed or failing or criticized, then failing to lead whom or disobeyed/criticized by whom?|
|Purity/Degradation||Target of purity/degradation||What or who is SACRED, or subject to degradation?|
|Purity/Degradation||Entity preserving purity||Who is ensuring or preserving the sanctity?|
|Purity/Degradation||Entity causing degradation||Who is violating the sanctity or who is doing degradation or who is the target of disgust?|
B.2 Calculation of Partisanship and Most Frequent Entities by Entity Role
To determine the partisanship of two kinds of elements, (1) moral foundations and (2) (moral foundation role, entity) pairs, we compute a z-score for each element across the two political ideologies (left, right). The z-score tests whether two groups (here, left and right) differ significantly on a single characteristic; in our case, the characteristics are the elements of type (1) or type (2) described above. A positive z-score means the element is left-partisan, and a negative z-score means it is right-partisan.
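The exact z-score formula is not spelled out here. One common choice for comparing how often an element occurs in two groups is the two-proportion z-test, sketched below under that assumption; the function name `partisanship_z` and its arguments are hypothetical.

```python
import math


def partisanship_z(count_left, total_left, count_right, total_right):
    """Two-proportion z-score (assumed formulation, not confirmed by the paper).

    count_*: number of tweets in a group that contain the element
    total_*: number of tweets in that group
    Positive values indicate left-partisan usage, negative values right-partisan.
    """
    p_left = count_left / total_left
    p_right = count_right / total_right
    pooled = (count_left + count_right) / (total_left + total_right)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_left + 1 / total_right))
    if std_err == 0:
        return 0.0  # element appears in every tweet or in none
    return (p_left - p_right) / std_err


# Toy counts: the element appears in 60 of 100 left tweets and 30 of 100 right tweets.
print(round(partisanship_z(60, 100, 30, 100), 3))
```

With the sign convention above, swapping the two groups flips the sign of the score, which matches the left-positive, right-negative reading of the z-score.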
Most frequent entities per moral role can be found in Table 15.
|Moral||Entity Type||Most Frequent Entity in Left||Most Frequent Entity in Right|
|Care/Harm||Target of care/harm||20 million Americans; our families; woman; innocent people; #domesticviolence victims||law-abiding Americans; victims and their families; small businesses; patients; Paris|
|Care/Harm||Entity causing harm||gun show loopholes; gun violence; terrorist attack; mass-shootings; suspected terrorists||Radical Islamic terrorists; #Obamacare mandates; Brussels attacks; #ISIS; ISIL-Inspired Attacks|
|Care/Harm||Entity providing care||gun show loophole bills; Affordable Care Act; #ImmigrationReform; Democrats; commonsense gun legislation||@RepHalRogers: Bill; @HouseGOP; Senate; @WaysandMeansGOP; HR 240|
|Fairness/Cheating||Target of fairness/cheating||woman, #LGBT community; all Americans; #FightForFamilies; other vulnerable people||the American people; small businesses; people; religious minorities in Syria and Iraq|
|Fairness/Cheating||Entity ensuring fairness||#SCOTUS decision; congress; bill to expand access; the DREAM Act; Equality Act||Senate; House; @RepHalRogers: Bill; House GOP; Supreme Court’s #HobbyLobby ruling|
|Fairness/Cheating||Entity doing cheating||anti-#LGBT laws; employer; HB 2; #HobbyLobby decision; Political attacks||#Obamacare legislation; fake ISIS passports; Planned Parenthood; the Pakistani Gov; enforcement loopholes|
|Loyalty/Betrayal||Target of loyalty/betrayal||#LGBT communities; gun safety measures; victims of #Orlando; women men and families; #StandwithPP||Paris terror attacks; senators; Israel; The American people; Syrian and Iraqi refugees|
|Loyalty/Betrayal||Entity being loyal||@SenWarren; @RepAdams; My colleagues; @SenateDems; House Democrats||@SenatorIsakson; @RepHalRogers|
|Loyalty/Betrayal||Entity doing betrayal||@HouseGOP extremist Members!; terrorists; The community of nations; @NRA||House|
|Authority/Subversion||Justified authority||POTUS; SCOTUS; President Obama’s; Senate; @HouseDems||@HouseGOP; #Senate; #SCOTUS; Congress; Republicans|
|Authority/Subversion||Justified authority over||Americans; 180 House Dems; nation; people||@SenateMajLdr; @RepHalRogers; #American; @SenateMajLdr; Inhofe|
|Authority/Subversion||Failing authority||#HouseGOP; Congress; Republicans; SCOTUS; @SpeakerRyan||President Obama; POTUS; #Obamacare; @SCOTUS; @SecBurwell’s|
|Authority/Subversion||Failing authority over||Americans; @repjohnlewis; family; @SenFeinstein’s; women; Sen Dems||Americans; @SenateMajLdr; @HouseAppropsGOP @RepHalRogers; @SenateMajLdr McConnell; @SpeakerRyan|
|Purity/Degradation||Target of purity/degradation||immigration; women||fetal body parts; lives of the unborn; baby girls|
|Purity/Degradation||Entity preserving purity||N/A (no n-grams found that occur more than 2 times)||@SenDanCoats; #MarchforLife|
|Purity/Degradation||Entity causing degradation||Donald Trump; Charleston church killings||Planned Parenthood; abortion providers; Radical Islamic terrorists|
B.3 Expressivity of Bias of Moral Roles
To examine how well moral roles account for political standpoints compared to moral foundations, we use the moral foundations (MF) and the (moral foundation role, entity) pairs (MFR) as one-hot encoded features to classify the ideology of a tweet (left/right). The results are shown in Table 16. Moral roles classify ideology considerably better than MF features alone (Macro F1 of 0.77 vs. 0.62), though below a bag-of-words (BoW) baseline (0.85), which supports the usefulness of moral roles for capturing political perspectives.
|One-hot Encoded Features||# of Features||Macro F1|
|Moral Foundation (MF)||5||0.62|
|Moral Roles (MFR)||2021||0.77|
|Bag of Words (BoW)||2478||0.85|
Predicting ideology of tweet using Logistic Regression (3-fold CV).
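As a rough illustration of the feature construction and the metric behind Table 16, the sketch below one-hot encodes (moral foundation role, entity) pairs and computes Macro F1 from gold and predicted ideology labels. The helper names `one_hot` and `macro_f1` are hypothetical, the toy inputs are placeholders, and the logistic regression classifier itself is deliberately left out.

```python
def one_hot(features_per_tweet):
    """Map each tweet's set of features to a binary vector.

    features_per_tweet: list of iterables of hashable, sortable features,
    e.g. (role, entity) tuples. Returns (vocabulary, rows).
    """
    vocab = sorted({f for feats in features_per_tweet for f in feats})
    index = {f: i for i, f in enumerate(vocab)}
    rows = []
    for feats in features_per_tweet:
        row = [0] * len(vocab)
        for f in feats:
            row[index[f]] = 1
        rows.append(row)
    return vocab, rows


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with placeholder entities.
tweets = [
    [("Entity causing harm", "entity_a")],
    [("Entity causing harm", "entity_a"), ("Target of care/harm", "entity_b")],
]
vocab, rows = one_hot(tweets)
print(vocab, rows)
print(macro_f1(["left", "left", "right", "right"], ["left", "right", "right", "right"]))
```

Because each distinct (role, entity) pair becomes its own dimension, the MFR feature space grows with the number of entities, which is consistent with the much larger feature count reported for MFR than for MF.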
Appendix C Modeling
C.1 Polarity of Moral Roles
Moral Roles with positive polarity: Target of care/harm, Entity providing care, Target of fairness/cheating, Entity ensuring fairness, Target of loyalty/betrayal, Entity being loyal, Justified authority, Justified authority over, Failing authority over, Target of purity/degradation, Entity preserving purity.
Moral Roles with negative polarity: Entity causing harm, Entity doing cheating, Entity doing betrayal, Failing authority, Entity causing degradation.
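The two lists above can be captured directly as a lookup table. The following sketch simply transcribes them; the names `POSITIVE_ROLES`, `NEGATIVE_ROLES`, and `role_polarity` are illustrative and not taken from the released code.

```python
POSITIVE_ROLES = {
    "Target of care/harm",
    "Entity providing care",
    "Target of fairness/cheating",
    "Entity ensuring fairness",
    "Target of loyalty/betrayal",
    "Entity being loyal",
    "Justified authority",
    "Justified authority over",
    "Failing authority over",
    "Target of purity/degradation",
    "Entity preserving purity",
}

NEGATIVE_ROLES = {
    "Entity causing harm",
    "Entity doing cheating",
    "Entity doing betrayal",
    "Failing authority",
    "Entity causing degradation",
}


def role_polarity(role):
    """Return 'positive' or 'negative' for a moral role label."""
    if role in POSITIVE_ROLES:
        return "positive"
    if role in NEGATIVE_ROLES:
        return "negative"
    raise KeyError(f"unknown moral role: {role}")


print(role_polarity("Failing authority over"))  # positive: the entity is the one being failed
print(role_polarity("Failing authority"))       # negative: the authority itself is criticized
```

Note that "Failing authority over" is positive while "Failing authority" is negative, since the former names the party that the failing authority lets down.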
C.2 Relational Learning Frameworks
C.2.1 Probabilistic Soft Logic
PSL models are specified using weighted Horn clauses, which are compiled into a Hinge-Loss Markov Random Field (HL-MRF), a class of undirected probabilistic graphical models. In HL-MRFs, a probability distribution is defined over continuous values in the range [0, 1], and dependencies among them are modeled using linear and quadratic hinge functions. This way, they define a probability density function:
$$P(y \mid x) = \frac{1}{Z} \exp\Big(-\sum_{r=1}^{M} w_r \, \phi_r(y, x)\Big), \qquad \phi_r(y, x) = \big(\max\{l_r(y, x), 0\}\big)^{\rho_r} \tag{1}$$
where $w_r$ is the rule weight, $Z$ is a normalization constant, and $\phi_r(y, x)$ is the hinge-loss potential corresponding to the instantiation of rule $r$, represented by a linear function $l_r$ of $x$ and $y$ and an optional exponent $\rho_r \in \{1, 2\}$. Inference in PSL is performed by finding a MAP estimate of the random variables $y$ given evidence $x$; this is done by maximizing the density function in Eq. 1, i.e., $\arg\max_{y} P(y \mid x)$. To solve this, PSL uses the Alternating Direction Method of Multipliers (ADMM), an efficient convex optimization procedure.
Weights can be learned through maximum likelihood estimation by using the structured perceptron algorithm. The partial derivative of the log of the likelihood function in Eq. 1 above with respect to a parameter $w_r$ is:
$$\frac{\partial \log P(y \mid x)}{\partial w_r} = E_{w}\big[\phi_r(y, x)\big] - \phi_r(y, x) \tag{2}$$
where $E_{w}$ is the expectation under the distribution defined by $w$. Given that computing this expectation is intractable, it is approximated by taking the values in the MAP state. This approximation makes this learning approach a structured variant of the voted perceptron. Note that alternative estimations are also supported. More details can be found in the original paper bach:jmlr17.
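To make the hinge-loss potential concrete, the sketch below evaluates it for a grounded rule of the form body_1 & ... & body_n -> head, using the Lukasiewicz relaxation that PSL applies to logical rules. The function names and the toy groundings are assumptions for illustration; this is not the PSL implementation, and the normalization constant is not computed.

```python
import math


def distance_to_satisfaction(body, head):
    """Hinge-loss potential (before exponentiation) of body_1 & ... & body_n -> head.

    body: list of soft truth values in [0, 1] for the body atoms
    head: soft truth value in [0, 1] for the head atom
    Under the Lukasiewicz relaxation the rule is violated by
    max(0, sum(body) - (n - 1) - head).
    """
    linear = sum(body) - (len(body) - 1) - head
    return max(0.0, linear)


def unnormalized_density(groundings):
    """exp(-sum_r w_r * phi_r) for a list of (weight, body, head, exponent) groundings."""
    energy = sum(
        weight * distance_to_satisfaction(body, head) ** exponent
        for weight, body, head, exponent in groundings
    )
    return math.exp(-energy)


# A fully true body with a false head is maximally violated.
print(distance_to_satisfaction([1.0, 1.0], 0.0))
# A weakly supported body does not constrain the head.
print(distance_to_satisfaction([0.5, 0.5], 0.0))
print(unnormalized_density([(2.0, [1.0, 1.0], 0.0, 1)]))
```

Assignments that violate highly weighted rules receive a larger energy and therefore a lower density, which is why MAP inference prefers assignments that satisfy as many weighted rules as possible.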
Rules in DRaiL can be weighted (i.e. classifiers, soft constraints) or unweighted (i.e. hard constraints). The collection of all rules represents the global decision. Rules are transformed into linear inequalities, corresponding to their disjunctive form, and MAP inference is then defined as an integer linear program:
$$\arg\max_{y \in \{0,1\}^{n}} P(y \mid x) \equiv \arg\max_{y \in \{0,1\}^{n}} \sum_{\psi_r \in \Psi} w_r \, \psi_r(x_r, y_r) \quad \text{s.t.} \quad c(x_c, y_c) \le 0, \; \forall c \in C \tag{3}$$
where each rule grounding $r$, generated from template $t$, with input features $x_r$ and predicted variables $y_r$, defines the potential $\psi_r(x_r, y_r)$, added to the linear program with a weight $w_r$. DRaiL implements both exact and approximate inference to solve the MAP problem; in the latter case, the AD3 algorithm is used Martins_ad3.
In DRaiL, weights $w_r$ are learned using neural networks defined over parameter set $\theta$. Parameters can be learned locally, by training each rule independently, or globally, by using inference to ensure that the scoring functions for all rules result in a globally consistent decision. To train global models using large-margin estimation, DRaiL uses the structured hinge loss:
$$\max\Big(0, \; \max_{\hat{y} \in Y} \Big(\Delta(\hat{y}, y) + \sum_{\psi_r \in \Psi} \Phi_t(x_r, \hat{y}_r; \theta^{t})\Big) - \sum_{\psi_r \in \Psi} \Phi_t(x_r, y_r; \theta^{t})\Big) \tag{4}$$
where $\Phi_t$ represents the neural network associated with rule template $t$ and parameter set $\theta^{t}$, and $\Delta(\hat{y}, y)$ is the margin term measuring the difference between the two assignments. Here, $y$ corresponds to the gold assignments, and $\hat{y}$ corresponds to the prediction resulting from the MAP inference defined in Eq. 3. Note that alternative estimations are also supported. More details can be found in the original paper pacheco-goldwasser-2021-modeling.
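The structured hinge loss can be illustrated on a tiny output space where the loss-augmented maximization is done by brute force instead of with an ILP solver. In the sketch below, the function name `structured_hinge_loss`, the use of Hamming distance as the margin term, and the toy score table are all assumptions for illustration; hard constraints are ignored, and the scores stand in for the summed outputs of the rule networks.

```python
from itertools import product


def structured_hinge_loss(score, gold, label_sets):
    """max(0, max_y' [margin(y', gold) + score(y')] - score(gold)).

    score: function mapping a full assignment (tuple of labels) to a float
    gold: tuple with the gold label of each output variable
    label_sets: list with the candidate labels of each output variable
    The margin is the Hamming distance (assumption made for this sketch).
    """
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))

    best_augmented = max(
        hamming(candidate, gold) + score(candidate)
        for candidate in product(*label_sets)
    )
    return max(0.0, best_augmented - score(gold))


# Two output variables: a moral foundation polarity and an entity role polarity.
label_sets = [("care", "harm"), ("pos", "neg")]
table = {
    ("care", "pos"): 2.0,
    ("care", "neg"): 1.5,
    ("harm", "pos"): 0.0,
    ("harm", "neg"): 0.5,
}
print(structured_hinge_loss(table.get, ("care", "pos"), label_sets))
```

The loss is zero only when the gold assignment outscores every alternative by at least its margin, so training pushes the rule scores toward globally consistent decisions rather than toward per-rule accuracy alone.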
Appendix D Experimental Evaluation
D.1 Task-Adaptive Pretraining
We do task-adaptive pretraining for BERT gururangan-etal-2020-dont, and fine-tune it on a large number of unlabeled tweets. To select unlabeled tweets, we build a topic-specific lexicon of n-grams ($n \le 5$) from our training dataset based on Pointwise Mutual Information (PMI) scores church-hanks-1990-word. Namely, for an n-gram $w$ we calculate the PMI with label $l$ (e.g., topic) using the following formula:
$$PMI(w, l) = \log \frac{P(w \mid l)}{P(w)}$$
where $P(w \mid l)$ is computed by taking all tweets with label $l$ and computing $\frac{count(w)}{count(\text{all n-grams})}$. Similarly, $P(w)$ is computed by counting n-gram $w$ over the set of tweets with any label. To construct the lexicon, we rank n-grams for each label based on their PMI scores.
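The lexicon construction can be sketched in a few lines. The code below is a minimal illustration under stated assumptions: tweets are already tokenized, probabilities are plain relative frequencies over all n-grams up to the given length, and the names `ngrams` and `pmi_lexicon` are hypothetical. Any smoothing or frequency cut-offs used in the actual pipeline are not reproduced.

```python
import math
from collections import Counter


def ngrams(tokens, max_n=5):
    """Yield all n-grams of length 1..max_n as tuples."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def pmi_lexicon(labeled_tweets, max_n=5):
    """Rank n-grams per label by PMI(w, l) = log(P(w | l) / P(w)).

    labeled_tweets: list of (tokens, label) pairs
    Returns {label: [(ngram, pmi), ...]} sorted by decreasing PMI.
    """
    per_label = {}
    overall = Counter()
    for tokens, label in labeled_tweets:
        grams = list(ngrams(tokens, max_n))
        per_label.setdefault(label, Counter()).update(grams)
        overall.update(grams)
    total = sum(overall.values())

    lexicon = {}
    for label, counts in per_label.items():
        label_total = sum(counts.values())
        scores = {
            gram: math.log((count / label_total) / (overall[gram] / total))
            for gram, count in counts.items()
        }
        lexicon[label] = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return lexicon


# Toy corpus: "protect" occurs under both labels, "life" only under one.
toy = [(["protect", "life"], "abortion"), (["protect", "guns"], "guns")]
for label, ranked in pmi_lexicon(toy, max_n=1).items():
    print(label, ranked)
```

In the toy corpus, the n-gram shared by both labels gets a PMI of zero, while label-specific n-grams rank at the top, which is the behavior the lexicon relies on.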
We explore three pretraining objectives, described below. In all cases, models were initialized using BERT devlin-etal-2019-bert.
Masked Language Modeling: We randomly mask some of the tokens from the input, and predict the original vocabulary id of the masked word based on its context devlin-etal-2019-bert.
Whole Word Masking: Instead of masking randomly selected tokens, which may be sub-segments of words, we mask randomly selected words.
Moral Foundations Dictionary: We create a lexicon for each Moral Foundation from the dataset by johnson-goldwasser-2018-classification using the same PMI formula described above. We use the normalized PMI scores as a weight for each unigram, and assign a weight of 1 to unigrams in the Moral Foundation Dictionary (MFD) DVN/SJTRBI_2009. We score a tweet by summing the scores of the words that match the lexicon. We take the highest scoring moral foundation for each tweet, and fine-tune a moral foundation classifier using this weakly annotated data.
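The weak labeling step in the last objective can be sketched as follows. This is a minimal illustration: the function name `weak_label`, the toy weights, and the toy dictionary entries are hypothetical, and the decision to let an MFD match (weight 1) take precedence over a PMI weight for the same unigram is an assumption.

```python
def weak_label(tokens, pmi_weights, mfd):
    """Assign the highest scoring moral foundation to a tokenized tweet.

    pmi_weights: {foundation: {unigram: normalized PMI weight}}
    mfd: {foundation: set of dictionary unigrams}, each counted with weight 1
    Returns None when no lexicon entry matches (assumption: such tweets
    are left unlabeled).
    """
    scores = {}
    for foundation in set(pmi_weights) | set(mfd):
        weights = pmi_weights.get(foundation, {})
        dictionary = mfd.get(foundation, set())
        total = 0.0
        for token in tokens:
            if token in dictionary:
                total += 1.0
            elif token in weights:
                total += weights[token]
        scores[foundation] = total
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Toy lexicons with made-up weights.
pmi_weights = {"Care/Harm": {"protect": 0.6}, "Purity/Degradation": {"sacred": 0.4}}
mfd = {"Care/Harm": {"harm"}, "Purity/Degradation": {"sanctity"}}
print(weak_label(["protect", "sanctity"], pmi_weights, mfd))
print(weak_label(["hello"], pmi_weights, mfd))
```

The resulting labels are noisy by construction, which is why they are used only as a pretraining signal before fine-tuning on the annotated data.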
We evaluate these objectives by performing the pre-training stage on the unlabeled data, and fine-tuning the encoder for our base task of leveraging only text to predict moral foundations and entity roles. Results can be seen in Tab. 17.
|Whole Word Masking||54.73||66.44||62.18||68.29|
D.2 Details About the Baselines
Lexicon Matching: We label a tweet with the moral foundation with maximum score based on lexicon matching. We use the Moral Foundation lexicons created in Appendix D.1. If there is no lexicon matching for a tweet, we assign a moral foundation label to it randomly. We experiment with combining and not combining the Moral Foundations Dictionary (MFD) DVN/SJTRBI_2009.
Sequence Tagging: We use a bidirectional LSTM with a CRF layer on top to tag each entity in a tweet with a moral role label. We run two LSTMs over the tweet, one in the forward and one in the reverse direction, and concatenate the hidden states (50d) of the two directions at each time step to get an embedding (100d) of the token. Given that entity spans are known, we use the last token in each entity as the entity embedding. This embedding is then used for the CRF layer.
End-to-end Classifiers: For the classification of moral foundations using BiLSTM, we run two opposite directional LSTMs over the GloVe word embeddings pennington2014glove of all tokens of a tweet, concatenate the hidden states (150d) of both LSTMs to get the embedding of a token (300d), then average the embeddings of all tokens to get a final embedding of a tweet. Then we use this embedding to classify the tweet in the moral foundation classes using a fully connected layer that maps the embedding to a moral foundation class. For moral foundation role classification using BiLSTM, we repeat the same process for an entity text to get its representation using BiLSTM. Then we concatenate the tweet representation and the entity representation and pass it through a hidden layer to get a representation of size 300. Then we use this representation for classification of moral foundation roles using a fully connected layer that maps the representation to the moral foundation role classes. For BERT based models, we use a classifier on top of the [CLS] representation. For role classification, we pass an input of the form [CLS] [tweet] [SEP] [entity]. We use the default parameters of the BERT-base-uncased huggingface implementation.
Multitasking Based: We define a single BERT encoder, and a single ideology and topic embedding that is shared across the two tasks. The three representations are concatenated and task-specific classifiers are used on top of them. Then, the loss functions are added as . We set
. For topic and ideology embeddings, we use feed-forward computations with 100 hidden layers and ReLU activations. For BERT we use the same configuration as the end-to-end classifiers.
D.3 Hyper-parameter Tuning and Validation Set Performance
For the underlying BERT, we use the default parameters of the Hugging Face implementation (https://github.com/huggingface/transformers). Other parameters can be observed in Table 18 (top). The bottom part of Table 18 shows the validation performance during the learning of the best performing model.
|Task||Param||Search Space||Selected Value|
|Local (Base)||Learning Rate||5e-5, 2e-5, 1e-5, 1e-6||2e-5|
|Local (Base)||Batch size||64, 32||32|
|Local (Base)||Patience||3, 5, 10||10|
|Local (Soft Constr.)||Learning Rate||1e-3, 5e-3, 5e-2, 1e-2||5e-3|
|Local (Soft Constr.)||Batch size||64, 32||32|
|Local (Soft Constr.)||Patience||5, 10, 20||20|
|DRaiL Global||Learning Rate||5e-5, 2e-5, 1e-5, 1e-6||1e-6|
|DRaiL Global||Batch size||-||Full instance|
|DRaiL Global||Patience||3, 5, 10||10|
|+ Ideo + Issue||52.52||60.31||64.02||66.58|
D.4 Results per Class
The per-class classification results can be found in Table 19.
D.5 Run-time Analysis
All experiments were run on a 4-core Intel(R) Core(TM) i5-7400 CPU @ 3.00GHz machine with 64GB RAM and an NVIDIA GeForce GTX 1080 Ti 11GB GDDR5X GPU. Runtimes for our models can be found in Table 20.
|epochs p/It||sec p/It|
D.6 Entity Groups
D.6.1 Entity Groups For Abortion
Brett Kavanaugh: brett kavanaugh, kavanaugh, stop kavanaugh
Roe v Wade: roe v wade, commit roe, protect roe
Planned Parenthood: plan parenthood, stand pp, pp, ppfa, ppact
Affordable Care Act: aca, affordable health care
Title X: title x, family planning, protect x
Gag Rule: gag rule, global gag rule, domestic gag rule
Democrats: democrat, dem, house democrat
Republicans: republican, house gop, senate gop, gop, gop leader
Trump care: trump care
Reproductive Right: reproductive right, woman reproductive right, reproductive freedom, reproductive justice, woman reproductive freedom
Reproductive Health: reproductive health, woman reproductive health, reproductive health care, reproductive care, reproductive healthcare, reproductive health service, comprehensive reproductive health care, abortion care
SCOTUS: scotus, save scotus, supreme court, supreme court justice, supreme court decision
Life: human life, innocent life, stand life, unborn child, unborn child protection, unborn baby, unborn, baby
Born Alive: bear alive abortion, bear alive
Late Term Abortion: late term abortion
Late Term Abortion Ban: week abortion ban
Born Alive Act: bear alive act
Abortion Provider: abortion provider, abortion clinic, abortion industry
Hyde Amendment: hyde, hyde amendment, bold end hyde
Healthcare Decision: health care decision, healthcare decision
D.6.2 Entity Groups for the 2021 US Capitol Hill Storming Event
Congress: congress, th congress
POTUS: potus, president
Donald Trump: trump, donald trump, real donald trump, president real donald trump
American People: american people
Democracy: american democracy, democracy
Joe Biden: joe biden, biden, president elect
Amendment: amendment, th amendment
Brian Sicknick: brian sicknick, sicknick
Nancy Pelosi: pelosi, speaker pelosi, nancy pelosi
Jamie Raskin: raskin
Capitol: capitol, capitol building, capitol hill, nation capitol
Impeachment: impeachment, impeach president
Kamala Harris: kamala harris, vice president elect
Capitol Police: capitol police, police officer, law enforcement, law enforcement officer
Mike Pence: pence, vp pence, mike pence
Mitch McConnell: mitch mcconnell, mcconnell
GOP: house gop, gop leader, gop, republican
Domestic Terrorism: domestic terrorist, domestic terrorism
National Security: national security, national guard
Democrats: dem, democrat, house democrat
Violence: violence, violent insurrection, violent attack, violent mob
White Supremacist: white supremacist
Fair Election: fair election
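Groups like the ones listed above can be applied with a simple reverse lookup from a normalized mention to its group name. The sketch below copies three of the groups verbatim; the names `ENTITY_GROUPS`, `MENTION_TO_GROUP`, and `canonical_entity` are hypothetical, and the sketch assumes that mentions have already been lemmatized and stripped of hashtags in the same way as the listed surface forms.

```python
# A few groups copied from the lists above; the remaining groups follow the same pattern.
ENTITY_GROUPS = {
    "Planned Parenthood": ["plan parenthood", "stand pp", "pp", "ppfa", "ppact"],
    "Hyde Amendment": ["hyde", "hyde amendment", "bold end hyde"],
    "Mike Pence": ["pence", "vp pence", "mike pence"],
}

# Reverse index: surface form -> group name.
MENTION_TO_GROUP = {
    mention: group
    for group, mentions in ENTITY_GROUPS.items()
    for mention in mentions
}


def canonical_entity(mention):
    """Return the group of a normalized mention, or None if it is not grouped."""
    return MENTION_TO_GROUP.get(mention.strip().lower())


print(canonical_entity("PPFA"))
print(canonical_entity("vp pence"))
print(canonical_entity("some other entity"))
```

Since the same surface form can belong to differently named groups in the two topics (for example, "gop" is listed under Republicans for Abortion and under GOP for the Capitol event), it seems safer to keep one lookup table per topic.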
D.7 Human Evaluation on Test Data
Model Prediction Validation
We trained our model with all of our labeled data and used it to predict the moral foundations and entity roles of (tweet, entity) pairs in the new set. The validation set (randomly selected from train set) weighted F1 scores were and for moral foundations and roles, respectively. We validate our model’s prediction on the unseen dataset using human evaluation. We randomly sampled 50 tweets from each of the two test sets. This resulted in and (tweet, entity) pairs for Abortion and US Capitol, respectively. Note that one tweet may have entities. Then, we presented the predictions of moral foundations and entity roles to two graduate students and asked them if the prediction is correct or not. We found the Cohen’s Kappa cohen1960coefficient score between the annotators to be (moderate agreement) and (substantial agreement) in case of the moral foundations and entity roles, respectively. In case of a disagreement, we asked a third grad student to break the tie. The accuracy of the model for moral foundations was for each topic, while for roles it was and , for Abortion and US Capitol, respectively.
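For reference, Cohen’s Kappa between two annotators who each judge a prediction as correct or incorrect can be computed as in the sketch below. The function name `cohens_kappa` and the toy judgments are illustrative only and do not reproduce the annotation data.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's Kappa for two annotators labeling the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both annotators must rate the same non-empty item list")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(
        counts_a[label] * counts_b[label]
        for label in set(ratings_a) | set(ratings_b)
    ) / (n * n)
    if expected == 1:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Toy judgments: 1 = prediction judged correct, 0 = judged incorrect.
annotator_1 = [1, 1, 0, 0]
annotator_2 = [1, 0, 0, 0]
print(cohens_kappa(annotator_1, annotator_2))
```

Kappa corrects the raw agreement rate for the agreement expected by chance, so it can be well below the observed agreement when one judgment (for example, "correct") dominates.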
D.8 Distribution of MF Roles Assigned by the Model to ‘Women’ and ‘Life’
|(Predicted MF) Tweet||Comment|
|(CARE/HARM) The U.S. Senate is set to vote on commonsense legislation to protect unborn babies who can feel pain. Retweet if you Stand For Life!||MF prediction is ‘Care/Harm’, possibly because there is a notion of protecting babies. In MF role prediction, the model makes mistake when there are multiple mention of the same entity, possibly because of constraint but still assigns a positive role to ‘Life’, possibly because of constraint .|
|(LOYALTY/BETRAYAL) I will always, always, ALWAYS be proud to Stand 4 Life. I’m so grateful to @TXRightToLife for their support and pledge to never stop fighting for the unborn. Now, Texas, let’s get out and vote to #KeepTexasRed!||MF prediction is correct. In MF role prediction, the model makes mistake when there are multiple mention of the same entity, possibly because of constraint but still assigns a positive role to ‘Life’, possibly because of constraint .|
|(PURITY/DEGRADATION) Planned Parenthood is suing our state to expand their abortion-on-demand agenda. RT if you stand for life!||MF prediction is wrong. Still a positive role is assigned to ‘life’ and a negative role is assigned to ‘Planned Parenthood’, possibly because of constraint .|
|Moral Roles||% of time Assigned by the Model|
|Target of fairness/cheating||0.624|
|Target of care/harm||0.216|
|Failing authority over||0.076|
|Target of loyalty/betrayal||0.075|
|Target of purity/degradation||0.006|
|Entity providing care||0.001|
|Entity being loyal||0.001|
|Moral Roles||% of time Assigned by the Model|
|Target of purity/degradation||0.501|
|Target of care/harm||0.266|
|Target of loyalty/betrayal||0.151|
|Target of fairness/cheating||0.038|
|Failing authority over||0.018|
|Entity being loyal||0.008|
|Entity providing care||0.005|
|Entity preserving purity||0.004|
|Entity ensuring fairness||0.003|
|Justified authority over||0.002|
|Moral Foundations||%Used in Left||%Used in Right||%Predicted in Total|
|Morals||%Used in Left||%Used in Right||%Pred. in Total|
D.9 Qualitative Evaluation of MF Role Prediction for ‘Life’ in Republicans
Some tweets mentioning ‘Life’ by the Republicans and the predicted MF and MF roles are shown in Table 21.