A subfield of particular interest in network diffusion is influence maximization (IM), in which nodes and edges respectively represent individuals and connections. In its classical formulation, the goal is for the diffusion to reach the maximum number of nodes, while only disseminating the information to a few initial individuals, also called seeds.
Influence maximization is crucial in a variety of applications, including the adoption of new behavior in social networks [28, 20], viral marketing, propagation of information, disease spread [7, 26], and social recommendations.
However, finding the best seeds has been shown to be NP-hard. Numerous approximations and heuristics have been proposed. Several approaches identify the most influential nodes to maximize the total number of people influenced [24, 22]. Using network structure, they consider central nodes (highest degree, closeness, betweenness, etc.) as the most influential nodes. Tackling the task from a different angle, other work formulated the problem under the framework of discrete optimization; its results significantly outperform node selection heuristics and show that a provable approximation guarantee is obtainable. In this work, we adopt a fresh perspective: Adversarial Network Embeddings. We show that our method is competitive with the previous influence maximization state of the art, while significantly improving fairness.
Our objective is fair influence maximization, in which the resulting set of influenced nodes is diverse with respect to sensitive attributes such as age, gender, and race.
Adding fairness objectives to influence maximization yields an even harder task, which none of the previously mentioned techniques can tackle. However, these constraints are very relevant in today’s society. For instance, consider the spread of a job opening, a loan advertisement, or even news. We want to make sure that belonging to a minority does not affect whether or not we would see this job opportunity or this loan offer. Moreover, as discussed in prior work, receiving different types of news may be crucial to developing unbiased viewpoints.
Ensuring that everyone has a chance to see critical news, independently of the communities to which they belong, could be a stepping stone in the fight against fake news.
Fair influence maximization has been recently studied [3, 30, 12]. However, these methods suffer from a trade-off between fairness and the influence maximization objective. Using our Adversarial Network Embeddings, we achieve a highly diverse set of influenced nodes with respect to multiple attributes, while still achieving state-of-the-art influence maximization objectives on large synthetic and real-world networks.
1.1 Our Contributions
We highlight the following contributions:
We use an embedding approach to tackle influence maximization. We believe this is the first time embeddings have been used for fair influence maximization.
We introduce Adversarial Network Embeddings to address fair influence maximization. Using an autoencoder coupled with a discriminator in an adversarial setting, we obtain embeddings which are similarly distributed across sensitive attributes.
Our approach achieves state-of-the-art results for influence maximization (compared with the greedy approach, the previous state of the art) in experiments on synthetic and real-world datasets.
Our method also achieves high diversity in the context of fair influence maximization.
Our work provides a fresh tool that we hope could be helpful to other social network settings where fairness is relevant, such as fair clustering or fair node-level classification.
2 Related Work
Adversarial approaches to fairness were introduced recently [25, 1]. To our knowledge, only a few studies have considered fairness in the context of network diffusion. In this section, we review work on influence maximization, node embedding approaches, and recent investigations of fair influence maximization.
2.1 Influence Maximization
Influence maximization was first introduced as an algorithmic problem through a heuristic approach for finding an initial set of nodes that maximizes the number of further adopters. Over the years, extensive research has focused on cascading behavior, diffusion, and the spreading of ideas, information, or diseases, by identifying a set of initial nodes that maximizes influence through a network [23, 20, 28, 13].
Identifying individuals who are good initial seeds for spreading information has been studied in two ways: (i) finding the set of most central nodes based on structural properties of the network [22, 20]; or (ii) treating the problem as discrete optimization [15, 20, 6].
Influence maximization has also been studied under different social contagion models, such as the Linear Threshold (LT) and Independent Cascade (IC) models. This line of work showed that, although finding the optimal solution is NP-hard, submodularity of the influence function can be used to obtain provable approximation guarantees. Since then, a large body of work has studied various extensions [15, 11], one of which takes advantage of a network embedding approach.
2.2 Network Embedding
Learning a low-dimensional embedding of the nodes in the network is at the core of our proposed approach. Generally, the network embedding problem is to map nodes to a low-dimensional space such that the network structure can be reconstructed. Network embeddings have proven effective in classification and clustering problems [18, 10], and have attracted much attention from the machine learning and data mining communities [9, 18, 29]. Consequently, several methods have been proposed, based on random-walk models [21, 16], deep learning architectures, and graph neural networks. However, to our knowledge, no one has considered network embedding to address the problem of fair influence maximization.
2.3 Contemporary Works
Only a few recent works consider fairness in influence maximization. Recently, several works have promoted diversity in choosing the seeds [8, 2]. It has also been shown that users are sub-optimal in selecting their sources on social media, in the sense of receiving diverse information. However, these works do not consider a fairness criterion. The notions of individual fairness (similar individuals should be treated similarly) and group fairness (on average, members of different groups are treated similarly) for influence maximization were each proposed in a separate recent study.
These two studies are the most related to our work. However, they suffer from a trade-off between the objectives of influence maximization and fairness. This trade-off becomes worse as the number of sensitive attributes increases. Our method achieves state-of-the-art influence maximization results, even with multiple sensitive attributes, while also increasing fairness. We next present how our Adversarial Network Embeddings achieve these desired properties.
In the fair influence maximization problem, we aim to influence the maximum number of nodes, while ensuring that the fraction of influenced nodes is (approximately) equal across predefined groups (e.g., minorities). For ease of exposition, we assume in this section that there are only two groups, A and B, but the results extend naturally to more groups. We say a spreading process is fair if, by the end of it, the expected number of influenced nodes in each group, divided by the size of that group, is the same for A and B.
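Writing $I_A$ and $I_B$ for the expected numbers of influenced nodes in groups $A$ and $B$, and $|A|$, $|B|$ for the group sizes (these symbol names are assumptions made for illustration), the fairness condition described above can be sketched as:

```latex
\[
  \frac{I_A}{|A|} \;=\; \frac{I_B}{|B|}
\]
```

In practice, the equality is only expected to hold approximately.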
We now provide the intuition and motivation behind our embedding approach. If we pick as our set of initial seeds, let
be the probability that the set of nodes was influenced by the end of the spreading process. We also write (resp. ) for the set of nodes of which belong to (resp. to ), and the set of all subsets of a set . To compute the expected fraction of influenced nodes in , we sum over all possible disjoint infection patterns:
The probability that all nodes of are infected is given by marginalizing over all subsets of : . Therefore:
Reindexing, and dividing by :
This is where our Adversarial Network Embedding becomes useful. If we knew that for the sets of high probability mass ( the most likely sets of influenced nodes, with of highest values), we had , then we could claim that . Intuitively, by matching the distributions of nodes from and in the embedding space (to have for as many sets as possible), and picking seeds in densely populated areas (so as to put more probability mass where the above equality is verified), we design our framework, which we describe next.
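The intuition above can be sketched in symbols under assumed notation: $S$ is the seed set, $V$ the node set, $P_S(X)$ the probability that exactly the set $X \subseteq V$ is influenced when the process ends, and $X_A = X \cap A$, $X_B = X \cap B$. This is a simplified reading of the argument, not necessarily the exact derivation intended by the authors.

```latex
\begin{align*}
\frac{I_A}{|A|} &= \sum_{X \subseteq V} P_S(X)\,\frac{|X_A|}{|A|},
\qquad
\frac{I_B}{|B|} = \sum_{X \subseteq V} P_S(X)\,\frac{|X_B|}{|B|}, \\
\frac{I_A}{|A|} - \frac{I_B}{|B|}
  &= \sum_{X \subseteq V} P_S(X)
     \left( \frac{|X_A|}{|A|} - \frac{|X_B|}{|B|} \right).
\end{align*}
```

If the bracketed term vanishes on a family of sets carrying probability mass $1 - \delta$, the remaining terms contribute at most $\delta$ in absolute value, because each bracket lies in $[-1, 1]$. The two expected fractions then differ by at most $\delta$, which is why concentrating probability mass on balanced sets helps.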
To achieve the desired adversarial network embedding for an input network with two communities, inspired by Generative Adversarial Networks, we design an adversarial setting in which the embedder plays against a discriminator. In our setting, the discriminator distinguishes between the embeddings of the two communities. Concurrently, the embedder tries to generate embeddings that the discriminator cannot distinguish. In other words, the discriminator forces the embedder to generate embeddings for the two communities that come from distributions which are as similar as possible. We train the embedder and the discriminator with the following coupled loss functions:
In these losses, the inputs are the vector representations of the nodes in each community, together with their corresponding representations in the embedding space. The embedder has its own function and associated loss function, and the losses also involve the reconstruction loss of a standard auto-encoder. Similarly, the discriminator has its own function and loss. The discriminator function computes the distance between the distribution of the given embeddings and the initial distribution of the embeddings of nodes of community A. Figure 1 depicts the whole process of our method graphically. Details are shown in Algorithm 1. We can further extend the above setting to address fairness in influence maximization with respect to multiple sensitive attributes. In this case, we want our embeddings to have similar distributions for each value of our different sensitive attributes. To do so, we add one discriminator and one extra term to the embedder loss per sensitive attribute. Then, during the adversarial training, the embedder plays against a set of discriminators (one discriminator per attribute). For instance, with two sensitive attributes, the embedder and discriminator loss functions would be as follows:
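One common way to couple such objectives is sketched below. It is an illustration under stated assumptions, not the authors' loss: the discriminator is taken to be a binary classifier trained with cross-entropy (the text instead describes a distance between embedding distributions), and the embedder is assumed to subtract a weighted sum of the discriminator losses, one per sensitive attribute. The function names and the `weight` parameter are hypothetical.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp so that log() stays finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def discriminator_loss(probs, group_labels):
    """Assumed discriminator objective: predict each embedding's group (1 or 0)."""
    return binary_cross_entropy(probs, group_labels)


def embedder_loss(reconstruction_loss, discriminator_losses, weight=1.0):
    """Assumed embedder objective: reconstruct well while making every
    discriminator (one per sensitive attribute) perform badly."""
    return reconstruction_loss - weight * sum(discriminator_losses)


# Discriminator outputs for four embeddings whose true groups are 1, 1, 0, 0.
confident = discriminator_loss([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
confused = discriminator_loss([0.5, 0.5, 0.5, 0.5], [1, 1, 0, 0])

# With two sensitive attributes, the embedder receives two adversarial terms.
loss_when_separable = embedder_loss(1.0, [confident, confident])
loss_when_mixed = embedder_loss(1.0, [confused, confused])
```

Under these assumptions the embedder's loss is lower when the discriminators are confused, which is the behaviour the adversarial training is meant to encourage.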
After reaching a favorable low-dimensional representation of the network, where nodes of different communities are distributed similarly in the embedding space, the final step is to choose the proper seeds. We can safely perform the selection in the embedding space since our embedding method is invertible (the original graph can be reconstructed from the embeddings). For the case of one sensitive attribute, our goal is to choose an initial set of influential seeds. We examine the following two approaches:
(i) Normal Selection: This applies k-means in the embedding space, with as many clusters as seeds to select, and takes the resulting cluster centroids as initial seeds.
(ii) Fair Selection: In normal selection (above), depending on the network structure, seeds might come from just one community, leading to disparity. To tackle this concern, we propose an alternative method, introduced in Algorithm 2. Assume we want to select a given number of nodes as the initially influenced nodes. We start by running k-means to group all nodes in the embedding space into clusters. Then, in each cluster, we select the nearest neighbours of the centroid and count how many of them are members of community A and how many of community B. We also divide the nodes of each cluster into two sub-clusters, containing the nodes belonging to A and to B. Finally, we run k-means on each of these sub-clusters, using the two counts respectively, and take the resulting centroids as the seeds selected from each of the initial clusters, which gives us the desired number of seeds from the whole network.
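The two selection strategies can be sketched in plain Python as follows. This is a minimal illustration, not Algorithm 2 itself: the k-means routine is a basic Lloyd iteration, seeds are taken to be the nodes nearest to each centroid, and the per-group quota inside a cluster is assumed to be proportional to the group mix among the centroid's nearest neighbours. The parameters `seeds_per_cluster` and `n_neighbors` are hypothetical names.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, cluster index of each point)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k entries of the list
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
    return centroids, assign


def nearest(ids, emb, center, n=1):
    """The n node ids (taken from ids) whose embeddings lie closest to center."""
    return sorted(ids, key=lambda i: sq_dist(emb[i], center))[:n]


def normal_selection(emb, k):
    """One seed per cluster: the node nearest to each k-means centroid."""
    ids = list(emb)
    centroids, assign = kmeans([emb[i] for i in ids], k)
    seeds = []
    for c, center in enumerate(centroids):
        members = [i for i, a in zip(ids, assign) if a == c]
        if members:
            seeds.append(nearest(members, emb, center)[0])
    return seeds


def fair_selection(emb, group, k, seeds_per_cluster, n_neighbors):
    """Split each cluster's seed budget between groups A and B (assumed rule:
    proportional to the group mix among the centroid's nearest neighbours)."""
    ids = list(emb)
    centroids, assign = kmeans([emb[i] for i in ids], k)
    seeds = []
    for c, center in enumerate(centroids):
        members = [i for i, a in zip(ids, assign) if a == c]
        if not members:
            continue
        near = nearest(members, emb, center, n_neighbors)
        quota_a = round(seeds_per_cluster * sum(group[i] == "A" for i in near) / len(near))
        quotas = {"A": quota_a, "B": seeds_per_cluster - quota_a}
        for g, quota in quotas.items():
            sub = [i for i in members if group[i] == g]
            m = min(quota, len(sub))
            if m == 0:
                continue
            sub_centroids, sub_assign = kmeans([emb[i] for i in sub], m)
            for sc, sub_center in enumerate(sub_centroids):
                sub_members = [i for i, a in zip(sub, sub_assign) if a == sc]
                if sub_members:
                    seeds.append(nearest(sub_members, emb, sub_center)[0])
    return seeds


# Toy embedding: six locations, each holding one node of group A and one of B.
locations = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
emb = {i: locations[i // 2] for i in range(12)}
group = {i: "A" if i % 2 == 0 else "B" for i in range(12)}
fair_seeds = fair_selection(emb, group, k=2, seeds_per_cluster=2, n_neighbors=4)
normal_seeds = normal_selection(emb, k=2)
```

Restricting the nearest-node search to the members of each cluster keeps the selected seeds distinct, which a search over all nodes would not guarantee.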
We now show the proposed method’s effectiveness using a small synthetic dataset. Figure 1 illustrates the high performance and efficiency of the system in re-creating the network using our model. A standard auto-encoder generates a network where the two communities are completely distinguishable, which is representative of completely different distributions. However, with our Adversarial Graph Embeddings, the two distributions mostly overlap, and it is hardly possible to separate the two communities. The results are even more striking in Figure 2, which compares the embedding space of our proposed method with that of a standard auto-encoder for a large synthetic dataset. It can be seen that, using our algorithm, nodes from different communities show similar distributions in the embedding space.
In Algorithm 2, the function ClustersCenters() gives the set of centroids obtained when performing k-means on the given space, and ClustersPoints() returns the corresponding clusters of points. Finally, KNN() outputs the nearest neighbours of a given point in a given space.
We evaluate the performance of our approach on real and synthetic data. For code and details of parameter tuning, please refer to: https://github.com/Ahmadreza401/fair_influen_max
4.1 Real Dataset
Dataset Description: The real dataset is the Rice University Facebook dataset, collected in prior work, which represents the friendship relations between students of Rice University. The friendship graph is undirected and has 1205 nodes with 42443 links between them. Nodes in this graph carry information about students such as student id, age (a number between 18 and 22), and major.
Experimental Setup: A sub-graph of the Rice Facebook dataset is used in the experiments, obtained by excluding nodes whose age attribute is greater than 20. We define two communities in the sub-graph. Nodes with age 18 or 19 constitute group A, and the rest of the nodes form group B. Group A has 97 nodes with 513 intra-connections, while group B has 344 nodes with 7441 intra-connections. There are 1779 inter-connections between nodes of the two communities. We assume an activation probability of 0.01 for every link, which is used in the independent cascade model for information propagation.
4.2 Synthetic Dataset
Dataset Description: The synthetic dataset is an undirected graph where each node belongs to one of two groups, A and B. The sizes of the groups are set by a ratio parameter, which fixes the fraction of nodes that belong to group A; the remaining nodes belong to group B. Nodes are connected based on intra-group connection probabilities (one per group) and an inter-group connection probability. To connect two nodes that belong to the same group (say, A), we perform a Bernoulli trial with that group's intra-group probability and connect the two nodes if the outcome of the trial is 1. Similarly, two nodes that belong to two different groups are connected only if a Bernoulli trial with the inter-group probability is successful. As for the real dataset, there is an activation probability associated with every link in the network. Figure 2 shows that the distribution of our Adversarial Graph Embeddings is extremely similar across both groups.
Experimental Setup: In our experiments, we set the ratio so as to give 150 nodes in A and 350 nodes in B. We use the same intra-group connection probability for both groups. To ensure two almost-separated communities, we set the inter-group connection probability to be less than the intra-group probability. Group A has 134 intra-group connections, while group B has 2843. There are also 129 inter-group connections between the nodes of the two groups. The activation probability is the same for all links in the network. We tried our method on a range of synthetic datasets, achieving qualitatively similar results; here we include only the results for the above settings.
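The generation procedure described above can be sketched as a two-group random graph sampler. This is a minimal illustration: the total of 500 nodes and the ratio 0.3 are chosen only so that the groups have 150 and 350 nodes, and the connection probabilities `p_intra` and `p_inter` are placeholder values, not the settings used in the experiments.

```python
import random
from itertools import combinations


def two_group_graph(n, ratio, p_intra, p_inter, seed=0):
    """Sample an undirected two-group graph with one Bernoulli trial per node pair."""
    rng = random.Random(seed)
    n_a = round(ratio * n)  # the first n_a nodes form group A, the rest group B
    group = {v: ("A" if v < n_a else "B") for v in range(n)}
    edges = []
    for u, v in combinations(range(n), 2):
        # Same-group pairs use the intra-group probability, mixed pairs the inter-group one.
        p = p_intra if group[u] == group[v] else p_inter
        if rng.random() < p:
            edges.append((u, v))
    return group, edges


# Placeholder probabilities, with inter-group links rarer than intra-group links.
group, edges = two_group_graph(500, 0.3, p_intra=0.05, p_inter=0.002)
```

A single `p_intra` is used here because the setup applies the same intra-group probability to both groups; separate per-group values would only require a lookup keyed by group.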
This section presents the results for our proposed method along with those of two state-of-the-art methods, Greedy and Tsang et al., on synthetic and real datasets. Our experiments follow a two-step process. First, a number of seeds are picked from the input network using our proposed method and the baseline methods. For the seed selection, we utilize the Fair Selection approach, as it proves to outperform the Normal Selection method. The selected seeds are then passed to an independent cascade model to compute the final influenced nodes. The results reported show the total fraction of nodes influenced, the fraction of nodes influenced from each community, and the difference between the fractions of nodes influenced from the two communities. Figures 3 and 4 show the results on the real and synthetic datasets, respectively. In the following, we discuss the technical aspects of the methods that appear in the results and explain the results in more detail. We conclude the results section with the more general case of two sensitive attributes (Figure 5).
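The evaluation step can be sketched as a Monte Carlo simulation of the independent cascade model. This is a minimal illustration with hypothetical function names; the number of runs and the averaging scheme are assumptions, and only the per-link activation probability of 0.01 comes from the setup described for the real dataset.

```python
import random
from collections import defaultdict


def independent_cascade(edges, seeds, p, rng):
    """One run of the independent cascade model on an undirected graph."""
    neighbors = defaultdict(list)
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    active = set(seeds)
    frontier = list(active)
    while frontier:
        newly_active = []
        for u in frontier:
            for v in neighbors[u]:
                # A newly active node gets a single chance to activate each inactive neighbour.
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return active


def influence_report(edges, group, seeds, p=0.01, runs=200, seed=0):
    """Average influenced fractions overall and per group, plus the gap between groups."""
    rng = random.Random(seed)
    counts = {"A": 0, "B": 0}
    for _ in range(runs):
        for v in independent_cascade(edges, seeds, p, rng):
            counts[group[v]] += 1
    sizes = {g: sum(1 for x in group.values() if x == g) for g in counts}
    frac = {g: counts[g] / (runs * sizes[g]) for g in counts}
    total = sum(counts.values()) / (runs * len(group))
    return {"total": total, "A": frac["A"], "B": frac["B"], "gap": abs(frac["A"] - frac["B"])}


# Toy path graph 0-1-2-3 with two nodes per group.
toy_edges = [(0, 1), (1, 2), (2, 3)]
toy_group = {0: "A", 1: "A", 2: "B", 3: "B"}
report = influence_report(toy_edges, toy_group, seeds=[0], p=0.01)
```

The `gap` entry corresponds to the reported difference between the fractions of nodes influenced in the two communities.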
Normal Embedding: Every node of the input network is first described by its row in the adjacency matrix.
The Normal Embedding method learns a low-dimensional representation for each node using an auto-encoder. The auto-encoder ensures that the low-dimensional representation is informative enough to reconstruct the adjacency list by imposing a cross-entropy loss that penalizes the model for any dissimilarity between the main and reconstructed adjacency list.
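The input representation and reconstruction penalty can be sketched as follows. This is a minimal illustration with hypothetical function names: each node is described by its row of the adjacency matrix, and the loss is assumed to be the mean binary cross-entropy between that row and the link probabilities produced by the decoder. The encoder and decoder themselves are not shown.

```python
import math


def adjacency_rows(n, edges):
    """Describe each node of an undirected graph by its 0/1 row of the adjacency matrix."""
    rows = [[0] * n for _ in range(n)]
    for u, v in edges:
        rows[u][v] = 1
        rows[v][u] = 1
    return rows


def reconstruction_loss(row, predicted, eps=1e-12):
    """Mean binary cross-entropy between an adjacency row and predicted link probabilities."""
    total = 0.0
    for a, p in zip(row, predicted):
        p = min(max(p, eps), 1.0 - eps)  # clamp so that log() stays finite
        total -= a * math.log(p) + (1 - a) * math.log(1.0 - p)
    return total / len(row)


rows = adjacency_rows(3, [(0, 1)])
# A decoder that reproduces the row exactly is barely penalised,
# while an uninformative decoder that outputs 0.5 everywhere is penalised more.
exact = reconstruction_loss(rows[0], [0.0, 1.0, 0.0])
uninformative = reconstruction_loss(rows[0], [0.5, 0.5, 0.5])
```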
After obtaining the embedding, we cluster the points in the embedding space into four clusters (we tested different numbers of clusters and observed that four performs best in terms of fairness) and pick the seeds that are nearest to the cluster centers. This strategy helps to pick nodes with different properties that are likely to influence a good fraction of nodes in the network. The results show that Normal Embedding influences a comparable fraction of the nodes in both the real (Figure 3(a)) and synthetic (Figure 4(a)) datasets, close to the fraction influenced by the Greedy method (the state-of-the-art classical IM solution). However, observe that both Normal Embedding and Greedy are very biased, influencing more nodes from the larger community. In both datasets, group B is larger and, as we see in Figure 3(b, d) and Figure 4(b, d), the majority of nodes influenced by the Normal Embedding and Greedy methods belong to group B.
Fair Embedding: To address fairness concerns in Normal Embedding, we adversarially train the auto-encoder with a discriminator. Before adversarial training, the discriminator is pre-trained to discriminate between embeddings of the two communities. Through the adversarial training, the auto-encoder tries to learn embeddings for nodes of different groups that are indistinguishable by the discriminator. Using this method, the embeddings for nodes of group A will have a similar distribution to those of group B.
The adversarial training is followed by a selection algorithm that picks the seeds from the adversarially trained embedding. The selection algorithms are discussed in full detail in the methodology section.
The results show that the Fair Embedding method boosts the fraction of nodes influenced from the smaller group in both the real and synthetic datasets (Figure 3(c, d) and Figure 4(c, d)). This happens while the total fraction of nodes influenced remains similar (Figure 3(a) and Figure 4(a)); Fair Embedding only marginally reduces the fraction of nodes influenced from the larger group (see Figure 3(b) and Figure 4(b)). Overall, the fractions of nodes influenced from the two communities are close to one another (Figure 3(c, d) and Figure 4(c, d)), improving fairness compared to the Normal Embedding and Greedy methods. More importantly, on the real dataset, our method significantly outperforms the baselines both in maximizing the number of people reached and in reducing disparity (Figure 3). On the synthetic dataset, our results appear very competitive with Tsang et al. in terms of fairness, while we outperform Tsang et al. in terms of the total number of people influenced and the fraction of people influenced from both the A and B communities.
Two sensitive attributes: Figure 5(a) shows the results for the case of two sensitive attributes. As shown, our method outperforms Greedy in terms of fairness with respect to both sensitive attributes. Our method also performs comparably to Tsang et al. in terms of fairness, while being more efficient in terms of running time (see Figure 5(b)).
In this paper, we proposed a new method to tackle the problem of fair influence maximization. We observed that existing algorithmic methods for influence maximization usually lead to considerable disparity. While there usually exists a trade-off between fairness objectives and maximization objectives, our proposed approach achieves both: experimental results on synthetic and real-world datasets demonstrate that our method is competitive with the prior state of the art in maximizing the total number of influenced people, while also significantly decreasing disparity.
While we demonstrated the effectiveness of our Adversarial Network Embeddings in the context of fair influence maximization, we believe our approach could be used in future work to tackle other fairness tasks over social networks such as fair clustering or fair node-level classification.
AW acknowledges support from the David MacKay Newton Research Fellowship at Darwin College, The Alan Turing Institute under EPSRC grant EP/N510129/1 & TU/B/000074, and the Leverhulme Trust via the CFI.
- One-network adversarial fairness. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, pp. 2412–2420.
- Learning optimal and fair decision trees for non-discriminative decision-making. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, pp. 1418–1426.
- (2019) On the fairness of time-critical influence maximization in social networks. ArXiv abs/1905.06618.
- (2016) On the efficiency of the information networks in social media. In Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, pp. 83–92.
- (2018) Purple feed: identifying high consensus news posts on social media. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pp. 10–16.
- (2013) Revenue maximization in social networks through discounting. Social Network Analysis and Mining 3 (4), pp. 1249–1262.
- (2013) The diffusion of microfinance. Science 341 (6144).
- (2018) Diversity constraints in public housing allocation. In AAMAS, pp. 973–981.
- (2018) A comprehensive survey of graph embedding: problems, techniques, and applications. IEEE Transactions on Knowledge and Data Engineering 30 (9), pp. 1616–1637.
- (2015) Grarep: learning graph representations with global structural information. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, pp. 891–900.
- (2007) Maximizing influence in a competitive social network: a follower’s perspective. In EC, pp. 351–360.
- (2019) Gaps in information access in social networks. In WWW.
- (2010) Inferring networks of diffusion and influence. In KDD.
- (2014) Generative adversarial nets. In Advances in Neural Information Processing Systems, pp. 2672–2680.
- (2013) On minimizing budget and time in influence propagation over social networks. Social Network Analysis and Mining 3 (2), pp. 179–192.
- (2016) Node2vec: scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 855–864.
- (2017) Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems, pp. 1024–1034.
- (2017) Representation learning on graphs: methods and applications. arXiv preprint arXiv:1709.05584.
- (2020) Influence maximization across heterogeneous interconnected networks based on deep learning. Expert Systems with Applications 140, pp. 112905.
- (2003) Maximizing the spread of influence through a social network. In KDD.
- (2019) SimNet: similarity-based network embeddings with mean commute time. PloS one 14 (8), pp. e0221172.
- (2013) Identifying high betweenness centrality nodes in large social networks. Social Network Analysis and Mining 3 (4), pp. 899–914.
- (2007) Cost-effective outbreak detection in networks. In KDD, pp. 420–429.
- (2018) Influence maximization on social graphs: a survey. TKDE 30 (10), pp. 1852–1872.
- (2018) Learning adversarially fair and transferable representations. In International Conference on Machine Learning, pp. 3384–3393.
- (2012) Immunizing complex networks with limited budget. EPL (Europhysics Letters) 98 (3), pp. 38004.
- (2010) You are who you know: inferring user profiles in online social networks. In WSDM, pp. 251–260.
- (2002) Mining knowledge-sharing sites for viral marketing. In KDD.
- (2015) Line: large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web, pp. 1067–1077.
- (2019) Group-fairness in influence maximization. arXiv preprint arXiv:1903.00967.
- (2016) Structural deep network embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1225–1234.
- (2012) Exploring social influence for recommendation: a generative model approach. In SIGIR, pp. 671–680.