1 Introduction
The growing popularity of social networks like Instagram and Facebook, and their ability to propagate ideas and information rapidly, has made Influence Maximization (IM) an eye-catching task. The IM problem in its classical form is to identify the top influential nodes of a social network, i.e., those that can influence (directly or indirectly) the largest number of nodes in the network [1]. Viral marketing is a typical application of IM: a company selects a seed set of customers and activates them using free products, hoping that product adoption will propagate through the network via the word-of-mouth effect [2].
Kempe et al. [1] were the first to formulate the IM problem as a discrete stochastic optimization task, presenting the Independent Cascade (IC) and Linear Threshold (LT) models for influence diffusion. These stochastic diffusion models define how influence starts from the initial nodes and spreads through the network. They proved that under these two models the IM problem is NP-hard, and proposed a greedy seed selection algorithm that guarantees an approximation of the optimal solution.
In the real world, it is common for two or more companies to compete in the same market, which means that the classical IM problem needs a competitive extension [3]. In Competitive Influence Maximization (CIM), several companies propagate their products or ideas simultaneously in the network, and the goal of each company is to activate the largest number of nodes and defeat its opponents. Based on the assumptions and scenarios considered, previous studies can be categorized into two subcategories:

Studies that focus on the problem of which nodes to select from the network given the existence of opponents. In these studies, a competitive extension of the IC or LT model is used for the diffusion dynamics. Two scenarios are prominent among these studies. The first is the follower's perspective scenario [3, 4, 5, 6, 7], which supposes that seed selection happens in turns and solves the problem for the last player to commit to a seed set; this player thus has the advantage of knowing the other opponents' seed sets. The second scenario considers that seed selection happens simultaneously, and its main objective is to propose a framework for selecting the best method from a set of available seed selection algorithms [8, 9, 10, 11].

Studies that address the budget allocation scenario [12, 13, 14, 15]. In this scenario, companies can allocate different amounts of their budget to a node in the network, and nodes prefer the product of the company that has offered the higher amount. Companies therefore compete with each other through the amount of budget they allocate to each node in the network. The common assumption in all these studies is that the amount of budget allocated to a node is a real value. Also, the voter model is applied for the influence dynamics.
The seed selection problem (the first category) has long been studied in the literature, whereas only a few works have analyzed the budget allocation scenario (the second category). We started our work by conducting an in-depth analysis of the seed selection studies (Section 4). We observed that among these studies, the scenario in which seed selection happens simultaneously is the more realistic one. However, even this scenario is still incomplete. As we will discuss, in this scenario the most promising choice for the parties is one of the effective non-competitive seed selection methods, and these methods largely agree on which nodes to select. The result of this scenario would therefore be a state where the parties target the same nodes of the network, which shows that this modeling is still incomplete.
The studies in the budget allocation scenario have the defect that the considered action space is continuous, meaning that the amount of budget allocated to a node is a real value. In reality, we often see offers like 10% or 15% discounts; offers like 12.22% or 12.77% are very rare. Also, a common realization of viral marketing is that companies offer free products with different values to influential customers. In this case, a company must decide which product to offer to which influential customer, and again the action space is discrete. These observations indicate that the budget allocated to a node should be a discrete value instead of a real one. Furthermore, when the action space is continuous, the whole network can be targeted by allocating a very small fraction of the budget to each node; this consideration, again, is far from reality.
The observations from analyzing the seed selection studies, together with the fact that the action space in the budget allocation scenario tends to be discrete, motivated us to integrate seed selection and budget allocation into a new scenario for the CIM problem. In our proposed scenario, the parties first identify the influential nodes of the network. Then, they compete over only these nodes, not the whole network, through the value of the budget they allocate to each influential node. The considered action space is discrete. This modeling of the problem addresses the issues discussed above. Figure 1 demonstrates a simple example of the proposed scenario. Both parties first recognize that nodes 1, 2, and 3 are the most influential nodes in the given social network. They have three units of budget and compete with each other over these influential nodes. Player 1 decides to allocate one unit of budget (a package that costs one unit of budget) to each of these influential nodes, whereas player 2 decides to allocate two units to node 1 and one unit to node 2. In this case, node 1, for example, will prefer player 2, who has offered a package with a higher value.
We use game theory to analyze the scenario and calculate the Nash equilibrium with the Double Oracle algorithm [16]. This algorithm is specially designed for two-player zero-sum games with large action spaces like ours. For the diffusion dynamics, we apply the competitive Independent Cascade (competitive-IC) model. Estimating the payoff under this diffusion model is very challenging: it is proven that calculating the influence spread under the IC model is #P-hard [17]. To tackle this difficulty, we design a novel method that assigns an influential value to nodes based on the concept of Reverse Reachable (RR) sets [18]. Our method is also applicable under the LT and Triggering-based diffusion models, since the concept of RR-sets generalizes to these models [19]. Our contributions are as follows:

We propose the two-phase budget allocation scenario, in which seed selection and budget allocation are integrated. This scenario improves previous works on seed selection by shifting the focus of the problem from which nodes to select to convincing influential nodes to act as seeds (we present a strong motivation for this change in Section 4). Also, the budget allocation scenario is enhanced in the following ways: (1) the competition is restricted to the influential nodes instead of the whole network; (2) the action space is discrete; (3) the scenario is extended to the IC, LT, and Triggering-based diffusion models.

We develop a novel competition-aware budget allocation framework for the CIM problem. In particular, we design a new payoff estimation method and a best response oracle that significantly improve the efficiency of our framework.
This paper is organized as follows. Section 2 reviews the related work. Section 3 describes some preliminaries. Section 4 presents our discussion of the seed selection studies and why they are unrealistic. The proposed scenario and the proposed framework are described in Sections 5 and 6. Experiments are reported in Section 7, and we conclude our work in Section 8.
2 Related Work
Carnes et al. [4] and Bharathi et al. [3] are among the first to have addressed the Competitive Influence Maximization (CIM) problem. They provided extensions of the Independent Cascade (IC) model for the competitive environment and then analyzed the problem given the opponent's seed set, i.e., from the follower's perspective. They similarly showed that under their diffusion models, the greedy algorithm provides an approximation guarantee for the second player (the last player). Bharathi et al. further studied the best response strategy for the first player in very simple cases, such as when the graph consists of directed lines.
Lin et al. [8] considered a scenario in which seed selection happens in a multi-round manner and proposed a framework based on Q-learning to learn the optimal strategy for the parties. In their framework, actions are single-party influence maximization algorithms, diffusion happens according to the competitive Linear Threshold (competitive-LT) model, and the reward is calculated in the last round as the difference between the numbers of nodes activated by the parties. Their framework is applicable whether the opponent's strategy is known or unknown. Li et al. [9] considered a more realistic scenario where the parties select their seed sets simultaneously. They modeled the problem as a game with seed selection algorithms as actions and the expected influence as payoffs; diffusion happens according to the competitive-IC model. These two works do not propose a new seed selection strategy. Instead, they propose a framework for selecting the best strategy from a set of available seed selection algorithms. Their solution is more practical, since they do not assume any information about the seed set of the opponent.
In [5], a new model based on the competitive-LT model, in which each node can deliberate on the incoming influence, and an influence maximization algorithm based on community detection were proposed. In [6], time limitations and time delays in the propagation process were taken into account, and the competitive-IC model with meeting events was proposed. In [7], given the opponents' seed sets, the authors select a seed set whose spread passes a certain threshold at minimum cost. All these works solve the problem under the assumption that the opponent's seed set is given (the follower's perspective).
In [20], a two-player zero-sum extensive-form game is proposed that simulates the idea of the CIM problem. The players have an equal number of tokens, and in each stage of the game, each player chooses one node of the graph to place a token on. When the number of tokens on a node reaches a threshold, the node becomes activated and spreads its tokens to its neighbors. Alpha-Beta pruning and Monte Carlo tree search are used to find the optimal strategy for the players.
In [10] and [11], the Q-learning based framework proposed in [8] is improved in the following ways. In [10], transfer learning is applied to avoid retraining the Q-learning method when the social network changes. In [11], the factor of time is integrated into the competition, and a nested Q-learning algorithm is proposed.

Masucci and Silva [12] were the first to introduce the budget allocation scenario. They analyzed the case of two competitors and used game theory, in particular Colonel Blotto games, to calculate the Nash equilibrium. They used the voter model with discrete states for their diffusion dynamics. Later, in [13], they extended the problem to the case of more than two players and presented the voter model with ranking scores. In both works, the considered action space is continuous, meaning that any fraction of the budget can be allocated to a node, and all players have an equal amount of budget.
Varma et al. [14] extended the voter model to the case where nodes have continuous states representing their opinions toward the players, and each player's objective is to maximize the overall opinion in the network. They mainly considered the case of two players; the action space is again continuous, but the players' budgets may differ. They showed that under their proposed model, the game has a pure Nash equilibrium. In [15], they further extended their work to the case where marketing campaigns are repeated.
3 Preliminaries
3.1 Competitive Independent Cascade Model
The competitive Independent Cascade (competitive-IC) model proposed in [3] is a widely accepted diffusion model for the competitive environment. A directed graph $G = (V, E)$ is given, and an activation probability $p_{uv}$ is assigned to each edge $(u, v)$. Nodes have two states: they are either free (not activated) or activated, and in the latter case the node's state records the player for whom it is activated. Let $S_i$ denote the seed set of player $i$. Given the seed sets, influence propagates in the network through a progressive process as follows:
In the first step, the nodes in each seed set $S_i$ become activated, and their states are set to player $i$.

In step $t$, in a random order, the nodes activated in step $t-1$ try to activate their free neighbors. An activation attempt along edge $(u, v)$ succeeds with probability $p_{uv}$, and a newly activated node takes the state of the node that activated it.

The process terminates when no new activation occurs.
According to this process, each newly activated node has only a single chance to activate its free neighbors, and active nodes never change state. The influence spread of player $i$, denoted $\sigma_i$, is the number of nodes activated for player $i$ when diffusion terminates.
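The propagation process above can be sketched in a few lines of Python. This is a minimal simulation under our own graph representation; the function and variable names are illustrative assumptions, not taken from the paper's algorithms:

```python
import random

def competitive_ic(out_neighbors, prob, seeds):
    """One run of the competitive-IC model.

    out_neighbors[u] lists nodes v with an edge u -> v, prob[(u, v)] is the
    edge's activation probability, and seeds holds one seed set per player.
    Returns a dict mapping every activated node to the player that owns it.
    """
    owner = {}
    frontier = []
    for player, seed_set in enumerate(seeds):
        for v in seed_set:
            owner[v] = player            # seeds start activated for their player
            frontier.append(v)
    while frontier:                      # terminates when no new activation occurs
        random.shuffle(frontier)         # nodes of a step act in random order
        nxt = []
        for u in frontier:
            for v in out_neighbors.get(u, []):
                # a single activation attempt per edge, on free nodes only
                if v not in owner and random.random() < prob[(u, v)]:
                    owner[v] = owner[u]  # take the state of the activator
                    nxt.append(v)
        frontier = nxt
    return owner
```

The spread of player $i$ is then the number of nodes mapped to $i$; averaging it over many runs gives a Monte Carlo estimate of the expected spread.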
3.2 Reverse Reachable Sets
Reverse Reachable (RR) sets [18] are a proven method for approximating the influence spread while avoiding the limitations of the greedy algorithm. In this section, we first describe this concept in the non-competitive environment and then describe how it is extended to the competitive problem when the opponent's seed set is given.
3.2.1 RR-sets in the Non-competitive Problem
Random RR-sets under the classical IC model are defined as follows [18, 19]:
Let $g$ be a graph in which diffusion is deterministic, meaning that if there is a path from a node $u$ to a node $v$, then $u$ can influence $v$. An RR-set for a node $v$ in the deterministic graph $g$ is the set of all nodes in $g$ that have a path to $v$; $v$ is called the root node. A random RR-set is an RR-set created for a root node selected uniformly at random, where the deterministic graph $g$ is sampled from $G$ by removing each edge $(u, v)$ of $G$ with probability $1 - p_{uv}$ (the root node and the deterministic graph are both sampled randomly).
Intuitively, the nodes in an RR-set have a chance to activate the root node of the RR-set if they are selected as the seed set. Algorithm 1 describes how an RR-set is created algorithmically. First, the RR-set and a queue are initialized with the root node, selected uniformly at random (lines 1 and 2). Then, we move backward from the root node and add each visited node to the RR-set according to the probability assigned to the corresponding edge. This backward move is repeated for each newly added node (lines 3 to 6) [19].
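The backward sampling of Algorithm 1 can be sketched as follows, again under an illustrative in-neighbor graph representation of our own (not the paper's pseudocode):

```python
import random
from collections import deque

def random_rr_set(nodes, in_neighbors, prob):
    """Sample one random RR-set under the IC model (outline of Algorithm 1).

    in_neighbors[v] lists nodes u with an edge u -> v, and prob[(u, v)] is
    that edge's activation probability.
    """
    root = random.choice(nodes)          # root chosen uniformly at random
    rr, queue = {root}, deque([root])    # lines 1 and 2: initialization
    while queue:                         # lines 3 to 6: walk edges backward
        v = queue.popleft()
        for u in in_neighbors.get(v, []):
            # keep each incoming edge with its activation probability
            if u not in rr and random.random() < prob[(u, v)]:
                rr.add(u)
                queue.append(u)
    return rr
```

Sampling the edges lazily during the backward walk is equivalent to first sampling the whole deterministic graph and then collecting the nodes that reach the root.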
RR-set based methods like TIM/TIM+ [19] and IMM [21] build their seed selection algorithms on the idea that, when a sufficient number of random RR-sets is created, the expected spread of any seed set can be estimated from the fraction of RR-sets covered by the seed set (a seed set covers an RR-set if it contains at least one node of that RR-set):
$\hat{\sigma}(S) = n \cdot F_{\mathcal{R}}(S) / \theta$  (1)

where $n$ is the number of nodes, $\theta$ is the number of RR-sets created, and $F_{\mathcal{R}}(S)$ is the number of RR-sets covered by the seed set $S$ [19, 21]. Algorithm 2 describes the outline of RR-set based seed selection methods. First, a sufficient number of random RR-sets is created (line 1). Then, the seed set is filled iteratively with the node that covers the largest number of RR-sets (lines 3 to 6). Note that in each iteration, the RR-sets covered by the selected node are removed (line 6).
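The outline of Algorithm 2 can be sketched as below; `rr_sets` would be a list of sampled RR-sets, and the names are our own. The spread of the returned seed set can then be estimated as the number of nodes times the fraction of RR-sets it covers, as in Equation 1:

```python
def greedy_seed_set(rr_sets, k):
    """Pick k seeds by repeatedly taking the node that covers the most
    remaining RR-sets (outline of Algorithm 2)."""
    remaining = [set(r) for r in rr_sets]
    seeds = []
    for _ in range(k):
        counts = {}
        for r in remaining:
            for u in r:
                counts[u] = counts.get(u, 0) + 1
        if not counts:                   # every RR-set is already covered
            break
        best = max(counts, key=counts.get)
        seeds.append(best)
        # line 6: drop the RR-sets the chosen node covers
        remaining = [r for r in remaining if best not in r]
    return seeds
```

Removing the covered RR-sets after each pick is what makes the greedy step maximize the marginal coverage rather than the raw coverage.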
3.2.2 RR-sets in the Competitive Problem
Lu et al. [22] extended the definition of RR-sets to the general case where two items that complement or compete with each other are diffused in the network. With this extension, they proposed the general-TIM algorithm for seed selection under the assumption that the other player's seed set is given. They proved that, under specific conditions, their algorithm provides an approximation guarantee with high probability, and they further showed that in pure competition (our problem) these conditions fully hold.
According to their general definition of RR-sets, creating a random RR-set for the competitive problem (under the competitive-IC model) is entirely the same as in the non-competitive case, except that the backward phase stops when a node from the opponent's seed set is visited. We refer to the general-TIM algorithm with competitive RR-sets as the competitive-TIM algorithm.
4 Discussion on Seed Selection Scenarios
In this section, we conduct a step-by-step investigation of the scenarios where the focus of the competition is on seed selection and discuss why these modelings are not realistic. For simplicity, our analysis mainly concerns the case of two players, and the competitive-IC model is used for diffusion.
First, consider the follower's perspective scenario. In this scenario, the second player selects his seed set after the first player and, as a result, has complete knowledge of the opponent's seed set. For this player, the greedy algorithm provides a solution approximating the best response [3]. As described in Section 3.2.2, competitive-TIM provides the same approximation with high probability and is much more efficient than the greedy method [22]; therefore, competitive-TIM is a perfect choice for the second player. Now consider the first player. As far as we know, no practical choice has been proposed for the first player, so we must turn to the non-competitive seed selection algorithms. There are generally two categories of seed selection strategies in the non-competitive literature: methods that do not guarantee any approximation, and algorithms with an approximation guarantee, like CELF [23], CELF++ [24], TIM+ [19], and IMM [21]. It makes much more sense to assume that the first player selects his seed set with one of the algorithms from the second category.
We conduct an experiment where the first player adopts the TIM+ algorithm (a representative of the second category) and the second player uses competitive-TIM. Both players use the same seed set size. Diffusion happens according to the competitive-IC model and is repeated 5000 times to get an accurate estimate of $\sigma_1 - \sigma_2$, where $\sigma_i$ denotes the number of nodes activated by player $i$. Results are reported in Figure 2 (CA-HepTh, Facebook, p2p, and Wiki-Vote are the four datasets used in our experiments; their statistics are described in Section 7.1). When $\sigma_1 - \sigma_2$ is positive, the first player has caused more influence spread and won the game; when it is negative, the second player has won. The results show that the second player lost the game on 3 out of 4 datasets, even though he had complete knowledge of the opponent's seed set. Therefore, the first player seems more likely to win, and the considered scenario is biased.
Now, consider the scenario where the objective is to propose a framework for selecting the best seed selection strategy. To analyze this case, we consider the scenario proposed in [9], in which seed selection happens simultaneously and seed sets can overlap; overlapping nodes are assigned to one of the players randomly. Given the previous discussion, and the fact that there is no competitive seed selection algorithm specific to this case, the best choices are again the non-competitive seed selection methods with the greedy approximation ratio. Note that the main difference among these methods is their running time, not their effectiveness. It is therefore sensible for both players to use IMM [21], the most efficient and scalable algorithm in this category, in which case their seed sets would be entirely the same. Even if the players adopt different methods from this category, they will still select almost the same nodes. To show this, we calculated the number of nodes that overlap between the seed sets selected by the CELF [23], CELF++ [24], TIM+ [19], and IMM [21] algorithms and report the results in Table 1 (for CELF/CELF++ we used the code at https://github.com/jjboo/InfMax, for TIM+ the code at https://sourceforge.net/projects/timplus/, and for IMM the code at https://sourceforge.net/projects/imimm/; CELF/CELF++ used 20000 MC simulations). The number of overlaps increases further if we reduce the approximation error by increasing the number of Monte Carlo simulations in CELF and CELF++ and decreasing the error parameter in TIM+ and IMM. Therefore, the result of this scenario is a state where the players target the same initial adopters. In conclusion, this observation indicates that the competition goes beyond the choice of the seed selection method.
5 Proposed Scenario
Suppose that different companies are competing for product sales in a social network. Each company has a marketing budget and several advertising packages with different values. The companies first identify the most influential nodes of the network and then decide which package to allocate to which influential node. Nodes prefer the package with the highest value.
Now we formally define the scenario. Let the set $I$, with $|I| = k$, denote the set of most influential nodes. Suppose that player $i$ has a marketing budget $B_i$ and a set of available advertising packages $P_i$, where each element of $P_i$ denotes the value of a package or, equivalently, its cost for the company. By this formulation, an action of player $i$ is a vector $a_i = (b_1^i, \dots, b_k^i)$, where $b_j^i$ indicates the value of the package allocated to the $j$th node in $I$, and $\sum_{j=1}^{k} b_j^i \le B_i$. If $b_j^i = 0$, the player has ignored the node. Players select their actions simultaneously. Each node is assigned to the seed set of the player that has offered it a package of higher value than the other players. If two or more players have allocated the highest value, the node is assigned to one of them at random. When players play their actions (at the same time), their seed sets $S_i$ are identified, and then diffusion happens according to the competitive-IC model described in Section 3.1.
As an example, consider Figure 1, part b. In this example, both players have an equal budget of three units, and they have packages with the same values. The set of most influential nodes is $I = \{1, 2, 3\}$. The first player adopts the action $(1, 1, 1)$, and the second player the action $(2, 1, 0)$. Therefore, node 1 is assigned to the second player (who has offered a higher value) and node 3 to the first player. Node 2 is assigned to one of the players at random, as both have allocated a package of equal value to it. Suppose the result is that this node is assigned to the second player. The seed sets are then $S_1 = \{3\}$ and $S_2 = \{1, 2\}$. Then diffusion happens and the winner of the game is identified, which is player 2 in this example.
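The node assignment rule of the scenario can be sketched as a small helper (a hypothetical function of our own; ties are broken uniformly at random, as described above):

```python
import random

def assign_seeds(influential, alloc1, alloc2):
    """Turn two players' package allocations over the influential nodes
    into seed sets; ties on positive offers are broken at random."""
    s1, s2 = set(), set()
    for node, b1, b2 in zip(influential, alloc1, alloc2):
        if b1 == b2 == 0:                # both players ignored the node
            continue
        if b1 > b2:
            s1.add(node)
        elif b2 > b1:
            s2.add(node)
        else:                            # equal positive offers
            (s1 if random.random() < 0.5 else s2).add(node)
    return s1, s2
```

Running it on the Figure 1 allocations `[1, 1, 1]` and `[2, 1, 0]` assigns node 1 to player 2, node 3 to player 1, and node 2 to either player at random.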
6 Proposed Framework
We model the scenario as a normal-form game and propose a novel framework for the case of two players. As the first step, the top influential nodes of the network (the nodes in the set $I$, $|I| = k$) are identified using the TIM+ algorithm described in Section 3.2.1. This is the first phase of our framework. We define the utility (payoff) of the first player as $u_1 = \sigma_1 - \sigma_2$, where $\sigma_1$ and $\sigma_2$ are the numbers of nodes activated by the first player's seed set ($S_1$) and the second player's seed set ($S_2$), respectively. Correspondingly, the second player's payoff is $u_2 = -u_1$. In this way, we form a zero-sum game. Zero-sum games model pure competition, where any gain by one player results in an equal loss for the other player.
In a multi-agent environment, players can follow two kinds of strategies: selecting one single action and playing it (a pure strategy), or playing according to a probability distribution assigned to the actions (a mixed strategy). A pure strategy is a mixed strategy in which probability one is assigned to a specific action. The most common and prominent solution concept in normal-form games is the Nash equilibrium, a stable point where each player plays the best response against his competitor. Therefore, no rational player would deviate from his Nash strategy even if he got to know the opponent's strategy. Every finite game has at least one Nash equilibrium [25].

The value of the game is defined as the first player's payoff in equilibrium (the state where both players play their respective Nash strategies). According to the Minimax theorem [25], in zero-sum games, the Nash strategy guarantees the player playing it a minimum payoff equal to the value of the game. Note that when both players have an equal budget, $B_1 = B_2$, and packages with the same values, $P_1 = P_2$, the game is symmetric. In symmetric games, the value of the game is zero [26]. Thus, in this case, the Nash strategy guarantees a payoff of at least zero; in other words, it guarantees not losing the competition. This is a very promising guarantee.
To calculate the Nash equilibrium, we faced two challenging difficulties:

Our action space is considerably large. As an example, for one setting of the number of influential nodes, the budget, and the package values, the size of the action space is 1,761,039,350,070.

Considering the size of our action space, an efficient method for approximating the payoff $u_1$ is required. The widely adopted solution of directly diffusing the seed sets in the network [9, 8, 10, 11] is not applicable here, since reaching an acceptable approximation with this method requires repeating the diffusion many times, which takes too long.
To tackle the first difficulty, we use the Double Oracle algorithm proposed in [16] for two-player zero-sum games. The general idea of the method is to consider a restricted game in which only a subset of the actions is available to each player, and to iteratively enlarge the action space of the restricted game until the algorithm converges. Algorithm 3 describes the outline of the Double Oracle algorithm. The available action sets are initialized with an arbitrary action from the action space (line 1). In each iteration, the Nash equilibrium of the restricted game is calculated (line 3). Using a best response oracle, the best responses in pure strategies against the calculated equilibrium strategies are identified and added to the action sets (lines 4-8). The algorithm converges when the calculated best responses are already in the action sets; in this case, the computed strategies form an exact equilibrium of the game with the full action sets. Another termination condition is to stop when the improvement offered by the new best responses falls below a given threshold, in which case an approximate equilibrium is obtained [16]. The algorithm is guaranteed to converge, and in practice, since only best responses are added in each iteration, it converges much faster than directly computing the Nash equilibrium over the full game [16].
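The loop structure of the Double Oracle algorithm can be sketched as below. The restricted-game Nash solver and the best response oracle are passed in as functions, since their implementations (typically a linear program for the zero-sum Nash step, and our oracle of Section 6.2) are independent of the loop itself; all names here are illustrative:

```python
def double_oracle(init_a1, init_a2, solve_nash, best_response):
    """Skeleton of the Double Oracle loop for a two-player zero-sum game.

    solve_nash(A1, A2) returns mixed strategies (x1, x2) of the restricted
    game over the action lists A1 and A2; best_response(player, x) returns
    a pure best response for `player` against the opponent's mixed strategy.
    """
    A1, A2 = [init_a1], [init_a2]        # line 1: arbitrary initial actions
    while True:
        x1, x2 = solve_nash(A1, A2)      # line 3: restricted-game equilibrium
        br1 = best_response(0, x2)       # lines 4-8: best pure replies
        br2 = best_response(1, x1)
        if br1 in A1 and br2 in A2:      # no new action helps: converged
            return x1, x2
        if br1 not in A1:
            A1.append(br1)
        if br2 not in A2:
            A2.append(br2)
```

Because only best responses enter the restricted game, the action lists typically stay tiny compared to the full action space.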
The Double Oracle algorithm works best in problems where an efficient oracle for calculating the pure best response exists. To propose a very fast best response oracle and solve the problem of payoff estimation, we reuse the RR-sets created in the first phase and develop a heuristic method that assigns an influential value to the nodes in $I$. We then approximate the payoff using these calculated values. Suppose that the actions adopted by the first and the second player are $a_1 = (b_1^1, \dots, b_k^1)$ and $a_2 = (b_1^2, \dots, b_k^2)$, respectively. The payoff of the first player is:

$u_1(a_1, a_2) = \sum_{j=1}^{k} g(b_j^1, b_j^2)$  (2)

where the function $g$ denotes the gain contributed by the $j$th node to the first player's payoff and is defined as:

$g(b_j^1, b_j^2) = \begin{cases} IV_j & \text{if } b_j^1 > b_j^2 \\ -IV_j & \text{if } b_j^1 < b_j^2 \\ 0 & \text{if } b_j^1 = b_j^2 \end{cases}$  (3)

where $IV_j$ is the influential value of the $j$th node in $I$ (Section 6.1). The above equation comes directly from the scenario and is common in other budget allocation studies [12, 13]. The cases of the function $g$ are almost self-evident. When the player wins the node by allocating a higher value, his payoff increases by the node's influential value, and when he loses the node, his payoff decreases by the same amount. When both players ignore the node ($b_j^1 = b_j^2 = 0$), the gain from this node is zero for both players. In addition, when $b_j^1 = b_j^2 > 0$, the gain is again zero: according to the scenario, the node is then assigned to one of the players at random, and since our game is zero-sum, the expected gain is zero for both. The calculation of $u_2 = -u_1$ is straightforward.
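The case analysis of the gain function translates directly into a small helper (`iv` denotes the node's influential value; names are ours):

```python
def gain(b1, b2, iv):
    """Gain of one influential node for the first player: win the node and
    gain its influential value iv, lose it and lose iv, ties net zero."""
    if b1 > b2:
        return iv
    if b1 < b2:
        return -iv
    return 0.0           # both ignored, or a random tie-break with zero expectation
```

Summing this gain over the influential nodes gives the first player's payoff for a pair of actions.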
Note that the Nash strategy guarantees a minimum payoff, which is zero in the symmetric case. However, we are not calculating the exact payoff, and the error of our heuristic influential value estimation method decreases the guaranteed value. Our experiments will show that this error is too small to affect the performance of the framework. Also, when the second termination condition is used in the Double Oracle algorithm, the guaranteed payoff decreases by the value of the threshold parameter.
In summary, our framework works in the following steps:

The set of most influential nodes, $I$, is identified by TIM+.

The influential values of the nodes in $I$ are estimated.

The Nash equilibrium is calculated using the Double Oracle algorithm.
Next, we describe our heuristic influential value estimation method and the best response oracle.
6.1 Influential Value Estimation Method
Here we describe how we reuse the RR-sets created by the TIM+ algorithm in the first phase of our framework to assign an influential value to the nodes in $I$. In the TIM+ algorithm, random RR-sets are created according to the non-competitive definition of RR-sets (Definition 1). Let $\mathcal{R}$ denote this set of RR-sets, with $|\mathcal{R}| = \theta$. Recall from Section 3.2.1 that an RR-set is a set of nodes that have a chance to activate the root node of the RR-set, and that the influence spread of any seed set $S$ can be estimated by $n \cdot F_{\mathcal{R}}(S) / \theta$, where $n$ is the number of nodes in the network and $F_{\mathcal{R}}(S)$ is the number of RR-sets in which $S$ has at least one node (refer to Equation 1).
Now, we describe our extension of the RR-set based methods. A node covers an RR-set if it is a member of the RR-set, and thus has a chance to activate the root node of the set. In our problem, each node of $I$ can be in one of three states: it belongs to the first player ($S_1$), it belongs to the second player ($S_2$), or it is ignored by the players and no budget is allocated to it (like the nodes outside $I$). Considering these cases, the effect of covering an RR-set that is also covered by other nodes in $I$ is not equal to the effect of covering an RR-set covered by a single node. This is the main difference between our problem and previous problems. To capture this difference, we assign a weight to each RR-set $R$: $w(R) = 1 / c(R)$, where $c(R)$ indicates how many nodes from $I$ cover $R$ (equivalently, the number of nodes in $R$ that are also in $I$). Let $\mathbb{1}(\cdot)$ return 1 when its argument is true and 0 otherwise. We estimate the influential value of a node $u$ from the sum of the weights of the RR-sets it covers:

$IV_u = \frac{n}{\theta} \sum_{R \in \mathcal{R}} \mathbb{1}(u \in R) \cdot w(R)$  (4)
Note that when the whole set $I$ belongs to one player, for example when $S_1 = I$ and $S_2 = \emptyset$, our method's estimate of the expected spread is exactly the same as the value estimated by the non-competitive RR-set based methods (Section 3.2.1). Also, the competitive RR-set based method (described in Section 3.2.2) is not applicable here, since it estimates the spread given the other player's seed set, and we have no information about the opponent's seed set.
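This estimation can be sketched in Python as follows (the RR-sets are given as plain node sets; the constant factor of the number of nodes over the number of RR-sets is omitted here, since it scales all influential values, and hence all payoffs, equally):

```python
def influential_values(rr_sets, influential):
    """Score each influential node by the total weight of the RR-sets it
    covers, with each RR-set weighted by 1 / (number of influential
    nodes it contains)."""
    inf = set(influential)
    values = {u: 0.0 for u in influential}
    for r in rr_sets:
        covered = inf & set(r)
        if covered:
            w = 1.0 / len(covered)   # split each RR-set's unit of credit
            for u in covered:
                values[u] += w
    return values
```

Summing the values over all nodes of the influential set recovers the number of RR-sets that set covers, which mirrors the non-competitive coverage estimate up to the omitted constant factor.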
6.2 Best Response Oracle
The oracle calculates the best response in pure strategies against a given mixed strategy $x$. The mixed strategy can be represented by the set of actions to which a probability greater than zero is assigned (the support of the mixed strategy): $x = \{(a_1, \pi_1), \dots, (a_m, \pi_m)\}$, where $\pi_i$ is the probability that the player plays action $a_i$, and $a_1, \dots, a_m$ are the actions in the support of $x$. Without loss of generality, we assume that the given mixed strategy is played by the second player and that we, as the first player, are searching for the best response. So, in this section, the notation $u$ indicates the payoff of the first player. Also, we use $B$ to denote the budget and $P$ to denote the set of advertising packages of the player for whom we are finding the best response.
The payoff of adopting a pure strategy $a$ against the given mixed strategy is:
$u(a, \sigma) = \sum_{i=1}^{m} p_i \, u(a, a_i)$    (5)
where $u(a, a_i)$ is the payoff of $a$ against the pure strategy (action) $a_i$ in the support of $\sigma$. So, the oracle searches the pure strategies for the strategy $a^*$ such that:
$a^* = \operatorname{arg\,max}_{a} \, u(a, \sigma)$    (6)
Now we describe the algorithm that finds the pure best response $a^*$. By our definition of actions, $a^* = (b_1, \ldots, b_c)$ and each $a_i$ in the support of $\sigma$ is $(b_1^i, \ldots, b_c^i)$, where $b_j$ (respectively $b_j^i$) is the value of the package allocated to the $j$-th node in $C$ under $a^*$ (respectively $a_i$). According to Equation 5 and Equation 2, writing $\phi_j(x, y)$ for the payoff contribution of the $j$-th node when the first player allocates a package of value $x$ to it and the second player allocates $y$, the payoff of the pure best response against the given mixed strategy would be:
$u(a^*, \sigma) = \sum_{i=1}^{m} p_i \sum_{j=1}^{c} \phi_j(b_j, b_j^i)$    (7)
and if we define the function $g_j$ in the following way:
$g_j(x) = \sum_{i=1}^{m} p_i \, \phi_j(x, b_j^i)$    (8)
then the payoff can be calculated based on the value of each $g_j$:
$u(a^*, \sigma) = \sum_{j=1}^{c} g_j(b_j)$    (9)
The function $g_j$ calculates the total expected gain or loss of allocating a package with value $x$ to the $j$-th node. With this formulation, we can easily use dynamic programming to select the optimal value of each $b_j$ from the set $P \cup \{0\}$, subject to the constraint $\sum_j b_j \le k$. Furthermore, there is no need to explore every candidate value for each node, since the value of the function $g_j$ does not change for most of the values in $P \cup \{0\}$; in this way, we can largely reduce the search space. To clarify the explanation, consider the following example.
Consider that we are trying to find the best response for a player with budget $k$ and a set of available packages $P$, over the $c$ influential nodes identified in the first phase of our framework. The given mixed strategy has two actions in its support: one played with probability 0.8 and the other with probability 0.2. Together with the estimated influential values of the nodes in $C$, this fully describes the problem.
Here, the gain $g_j(x)$ of allocating a package with value $x$ to the $j$-th node is computed from Equation 8. Whenever two package values yield the same gain for a node, the larger one should not be explored, as it just wastes budget. Continuing the calculations with dynamic programming yields the allocation with the maximum possible payoff.
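The dynamic program can be sketched as follows. The budget, packages, node values, and the opponent's support below are hypothetical numbers for illustration, not the paper's; the payoff model, where a node is won by the strictly larger package ($+\mathrm{val}$ on a win, $-\mathrm{val}$ on a loss, 0 on a tie), is also an assumption:

```python
def best_response(values, P, k, support):
    """Sketch of the best-response oracle via dynamic programming.
    values[j] : estimated influential value of the j-th node
    P         : available package values
    k         : total budget
    support   : list of (prob, alloc) pairs of the opponent's mixed
                strategy; alloc[j] is the opponent's package on node j.
    Assumed payoff model: a node is won by the strictly larger package."""
    n = len(values)
    packages = sorted(set(P) | {0})   # 0 means "allocate nothing"

    def gain(j, x):
        # expected gain/loss g_j(x) of putting a package of value x on node j
        g = 0.0
        for prob, alloc in support:
            if x > alloc[j]:
                g += prob * values[j]
            elif x < alloc[j]:
                g -= prob * values[j]
        return g

    # dp[b] = (best payoff using exactly budget b, allocation achieving it)
    dp = {0: (0.0, [])}
    for j in range(n):
        new = {}
        for b, (pay, alloc) in dp.items():
            for x in packages:
                nb = b + x
                if nb > k:
                    continue
                cand_pay = pay + gain(j, x)
                if nb not in new or cand_pay > new[nb][0]:
                    new[nb] = (cand_pay, alloc + [x])
        dp = new
    best_pay, best_alloc = max(dp.values(), key=lambda t: t[0])
    return best_alloc, best_pay
```

With two nodes valued 10 and 8, packages {1, 2, 3}, budget 4, and an opponent playing (2, 1) with probability 0.8 and (1, 2) with probability 0.2, the DP finds allocations such as (3, 1) with expected payoff 8.4.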
7 Experiments
In this section, through extensive experiments, we separately evaluate the performance of the payoff estimation method and of the proposed framework.
The framework and experiments are implemented in Python using the networkx library (version 1.11, https://networkx.github.io/), and the Nash equilibrium in the restricted game (line 3 of the double oracle algorithm, Algorithm 3) is calculated using the Gambit framework (version 16, http://www.gambitproject.org/). Experiments are conducted on a machine with an Intel Core i5 3.4 GHz 4-core 64-bit processor and 32 GB of memory.
The source code of all the implementations is available in the project's repository: https://github.com/ahansari/cim.
7.1 Experiment Setup
Our experiments are conducted on standard datasets used in the influence maximization literature. The datasets and their statistics are listed in Table 2. CaHepTh is the collaboration network from Arxiv. EgoFacebook consists of circles from Facebook. Gnutella is a Gnutella peer-to-peer file-sharing network. WikiVote records who voted for whom in the Wikipedia administrator selection process. All datasets are available in the Stanford Large Network Dataset Collection (https://snap.stanford.edu/data/index.html).
For the competitive IC model, the propagation probability of each edge $(u, v)$ is set to $1/d_{\mathrm{in}}(v)$, where $v$ is the node that the edge points to and $d_{\mathrm{in}}(v)$ is its in-degree. This weighted-cascade setting is widely adopted in previous studies [19, 21, 8]. Further parameters of each evaluation are described in their own subsections.
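A short sketch of this edge-probability setting on a plain edge list (a minimal stand-in for the graph library used in the implementation):

```python
def wc_probabilities(edges):
    """Weighted-cascade setting for the (competitive) IC model:
    the propagation probability of edge (u, v) is 1 / in-degree(v)."""
    indeg = {}
    for u, v in edges:
        indeg[v] = indeg.get(v, 0) + 1
    return {(u, v): 1.0 / indeg[v] for u, v in edges}
```

For instance, an edge into a node with two incoming edges gets probability 0.5, while an edge into a node with a single incoming edge gets probability 1.0.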
Table 2: Datasets and their statistics.

Name           Type        Nodes  Edges
CaHepTh        Undirected  9877   25998
EgoFacebook    Undirected  4039   88234
p2pGnutella08  Directed    6301   20777
WikiVote       Directed    7115   103689
7.2 Evaluation of the Influential Value Estimation Method
The objective of the proposed heuristic method is to assign an influential value to the nodes in $C$ and then calculate the payoff based on these values. No method in the literature calculates the payoff in this way (by assigning a value to the nodes), so we could only validate our method against the following baselines:

Simple RR-set based method: This is the method that we have extended. Here, the influential value of each node $v$ is calculated by estimating the influence spread of the singleton set $\{v\}$ using the non-competitive RR-set based method (Section 3.2.1).

Centrality-based heuristics: Centrality measures like Degree, Betweenness, Closeness, etc. can also be used to assign an influential value to the nodes in $C$. These measures benchmark the effectiveness of the proposed method and are all available in the networkx library.
The evaluation is conducted as follows. First, the set $C$ is identified. Then, 20 experiments are executed for each dataset. In each experiment, the nodes in $C$ are randomly assigned to $S_1$, to $S_2$, or to neither set; this simulates the different states that the nodes in $C$ might take in the competition. After these random assignments, the actual value of the payoff is calculated as the average over 5000 rounds of repeated diffusion in the network according to the competitive IC model. The absolute error is the absolute difference between this actual value and the value estimated by the value-based methods. Finally, the mean absolute error over the 20 experiments is reported for each method in Table 3.
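The random assignment step can be sketched as below; the uniform 1/3 split between $S_1$, $S_2$, and "neither" is an assumption for illustration, as the exact assignment probabilities are not specified here:

```python
import random

def random_assignment(C, rng=random):
    """Randomly put each influential node into S1, S2, or neither,
    simulating the states nodes of C may take during the competition.
    Assumed split: each state with probability 1/3."""
    S1, S2 = set(), set()
    for v in C:
        r = rng.random()
        if r < 1 / 3:
            S1.add(v)
        elif r < 2 / 3:
            S2.add(v)
        # otherwise v receives no budget from either player
    return S1, S2
```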
According to the results, our method performs significantly better than the Simple-RRset method, which validates our extension. The heuristic centrality-based measures all fail. A notable pattern in the results is the severe failure of the Simple-RRset method on the p2p dataset (mean absolute error of 626.42). This is because the method ignores the fact that an RR set may be covered by more than one node from the set $C$, and this effect is considerably stronger on the p2p dataset than on the others.
Table 3: Mean absolute error of the influential value estimation methods.

Method  CAHepTh  Facebook  p2p  WikiVote
OurMethod  6.33  15.91  83.76  9.13 
SimpleRRsets  20.27  55.70  626.42  23.71 
Degree  86.90  144.89  302.78  60.49 
Betweenness  86.85  144.90  302.78  60.70 
Closeness  86.18  144.61  302.20  60.13 
Eigenvector  86.89  145.34  302.74  60.69 
Katz  82.71  142.49  298.60  56.69 
CoreNumber  74.11  990.09  284.54  188.43 
PageRank  86.93  145.38  302.79  60.73 
7.3 Evaluation of the Proposed Framework
In this section, we conduct several experiments to show the effectiveness of our proposed framework. We mainly focus on the symmetric case, where both players have the same budget and the same set of advertising packages, because this case is more challenging: winning the game when one of the players has a larger budget is not difficult. Also, no similar framework exists in the literature to compare against, so we use the following strategies in our experiments:

1each: One unit of the budget is allocated to each of the nodes identified by TIM+.

2each: Two units of the budget are allocated to each of the nodes identified by TIM+.

Random: Considering the $c$ nodes, this strategy starts from the first node and allocates a random value greater than zero to each node while keeping the total allocation within the budget $k$.

Random($\ell$): Like the above strategy, but it never allocates a value greater than $\ell$ to a node (the maximum amount of budget allocated to a single node is $\ell$).
The 1each strategy resembles the studies that focus on the seed selection scenario, where the budget only determines the seed set size. The Random strategy is an agent with no rationality, and Random($\ell$) is an improved version of Random.
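A minimal sketch of the Random and Random($\ell$) baselines, following the description above (the left-to-right allocation order matches the description; drawing each amount uniformly is an assumption):

```python
import random

def random_strategy(c, k, cap=None, rng=random):
    """Random baseline: walk over the c nodes, allocating a random
    positive amount to each while the budget k lasts; `cap` bounds the
    amount per node (the Random(cap) variant)."""
    alloc = [0] * c
    remaining = k
    for j in range(c):
        if remaining == 0:
            break
        hi = remaining if cap is None else min(cap, remaining)
        x = rng.randint(1, hi)   # random positive amount within the bound
        alloc[j] = x
        remaining -= x
    return alloc
```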
As the experimental parameters, we test the budget values $k \in \{10, 20, 30\}$. For $c$, the size of the set $C$, the values $c \in \{k, k+5, k+10\}$ are tested to see which setting performs better. Note that it is not wise to set $c < k$, since then the framework cannot investigate the case where the maximum possible number of nodes is selected. The double oracle algorithm is initialized (line 1 of Algorithm 3) with the 1each strategy to make the results reproducible. Complete convergence is used as the termination condition for $k = 10$ and $k = 20$; for $k = 30$, due to the long running time, an $\epsilon$-convergence threshold is used instead, with a looser threshold for the p2p dataset. In the results, CBA($c$) denotes our proposed framework, the Competitive Budget Allocation (CBA) framework, run with $|C| = c$.
7.3.1 General Experiments
As the first step, we benchmark the performance of our proposed framework (CBA) and the 1each and 2each strategies by running competitions between these strategies (as the first player) and the Random and Random(3) strategies. Each competition is repeated 1000 times, and the win percentages are reported in Table 4. Note that, in the influence maximization game, a win is when the player's spread exceeds the opponent's. For each dataset and budget value, the method with the highest win percentage is highlighted. For example, in the CAHepTh dataset with $k = 10$, the 1each strategy defeats the Random strategy in 86.0% of the competitions and achieves the highest win percentage.
The Random(3) strategy is just the Random strategy improved by a little knowledge (never allocating a value higher than 3). Note how the performance of the methods drops when tested against this strategy: in the CAHepTh dataset with $k = 10$, the performance of 1each drops from 86.0% to 54.9%. Also, while the 1each and 2each strategies mostly achieve the highest win percentage against Random, our proposed framework achieves the highest win percentage against Random(3). Another important point is that the effect of a non-perfect strategy, like the Random strategy, becomes more pronounced as the budget increases, which is why the win percentages grow with $k$.
Table 4: Win percentage against the Random and Random(3) strategies.

vs Random  vs Random(3)
Method  k=10  k=20  k=30  k=10  k=20  k=30 
(a) CAHepTh  
CBA(k)  77.1  97.4  99.8  51.6  56.2  63.1 
CBA(k+5)  82.8  97.8  99.9  56.7  60.4  66.6 
CBA(k+10)  82.3  98.3  99.8  58.7  60.7  66.4 
1each  86.0  99.1  99.9  54.9  57.0  65.5 
2each  76.3  93.3  98.1  47.4  47.7  51.8 
(b) Facebook  
CBA(k)  57.0  70.0  73.2  52.4  68.0  73.9 
CBA(k+5)  56.6  70.8  71.9  51.8  63.3  74.6 
CBA(k+10)  54.3  70.6  71.0  49.9  64.9  76.6 
1each  50.6  54.3  59.1  13.7  8.1  5.7 
2each  78.2  72.0  67.3  58.9  51.9  48.5 
(c) p2p  
CBA(k)  80.3  93.1  99.4  52.1  50.7  55.3 
CBA(k+5)  80.5  97.0  99.5  56.9  58.5  59.9 
CBA(k+10)  81.3  96.9  99.8  55.7  58.6  63.0 
1each  83.4  99.3  100.0  49.5  49.0  46.9 
2each  73.5  91.4  99.3  49.8  55.5  49.4 
(d) WikiVote  
CBA(k)  68.9  93.8  98.6  51.0  58.6  63.7 
CBA(k+5)  72.0  94.4  99.0  52.3  58.5  60.1 
CBA(k+10)  73.4  93.5  99.0  50.2  57.5  62.4 
1each  77.2  97.9  99.9  40.8  32.9  40.2 
2each  74.9  90.3  96.5  53.6  48.3  49.7 
In the next experiment, we run competitions between our method and the 1each and 2each strategies. The average (avg.) and standard deviation (std.) of the payoff over 1000 repeated competitions, alongside the win percentage, are reported in Table 5. For our framework, we report only CBA($k+5$), as this value of $c$ performed slightly better than the others. According to the results, our method wins the game in some cases (most clearly on the Facebook dataset). In the other cases, the result of the competition is a draw: the win percentage is close to 50% and the value of avg. is small compared to the value of std. However, our proposed framework does not lose any of these competitions by a large margin. Note that there is a correlation between win% and avg.: the value of avg. increases as win% increases. Another important point from Table 5 is the high standard deviations, which are caused by two sources of randomness: first, the diffusion model is highly stochastic; second, the first player plays the Nash strategy, which is a mixed strategy, so its actions are not deterministic but selected according to a probability distribution.
Table 5: Competitions of CBA(k+5) against the 1each and 2each strategies.

k=10  k=20  k=30
dataset  win%  avg.  std.  win%  avg.  std.  win%  avg.  std. 
CBA(k+5) against 1each  
CAHepTh  49.8  1.0  123.8  49.1  0.8  138.0  50.2  4.0  143.2 
Facebook  64.2  89.0  194.3  77.2  164.5  203.4  79.1  152.8  195.9
p2p  53.6  20.4  504.4  54.2  49.3  432.7  47.3  14.8  350.8 
WikiVote  50.4  1.5  75.8  47.4  3.4  79.0  46.3  4.4  78.6 
CBA(k+5) against 2each  
CAHepTh  49.4  4.3  101.7  50.9  0.2  115.5  47.5  7.7  124.3 
Facebook  45.5  5.7  185.3  63.8  83.9  193.2  75.7  160.1  213.5
p2p  47.6  26.3  442.6  44.9  51.3  416.0  51.0  6.3  384.8 
WikiVote  49.7  0.4  64.9  52.3  4.2  72.6  50.3  1.8  76.6 
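The win%, avg., and std. statistics reported in these tables can be computed from the first player's per-competition payoffs; a minimal sketch:

```python
from statistics import mean, stdev

def summarize(payoffs):
    """Summarize repeated competitions from the first player's side:
    win percentage (payoff > 0), and the average and sample standard
    deviation of the payoff, as reported in the result tables."""
    win = 100.0 * sum(p > 0 for p in payoffs) / len(payoffs)
    return win, mean(payoffs), stdev(payoffs)
```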
As the last experiment in this subsection, we run competitions between our method and itself with different values of $c$, mainly to identify which value of $c$ performs better. Results are reported in Table 6. According to the results, no setting significantly defeats the others; there seems to be no significant difference between the tested values of $c$, with only $c = k+5$ performing slightly better.
The key results of the experiments conducted so far are: 1. our method outperformed the 1each and 2each strategies when benchmarked against Random(3); 2. it did not lose any competition against the other strategies; 3. there is a hint that $c = k+5$ is marginally better than the other values of $c$. In the next subsection, we demonstrate the superiority of our framework over the other strategies more precisely.
Table 6: Competitions between our method and itself with different values of $c$.
dataset  win%  avg.  std.  win%  avg.  std.  win%  avg.  std. 
CAHepTh  51.3  10.4  109.9  49.7  0.7  118.5  47.8  3.3  111.6 
Facebook  51.0  5.5  201.2  49.1  3.4  204.0  51.7  0.7  209.6
p2p  51.7  20.1  489.7  50.8  17.0  531.6  49.6  2.3  543.7 
WikiVote  47.6  1.7  70.5  50.7  0.9  74.1  53.0  3.9  69.4 
CAHepTh  51.3  7.6  122.3  49.7  5.0  133.5  48.8  1.7  130.2 
Facebook  52.7  7.7  207.2  51.5  5.6  221.2  52.4  5.1  212.4
p2p  53.4  24.4  406.3  49.5  4.1  438.6  50.4  6.8  466.4 
WikiVote  49.4  0.7  80.3  49.5  2.3  80.9  46.7  3.9  79.7 
CAHepTh  48.7  4.8  140.2  51.7  2.9  138.3  50.3  0.0  134.8 
Facebook  49.8  2.2  210.1  52.7  13.6  209.1  48.5  0.2  209.4
p2p  51.4  13.9  409.8  49.6  1.8  418.8  50.2  3.3  416.1 
WikiVote  50.1  4.1  79.5  48.5  2.8  80.6  49.0  0.4  79.8 
7.3.2 Why should one company adopt our framework?
From a game-theoretic point of view, in the symmetric case considered in these experiments, the Nash equilibrium (our proposed framework) guarantees not being the absolute loser of any competition (refer to Section 6). For any non-Nash strategy, however, there exists at least one strategy that can definitely defeat it. This is the main superiority of our framework.
To illustrate this point, we ran competitions pitting the 1each, 2each, Random(2), and Random(3) strategies against their pure best responses, calculated by our best response oracle. Our method is also evaluated against these best responses and against its own best response. As before, each competition is repeated 1000 times, and results are reported in Table 7. They demonstrate how badly these strategies lose the game to their best responses: on the Facebook dataset, for example, none of these methods wins a single competition (a win% of 0). In contrast, our method's performance against these best responses, and against its own best response, fully matches what was guaranteed.
Table 7: Competitions of the baseline strategies, and of CBA(25), against the pure best responses.

k=10  k=20  k=30
dataset  win%  avg.  std.  win%  avg.  std.  win%  avg.  std. 
1each against BestResponse(1each)  
CAHepTh  34.5  33.0  93.9  24.0  71.2  99.1  20.6  82.2  108.7 
Facebook  0  506.3  92.0  0  642.8  110.3  0  658.9  101.4
p2p  24.1  246.9  370.1  8.5  414.1  296.4  2.7  498.6  262.0 
WikiVote  6.9  67.9  45.2  1.9  106.9  48.9  0.4  118.0  46.0 
CBA(25) against BestResponse(1each)  
CAHepTh  49.4  4.3  101.7  50.9  0.2  115.5  47.5  7.7  124.3 
Facebook  45.5  5.7  185.3  63.8  83.9  193.2  75.7  160.1  213.5
p2p  47.6  26.3  442.6  44.9  51.3  416.0  51.0  6.3  384.8 
WikiVote  49.7  0.4  64.9  52.3  4.2  72.6  50.3  1.8  76.6 
2each against BestResponse(2each)  
CAHepTh  20.9  46.7  71.3  7.1  127.5  93.4  2.1  195.4  96.4 
Facebook  0  271.3  88.6  0  635.8  102.6  0  751.0  98.4
p2p  15.4  372.4  387.4  2.9  726.1  375.4  0.5  868.5  304.9 
WikiVote  4.2  67.5  41.2  0  143.1  53.6  0  193.9  46.7 
CBA(25) against BestResponse(2each)  
CAHepTh  81.2  79.1  94.4  72.1  60.9  104.5  80.3  90.4  112.1 
Facebook  61.3  66.3  201.8  59.7  19.1  202.7  60.5  63.7  188.7
p2p  80.6  326.7  383.9  61.8  124.7  375.0  57.2  74.6  351.1 
WikiVote  63.1  27.8  66.9  53.5  10.7  71.5  66.6  30.5  67.7 
Random(2) against BestResponse(Random(2))  
CAHepTh  40.1  30.3  111.7  32.2  50.3  122.8  32.5  58.3  129.8 
Facebook  9.0  253.8  177.9  0  563.4  109.7  0  625.8  106.0
p2p  32.4  215.5  501.1  28.4  241.8  456.2  16.0  403.0  423.7 
WikiVote  31.4  40.7  80.9  11.5  80.5  67.5  8.7  93.3  69.1 
CBA(25) against BestResponse(Random(2))  
CAHepTh  51.0  5.3  113.6  48.9  2.5  131.1  47.7  3.6  135.9 
Facebook  49.0  3.1  181.5  57.9  14.5  205.1  57.1  40.7  184.0
p2p  47.1  33.9  491.2  45.9  31.4  440.9  55.2  51.7  393.6 
WikiVote  52.6  3.3  70.4  52.1  0.8  75.0  49.0  1.9  77.7 
Random(3) against BestResponse(Random(3))  
CAHepTh  33.5  39.8  102.6  29.6  62.4  113.8  15.4  123.9  125.3 
Facebook  32.5  102.5  231.8  0  483.3  102.4  0  582.7  116.8
p2p  33.1  236.1  474.1  22.5  306.9  410.8  17.6  337.6  363.5 
WikiVote  33.5  27.6  66.7  13.7  73.8  70.9  7.9  103.8  72.1 
CBA(25) against BestResponse(Random(3))  
CAHepTh  57.1  17.3  105.1  59.9  30.6  122.4  60.1  30.8  127.7 
Facebook  47.8  0.2  172.9  52.4  4.8  207.3  54.3  16.0  195.3
p2p  54.6  43.0  464.6  57.6  75.5  414.5  51.5  0.5  393.4 
WikiVote  49.1  0.9  71.9  50.9  4.4  78.0  49.7  4.4  71.4 
7.3.3 Running Time and Number of Iterations to Converge
Finally, we report the running time and the number of iterations the double oracle algorithm takes to converge in Table 8. Only the worst case over $c$ for each value of $k$ is reported. The cardinality of the action space grows combinatorially with $c$ and $k$, which is why we observe a significant increase in running time as we increase the value of $k$.
Another important point is that the size of the graph does not affect the running time of the double oracle algorithm, since we solve the problem from a higher level of abstraction, where the competition is formed only around the nodes in $C$. It is the distribution of the influential values of these nodes that affects the complexity of the game and the running time. For example, for $k = 30$, the longest running time occurred on the p2p dataset (even with a looser termination condition), which is not our largest dataset.
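To see how the action space grows, the number of feasible pure strategies (allocations over the $c$ nodes with package values from $P \cup \{0\}$ summing to at most the budget $k$) can be counted with a small dynamic program; this is an illustrative sketch, not the paper's code:

```python
def count_actions(c, P, k):
    """Count allocations (b_1, ..., b_c) with each b_j in P ∪ {0} and
    total at most k. counts[b] = number of partial allocations using
    exactly budget b after processing some prefix of the nodes."""
    options = sorted(set(P) | {0})
    counts = [0] * (k + 1)
    counts[0] = 1
    for _ in range(c):
        new = [0] * (k + 1)
        for b in range(k + 1):
            if counts[b]:
                for x in options:
                    if b + x <= k:
                        new[b + x] += counts[b]
        counts = new
    return sum(counts)
```

For instance, with two nodes, packages {1, 2}, and budget 2, there are 6 feasible allocations; the count grows quickly in $c$ and $k$, which drives the double-oracle running time.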
Table 8: Running time and number of iterations of the double oracle algorithm.

k=10  k=20  k=30
dataset  iter.  time  condition  iter.  time  condition  iter.  time  condition 
CAHepTh  51  43s  converge  150  6h  converge  162  17h  
Facebook  45  47s  converge  101  0.9h  converge  155  16h
p2p  69  345s  converge  132  5h  converge  184  47h  
WikiVote  72  220s  converge  171  11h  converge  150  8h 
8 Conclusion and Future Work
In this paper, we have presented the Competitive Influence Maximization problem from a higher level of abstraction, where the parties first identify the influential nodes of the network and then compete over these nodes with their advertising packages. We have also considered the action space to be discrete, which distinguishes our work from previous studies and is more realistic. We propose an efficient framework, with a novel payoff estimation method, that calculates the Nash equilibrium. Our framework targets the two-player case and is fully applicable when the players have different budgets and different sets of advertising packages. Through several experiments, we evaluate each aspect of our work. In particular, we show that any non-Nash strategy badly loses the game to its best response, whereas our proposed framework is guaranteed not to lose any competition in the symmetric case. This is the critical superiority of our method.
Our framework is mainly built upon the TIM+ algorithm; however, any other RR-set based method, like IMM, can easily be plugged into it. Also, since the concept of RR sets applies to the Linear Threshold and Triggering models, our payoff estimation method can be extended to these diffusion models.
As future work, we first intend to extend the framework to more than two players. This will not be easy since, for example, the Minimax theorem does not hold in that case. Also, we have so far not been able to provide a theoretical approximation guarantee for our payoff estimation method, and further work is required in that direction. Finally, analyzing other extensions that make the problem more realistic, like considering the tendency of influential customers, is another direction for future work.
References
 [1] David Kempe, Jon Kleinberg, and Éva Tardos. Maximizing the spread of influence through a social network. In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 137–146. ACM, 2003.
 [2] Matthew Richardson and Pedro Domingos. Mining knowledge-sharing sites for viral marketing. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 61–70. ACM, 2002.
 [3] Shishir Bharathi, David Kempe, and Mahyar Salek. Competitive influence maximization in social networks. In International workshop on web and internet economics, pages 306–311. Springer, 2007.
 [4] Tim Carnes, Chandrashekhar Nagarajan, Stefan M Wild, and Anke Van Zuylen. Maximizing influence in a competitive social network: a follower’s perspective. In Proceedings of the ninth international conference on Electronic commerce, pages 351–360. ACM, 2007.
 [5] Arastoo Bozorgi, Saeed Samet, Johan Kwisthout, and Todd Wareham. Community-based influence maximization in social networks under a competitive linear threshold model. Knowledge-Based Systems, 134:149–158, 2017.
 [6] Huijuan Li, Li Pan, and Peng Wu. Dominated competitive influence maximization with time-critical and time-delayed diffusion in social networks. Journal of Computational Science, 28:318–327, 2018.
 [7] Ruidong Yan, Yuqing Zhu, Deying Li, and Zilong Ye. Minimum cost seed set for threshold influence problem under competitive models. World Wide Web, pages 1–20, 2018.
 [8] Su-Chen Lin, Shou-De Lin, and Ming-Syan Chen. A learning-based framework to handle multi-round multi-party influence maximization on social networks. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 695–704. ACM, 2015.
 [9] Hui Li, Sourav S Bhowmick, Jiangtao Cui, Yunjun Gao, and Jianfeng Ma. Getreal: Towards realistic selection of influence maximization strategies in competitive networks. In Proceedings of the 2015 ACM SIGMOD international conference on management of data, pages 1525–1537. ACM, 2015.
 [10] Khurshed Ali, Chih-Yu Wang, and Yi-Shin Chen. Boosting reinforcement learning in competitive influence maximization with transfer learning. In 2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI), pages 395–400. IEEE, 2018.
 [11] Khurshed Ali, Chih-Yu Wang, and Yi-Shin Chen. A novel nested Q-learning method to tackle time-constrained competitive influence maximization. IEEE Access, 7:6337–6352, 2018.
 [12] Antonia Maria Masucci and Alonso Silva. Strategic resource allocation for competitive influence in social networks. In 2014 52nd Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 951–958. IEEE, 2014.
 [13] Antonia Maria Masucci and Alonso Silva. Advertising competitions in social networks. In 2017 American Control Conference (ACC), pages 4619–4624. IEEE, 2017.
 [14] Vineeth S Varma, Irinel-Constantin Morărescu, Samson Lasaulce, and Samuel Martin. Marketing resource allocation in duopolies over social networks. IEEE Control Systems Letters, 2(4):593–598, 2018.
 [15] Vineeth S Varma, Samson Lasaulce, Julien Mounthanyvong, and Irinel-Constantin Morărescu. Allocating marketing resources over social networks: A long-term analysis. IEEE Control Systems Letters, 2019.
 [16] H Brendan McMahan, Geoffrey J Gordon, and Avrim Blum. Planning in the presence of cost functions controlled by an adversary. In Proceedings of the 20th International Conference on Machine Learning (ICML-03), pages 536–543, 2003.
 [17] Wei Chen, Chi Wang, and Yajun Wang. Scalable influence maximization for prevalent viral marketing in large-scale social networks. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1029–1038. ACM, 2010.
 [18] Christian Borgs, Michael Brautbar, Jennifer Chayes, and Brendan Lucier. Maximizing social influence in nearly optimal time. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms, pages 946–957. SIAM, 2014.
 [19] Youze Tang, Xiaokui Xiao, and Yanchen Shi. Influence maximization: Near-optimal time complexity meets practical efficiency. In Proceedings of the 2014 ACM SIGMOD international conference on Management of data, pages 75–86. ACM, 2014.
 [20] Shimon Ben-Ishay, Alon Sela, and Irad Ben-Gal. "Spread-it": A strategic game of competitive diffusion through social networks. IEEE Transactions on Games, 2018.
 [21] Youze Tang, Yanchen Shi, and Xiaokui Xiao. Influence maximization in near-linear time: A martingale approach. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pages 1539–1554. ACM, 2015.
 [22] Wei Lu, Wei Chen, and Laks VS Lakshmanan. From competition to complementarity: comparative influence diffusion and maximization. Proceedings of the VLDB Endowment, 9(2):60–71, 2015.
 [23] Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, and Natalie Glance. Cost-effective outbreak detection in networks. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 420–429. ACM, 2007.
 [24] Amit Goyal, Wei Lu, and Laks VS Lakshmanan. CELF++: Optimizing the greedy algorithm for influence maximization in social networks. In Proceedings of the 20th international conference companion on World wide web, pages 47–48. ACM, 2011.
 [25] Yoav Shoham and Kevin Leyton-Brown. Multiagent systems: Algorithmic, game-theoretic, and logical foundations. Cambridge University Press, 2008.
 [26] Hans Peters. Game theory: A multi-leveled approach. Springer, 2015.