With the wide popularity of social media and social network sites such as Facebook, Twitter, and WeChat, social networks have become a powerful platform for spreading information among individuals. Influential users therefore play an important role in a social network. Motivated by this background, influence diffusion in social networks has been extensively studied [9, 17, 4]. Most previous work focuses on identifying influential nodes. To the best of our knowledge, there is no study of the “stability” of influential nodes (the seed set) when they are treated as a coalition.
Consider the following scenario. A group of influential people in a social network are considering forming a coalition so that they can better serve many advertisers through viral marketing in the social network. To make the coalition stable, we need to design a fair profit allocation scheme among the members of the coalition, such that no individual or subset of members has an incentive to deviate from the coalition because they consider their allocation unfair and believe they could earn more by forming an alliance of their own. A useful and mature framework for studying such incentives for stable coalition formation is cooperative game theory, and in particular the coreness (the core and its related concepts) of cooperative games [7, 20].
First we motivate our consideration of truncated submodular functions. In the above social influence scenario, the typical way of measuring the contribution of any set $S$ of influential people is by its influence spread function $\sigma(S)$, which measures the expected number of people in the social network that could be influenced by $S$ under some stochastic diffusion model. Extensive research has been done on stochastic diffusion models, and it has been shown that under a large class of models $\sigma$ is both monotone and submodular [17, 22, 4]. (A set function $f$ is monotone if $f(S)\le f(T)$ for all $S\subseteq T$, and is submodular if $f(S\cup\{v\})-f(S)\ge f(T\cup\{v\})-f(T)$ for all $S\subseteq T$ and $v\notin T$.) However, the advertisers would only be interested in the coalition as a viral marketing platform when the influence spread reaches a certain scale. In other words, the coalition can only receive profit after the influence spread is above a certain scale threshold $\eta$. Therefore, the true profit function for the coalition is $\sigma(S)$ when $\sigma(S)\ge\eta$, and $0$ otherwise. We call such functions truncated submodular functions.
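As a minimal sketch of the truncation, assume the underlying monotone submodular function is a simple coverage function (each player reaches a fixed set of people). The names `coverage`, `truncate`, `reach` and the threshold value are hypothetical and chosen only for illustration; the comparison `>=` at the threshold is also an assumption.

```python
def coverage(reach):
    """Return f(S) = number of people reached by at least one player in S.

    Coverage functions are monotone and submodular; `reach` is a
    hypothetical mapping from player to the set of people they reach.
    """
    def f(S):
        covered = set()
        for player in S:
            covered |= reach[player]
        return float(len(covered))
    return f


def truncate(f, eta):
    """Return the truncated profit: f(S) if f(S) >= eta, and 0 otherwise."""
    def gamma(S):
        value = f(S)
        return value if value >= eta else 0.0
    return gamma


# Hypothetical toy instance with three influential players.
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}}
f = coverage(reach)
gamma = truncate(f, eta=4)

# Truncation can break submodularity: adding "b" to the empty set gains
# nothing, but adding "b" to {"a"} lifts the coalition over the threshold.
gain_on_empty = gamma({"b"}) - gamma(set())
gain_on_a = gamma({"a", "b"}) - gamma({"a"})
```

In this toy instance the marginal gain of player `b` is larger on the bigger set, which is why results for submodular games do not carry over directly to the truncated profit function.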
Both submodularity and scale effects are common in economic behavior beyond the above example of viral marketing in social networks. Therefore, it is reasonable to consider truncated submodular functions as profit functions. In this paper, we study the computational issues related to the coreness of cooperative games with truncated submodular profit functions.
Solution Concepts in Cooperative Games. A cooperative game $\Gamma=(V,\gamma)$ consists of a player set $V$ and a profit function $\gamma:2^V\to\mathbb{R}$ with $\gamma(\emptyset)=0$. A subset of players $S\subseteq V$ is called a coalition and $V$ is called the grand coalition. For each coalition $S$, $\gamma(S)$ represents the profit obtained by $S$ without the help of other players. An allocation over the players is denoted by a vector $x\in\mathbb{R}^{|V|}$ whose components are one-to-one associated with the players in $V$, where $x_i$ is the value received by player $i$ under allocation $x$. For any player set $S\subseteq V$, we use the shorthand notation $x(S)=\sum_{i\in S}x_i$. A set of all allocations satisfying some specific requirements is called a solution concept.
The core [12, 25] is one of the earliest and most attractive solution concepts that directly addresses the issue of stability. The core of a game is the set of allocations ensuring that no coalition would have an incentive to split from the grand coalition and do better on its own. More precisely, the core of a game $\Gamma$ (denoted by $\mathcal{C}(\Gamma)$) is the following set of allocations: $\mathcal{C}(\Gamma)=\{x\in\mathbb{R}^{|V|} : x(V)=\gamma(V),\ x(S)\ge\gamma(S)\ \forall S\subseteq V\}$. In practice, the core is very strict and may even be empty in some cases. When $\mathcal{C}(\Gamma)$ is empty, some coalitions must be dissatisfied, since they can obtain more benefit if they leave the grand coalition and work as a separate team. In this case, we use the dissatisfaction degree (or dissatisfaction value), defined as $\max\{\gamma(S)-x(S),0\}$, to capture the instability of player set $S$ with respect to the allocation $x$. Then, the overall stability of the game can be measured as either the worst-case or average-case dissatisfaction degree, for which we consider the following three versions.
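The definition of the core can be checked directly on small games. The following is a brute-force sketch that enumerates all coalitions, so it is exponential in the number of players and only meant for illustration; the function names and the toy game are hypothetical.

```python
from itertools import combinations


def nonempty_coalitions(players):
    """Yield every non-empty coalition as a frozenset (exponentially many)."""
    players = list(players)
    for size in range(1, len(players) + 1):
        for members in combinations(players, size):
            yield frozenset(members)


def in_core(players, gamma, x, tol=1e-9):
    """Check x(V) = gamma(V) and x(S) >= gamma(S) for every coalition S."""
    players = list(players)
    grand = frozenset(players)
    if abs(sum(x[i] for i in players) - gamma(grand)) > tol:
        return False
    return all(
        sum(x[i] for i in S) >= gamma(S) - tol
        for S in nonempty_coalitions(players)
    )


def toy_game(S):
    """Hypothetical truncated game: profit |S| once |S| >= 2, else 0."""
    return float(len(S)) if len(S) >= 2 else 0.0


fair = {1: 1.0, 2: 1.0, 3: 1.0}      # every pair receives its profit of 2
greedy = {1: 3.0, 2: 0.0, 3: 0.0}    # coalition {2, 3} would deviate
```

Here `in_core([1, 2, 3], toy_game, fair)` holds, while `greedy` fails because coalition `{2, 3}` earns 2 on its own but is allocated 0.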
The first one is the relative least-core value (RLCV), which reflects the relative stability, i.e., the minimum value of the maximum proportional difference between the profits and the payoffs among all coalitions.
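Definition 1. Given a cooperative game $\Gamma=(V,\gamma)$, the relative least-core value of $\Gamma$ ($RLCV(\Gamma)$) is $\min_{x}\max_{S\subseteq V:\ \gamma(S)>0}\frac{\gamma(S)-x(S)}{\gamma(S)}$, where the minimum is taken over nonnegative allocations $x$ with $x(V)=\gamma(V)$. Technically, $RLCV(\Gamma)$ is the optimal value of the following linear program:
$$\begin{aligned}\min\ & r\\ \text{s.t.}\ & x(S)\ge(1-r)\,\gamma(S), && \forall S\subseteq V,\\ & x(V)=\gamma(V),\\ & x_i\ge 0, && \forall i\in V.\end{aligned}$$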
The second one is the absolute least-core value (ALCV), which reflects the absolute stability, i.e., the minimum value of the maximum difference between the profits and the payoffs among all coalitions. The formal definition is as follows.
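Definition 2. Given a cooperative game $\Gamma=(V,\gamma)$, the absolute least-core value of $\Gamma$ ($ALCV(\Gamma)$) is $\min_{x}\max_{S\subseteq V}\big(\gamma(S)-x(S)\big)$, where the minimum is taken over nonnegative allocations $x$ with $x(V)=\gamma(V)$. Technically, $ALCV(\Gamma)$ is the optimal value of the following linear program:
$$\begin{aligned}\min\ & \varepsilon\\ \text{s.t.}\ & x(S)\ge\gamma(S)-\varepsilon, && \forall S\subseteq V,\\ & x(V)=\gamma(V),\\ & x_i\ge 0, && \forall i\in V.\end{aligned}$$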
The above two classical least-core values capture stability from the perspective of the most dissatisfied coalition, i.e., the worst case. Sometimes the worst case is too extreme to reflect the real stability. Thus, we introduce the least average dissatisfaction value (LADV), which reflects the minimum value of the average dissatisfaction degree among all coalitions.
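Definition 3. Given a cooperative game $\Gamma=(V,\gamma)$, the least average dissatisfaction value of $\Gamma$ ($LADV(\Gamma)$) is $\min_{x}\frac{1}{2^{|V|}}\sum_{S\subseteq V}\max\{\gamma(S)-x(S),0\}$, where the minimum is taken over nonnegative allocations $x$ with $x(V)=\gamma(V)$. Technically, $LADV(\Gamma)$ is the optimal value of the following program, which can be written as a linear program by introducing one auxiliary variable per coalition:
$$\begin{aligned}\min\ & \frac{1}{2^{|V|}}\sum_{S\subseteq V}\max\{\gamma(S)-x(S),0\}\\ \text{s.t.}\ & x(V)=\gamma(V),\\ & x_i\ge 0, && \forall i\in V.\end{aligned}$$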
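To make the three stability measures concrete, the following brute-force sketch evaluates them for one fixed allocation; minimizing each quantity over all allocations would give the corresponding least value. It assumes the forms of the measures used here (relative excess $(\gamma(S)-x(S))/\gamma(S)$ over coalitions with positive profit, absolute excess $\gamma(S)-x(S)$, and the average of $\max\{\gamma(S)-x(S),0\}$ over all $2^{|V|}$ subsets); function names and the toy game are hypothetical.

```python
from itertools import combinations


def nonempty_coalitions(players):
    """Yield every non-empty coalition as a frozenset (exponentially many)."""
    players = list(players)
    for size in range(1, len(players) + 1):
        for members in combinations(players, size):
            yield frozenset(members)


def stability_measures(players, gamma, x):
    """Return (worst relative, worst absolute, average) dissatisfaction of x."""
    players = list(players)
    worst_rel = float("-inf")   # stays -inf if no coalition has positive profit
    worst_abs = float("-inf")
    total = 0.0
    for S in nonempty_coalitions(players):
        excess = gamma(S) - sum(x[i] for i in S)
        worst_abs = max(worst_abs, excess)
        if gamma(S) > 0:
            worst_rel = max(worst_rel, excess / gamma(S))
        total += max(excess, 0.0)
    # The empty coalition contributes 0, so dividing by 2^n averages over
    # all subsets (an assumption about the normalization).
    return worst_rel, worst_abs, total / 2 ** len(players)


def majority(S):
    """Hypothetical truncated game: profit 1 once at least two players join."""
    return 1.0 if len(S) >= 2 else 0.0


equal_split = {1: 1 / 3, 2: 1 / 3, 3: 1 / 3}
rel, absolute, average = stability_measures([1, 2, 3], majority, equal_split)
```

For the equal split, every pair earns 1 on its own but receives 2/3, so both worst-case measures equal 1/3 and the average dissatisfaction is 3 · (1/3) / 8 = 1/8.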
In this paper, we consider the following computational problems in the context of truncated submodular functions: (a) Is the core of a given cooperative game empty? (b) How can we find an allocation in the core if the core is not empty? (c) If the core is empty, how can we compute the relative least-core value, the absolute least-core value and the least average dissatisfaction value of the game?
Contributions. We study the coreness (solution concepts related to the core) of the truncated submodular profit cooperative game $\Gamma$. We consider computational properties of the core, the relative least-core value, the absolute least-core value and the least average dissatisfaction value of $\Gamma$, which are denoted by $\mathcal{C}(\Gamma)$, $RLCV(\Gamma)$, $ALCV(\Gamma)$ and $LADV(\Gamma)$, respectively.
We first prove that checking the non-emptiness of $\mathcal{C}(\Gamma)$ can be done in polynomial time. Moreover, we can find an allocation in the core if the core is not empty. Next, we consider the case when the core is empty. For the problem of computing the relative least-core value (RLCV), we show that it is NP-hard in general, but when the truncation threshold $\eta=0$, there is a polynomial time algorithm. Along the way, we also find an interesting partial result showing that there is no polynomial time separation oracle for RLCV's linear program unless P=NP, which is of independent interest since it reveals close connections with a new class of combinatorial problems. For the absolute least-core value problem (ALCV), we prove that finding $ALCV(\Gamma)$ is APX-hard even when $f$ is defined as the influence spread under the classical independent cascade (IC) model in a social network. We also prove that there exists a polynomial time algorithm that guarantees an additive-term approximation. Finally, for the least average dissatisfaction value problem (LADV), we show that we can use the stochastic gradient descent algorithm to compute $LADV(\Gamma)$ to an arbitrarily small additive error.
Related Work. Cooperative game theory is a branch of (micro-)economics that studies the behavior of self-interested agents in strategic settings where binding agreements between agents are possible. Numerous classical studies of cooperative games provide a rich mathematical framework for solving issues related to cooperation in multi-agent systems [8, 16, 6]. Prior work studies the approximation of the absolute least-core value of supermodular cost cooperative games; the results of that work can be generalized to submodular profit cooperative games. An important application of our study is to analyze the stability of influential people in social networks. Almost all existing studies focus on selecting the seed set [5, 13, 26]. To the best of our knowledge, no prior work considers the stability of the selected seed set. We utilize cooperative game theory to analyze the stability of the seed set, and generalize it to a generic cooperative game with truncated submodular functions. The truncation operation represents the “threshold effect”, which has been studied widely in the literature [14, 1].
2 Model and Problems
2.1 Cooperative Games with Truncated Submodular Profit Functions
A truncated submodular profit cooperative game is denoted by $\Gamma=(V,\gamma)$. In $\Gamma$, $V$ is the player set and $\gamma$ is the profit function, which is defined as follows:
$$\gamma(S)=\begin{cases}f(S), & \text{if } f(S)\ge\eta,\\ 0, & \text{otherwise.}\end{cases}$$
Note that $f$ is a nonnegative monotone increasing submodular function with $f(\emptyset)=0$, and $\eta$ is a nonnegative threshold. For clarity, in the rest of this paper, a truncated submodular profit cooperative game is denoted by the triple $\Gamma=(V,f,\eta)$.
Note that the explicit representation of $f$ might be exponential in the size of $V$. The standard way to bypass this difficulty is to assume that $f$ is given by a value oracle.
2.2 Computational Problems on the Coreness
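Given a truncated submodular profit cooperative game $\Gamma=(V,f,\eta)$, we focus on the following problems:
CORE: Is $\mathcal{C}(\Gamma)=\emptyset$, and how can we find an allocation in $\mathcal{C}(\Gamma)$ when $\mathcal{C}(\Gamma)\neq\emptyset$?
ALCV: When $\mathcal{C}(\Gamma)=\emptyset$, how can we compute $ALCV(\Gamma)$?
RLCV: When $\mathcal{C}(\Gamma)=\emptyset$, how can we compute $RLCV(\Gamma)$?
LADV: When $\mathcal{C}(\Gamma)=\emptyset$, how can we compute $LADV(\Gamma)$?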
Before we analyze the above problems, we introduce a specific instance of truncated submodular profit cooperative game (see Section 2.3).
2.3 Influence Cooperative Game
As described in the introduction, an important motivation of our model is influence in social networks. In this section, we introduce a specific instance of the truncated submodular profit cooperative game, the influence cooperative game.
Social graph. A social graph is a directed graph $G$ given by a vertex set, an edge set, and an influence probability on each edge. Within the vertex set, we distinguish the set of influential people and the set of target people in $G$.
Influence diffusion model. The information diffusion process follows the independent cascade (IC) model. In the IC model, discrete time steps are used to model the diffusion process. Each node in $G$ has two states: inactive or active. At step 0, nodes in the seed set $S$ are active and other nodes are inactive. For any step $t\ge 1$, if a node $u$ is newly active at step $t-1$, $u$ has a single chance to influence each of its inactive out-neighbors $v$, independently with the influence probability of edge $(u,v)$, to make $v$ active. Once a node becomes active, it never returns to the inactive state. The diffusion process stops when there are no newly active nodes at a time step. For any seed set $S$, we use $\sigma(S)$ to denote the influence spread of $S$, i.e., the expected number of activated target nodes from seed set $S$ at the end of an IC diffusion. The influence spread $\sigma$ is known to be a monotone submodular function.
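The diffusion process described above can be sketched as a Monte Carlo simulation. This is only an illustrative estimator under assumed data structures: the graph is a hypothetical adjacency mapping from a node to a list of `(out_neighbor, probability)` pairs, and the estimate is an average over a fixed number of simulated cascades rather than an exact expectation.

```python
import random


def simulate_ic(graph, seeds, rng):
    """Run one independent cascade and return the set of active nodes."""
    active = set(seeds)
    frontier = list(active)               # nodes newly activated in the last step
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, prob in graph.get(u, []):
                # u gets a single chance to activate each inactive out-neighbor.
                if v not in active and rng.random() < prob:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def influence_spread(graph, seeds, targets, runs=1000, seed=0):
    """Estimate the expected number of activated target nodes."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        total += len(simulate_ic(graph, seeds, rng) & set(targets))
    return total / runs


# Hypothetical two-layer instance: players "a", "b" and targets "t1".."t3".
toy_graph = {
    "a": [("t1", 1.0), ("t2", 1.0)],
    "b": [("t2", 1.0), ("t3", 0.0)],
}
toy_targets = {"t1", "t2", "t3"}
```

With probabilities restricted to 0 and 1 as in `toy_graph`, the cascade is deterministic, so the estimate coincides with the exact spread; for general probabilities the estimate only concentrates around $\sigma(S)$ as `runs` grows.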
An influence cooperative game is a special form of the truncated submodular profit cooperative game, with the set of influential people as the player set and the truncation of the influence spread function as the profit function.
In the rest of this paper, we analyze the problems defined in Section 2.2 one by one. Note that our positive results (properties and algorithms) apply to all truncated submodular profit cooperative games, including the influence cooperative game. Our hardness results are established for influence cooperative games, so they are stronger than hardness results for general truncated submodular cooperative games.
3 Computing Core
We start by considering the core of $\Gamma$ ($\mathcal{C}(\Gamma)$). In $\Gamma$, we say a player $i$ is a veto player if $\gamma(S)=0$ for any $S\subseteq V\setminus\{i\}$. That is to say, a successful coalition must include all veto players.
Lemma 1. $\mathcal{C}(\Gamma)\neq\emptyset$ if and only if:
(i) There exists at least one veto player in $\Gamma$, or
(ii) $f(S)=\sum_{i\in S}f(\{i\})$, for any $S\subseteq V$.
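Proof. Suppose the player set of $\Gamma$ is $V=\{1,2,\dots,n\}$. We first prove the sufficiency of Lemma 1. On one hand, suppose $i$ is a veto player of $\Gamma$; then we can find a trivial allocation $x$ in $\mathcal{C}(\Gamma)$: $x_i=\gamma(V)$ and $x_j=0$, $\forall j\neq i$. On the other hand, $x$ ($x_i=f(\{i\})$, $\forall i\in V$) is an allocation in $\mathcal{C}(\Gamma)$ if $f(S)=\sum_{i\in S}f(\{i\})$ for any $S\subseteq V$.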
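Now we prove the necessity. Suppose $\mathcal{C}(\Gamma)\neq\emptyset$ and $x\in\mathcal{C}(\Gamma)$. Let $m_i=f(V)-f(V\setminus\{i\})$, where $m_i$ is the marginal contribution of player $i$. If there is no veto player, then for any $i\in V$, $f(V\setminus\{i\})\ge\eta$ since $f$ is monotone. Thus, $\gamma(V\setminus\{i\})=f(V\setminus\{i\})$, $\forall i\in V$. Note that $\sum_{i\in V}m_i\le f(V)$ since $f$ is submodular. By the definition of the core, for any $i\in V$, we have $x(V\setminus\{i\})\ge\gamma(V\setminus\{i\})$. That is, $x_i\le m_i$, $\forall i\in V$.
Summing up these inequalities for all $i\in V$, we have $f(V)=x(V)\le\sum_{i\in V}m_i\le f(V)$. Hence $\sum_{i\in V}m_i=f(V)$, which for a submodular $f$ implies $f(S)=\sum_{i\in S}f(\{i\})$ for any $S\subseteq V$.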
Theorem 1. Deciding whether $\mathcal{C}(\Gamma)$ is empty can be done in polynomial time, and an allocation in $\mathcal{C}(\Gamma)$ can be computed in polynomial time if $\mathcal{C}(\Gamma)$ is not empty.
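Proof. [Sketch] First, by Lemma 1, it takes polynomial time to check the non-emptiness of $\mathcal{C}(\Gamma)$. When $\mathcal{C}(\Gamma)$ is not empty, an allocation in the core assigns the whole profit $\gamma(V)$ to a veto player when one exists, and assigns $x_i=f(\{i\})$ to every player $i$ when condition (ii) of Lemma 1 is satisfied. The detailed proof of Theorem 1 is given in the appendix.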
4 Computing Relative Least-Core Value
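From Lemma 1, $\mathcal{C}(\Gamma)$ may be empty in many cases. It is obvious that $RLCV(\Gamma)=0$ if $\mathcal{C}(\Gamma)\neq\emptyset$ and $RLCV(\Gamma)>0$ otherwise. In this section, we study computational properties of the RLCV problem. The linear program corresponding to $RLCV(\Gamma)$ (LP-RLCV) is as follows:
$$\begin{aligned}\min\ & r\\ \text{s.t.}\ & x(S)\ge(1-r)\,\gamma(S), && \forall S\subseteq V,\\ & x(V)=\gamma(V),\\ & x_i\ge 0, && \forall i\in V.\end{aligned}$$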
A special case of computing $RLCV(\Gamma)$ is when $\eta=0$. It captures the scenario in which the profit of any coalition exactly equals its influence spread in the influence cooperative game. In Theorem 2 we show that, although there are exponentially many constraints, LP-RLCV can be solved in polynomial time when $\eta=0$ by providing a polynomial time separation oracle. A separation oracle for a linear program is an algorithm that, given a putative feasible solution, checks whether it is indeed feasible, and if not, outputs a violated constraint. It is known that a linear program can be solved in polynomial time by the ellipsoid method as long as it has a polynomial time separation oracle.
Theorem 2. There exists a polynomial time separation oracle for LP-RLCV when $\eta=0$. Therefore, RLCV can be solved in polynomial time when $\eta=0$.
Proof. Given any candidate solution $(x,r)$ of LP-RLCV, we need to either assert that $(x,r)$ is a feasible solution or find a constraint of LP-RLCV that $(x,r)$ violates. Note that checking $x(V)=\gamma(V)$ and $x_i\ge 0$ ($\forall i\in V$) can be done in polynomial time. Thus, we only need to check whether $x(S)\ge(1-r)\,\gamma(S)$ holds for all $S\subseteq V$.
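An important property is that $f(S)/x(S)$ achieves its maximum value when $S$ contains only a single player. This is because $\frac{f(S)}{x(S)}\le\frac{\sum_{i\in S}f(\{i\})}{\sum_{i\in S}x_i}\le\max_{i\in S}\frac{f(\{i\})}{x_i}$. The first inequality is due to the submodularity of $f$ and the second inequality is due to $\frac{a+b}{c+d}\le\max\{\frac{a}{c},\frac{b}{d}\}$, $\forall a,b\ge 0$ and $c,d>0$. Thus, the exponential number of constraints can be reduced to the constraints on single players, which directly gives a polynomial time separation oracle for LP-RLCV.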
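The argument above suggests a very simple separation oracle for the case $\eta=0$. The sketch below assumes that LP-RLCV has the constraints $x(V)=f(V)$, $x_i\ge 0$ and $x(S)\ge(1-r)f(S)$ for all $S$; under that assumption, checking the singleton coalitions is enough. The function name and the return convention (a `(kind, witness)` pair, or `None` when feasible) are hypothetical.

```python
def separation_oracle(players, f, x, r, tol=1e-9):
    """Return None if (x, r) is feasible, else a violated constraint.

    Assumes eta = 0, so the profit function equals the monotone
    submodular function f (given as a value oracle on frozensets).
    """
    players = list(players)
    for i in players:
        if x[i] < -tol:
            return ("nonnegativity", i)
    if abs(sum(x[i] for i in players) - f(frozenset(players))) > tol:
        return ("efficiency", None)
    # With x >= 0 and f subadditive, a violated coalition constraint
    # implies a violated singleton constraint, so n checks suffice.
    for i in players:
        single = frozenset([i])
        if x[i] < (1 - r) * f(single) - tol:
            return ("coalition", single)
    return None
```

The oracle uses only $n+1$ value queries. It relies on nonnegativity being checked first: when $r>1$ the coalition constraints hold trivially for nonnegative $x$, and when $r\le 1$ subadditivity of $f$ gives $x(S)=\sum_{i\in S}x_i\ge(1-r)\sum_{i\in S}f(\{i\})\ge(1-r)f(S)$.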
When $\eta=0$, RLCV can be solved in polynomial time mainly because the most dissatisfied coalition is a single player. However, when $\eta>0$, it becomes intractable to find the most dissatisfied coalition.
Theorem 3. There is no polynomial time separation oracle for LP-RLCV for some $\eta>0$, unless P=NP.
Theorem 3 does not imply the NP-hardness of RLCV. However, the proof of Theorem 3 reveals an interesting connection between the RLCV problem and a series of well-defined combinatorial problems. We give the proof of Theorem 3 and the generalized combinatorial problems in the appendix.
In the rest of this section, we prove the NP-hardness of RLCV, a stronger hardness result than that in Theorem 3.
Theorem 4. It is NP-hard to compute $RLCV(\Gamma)$, even for the influence cooperative game.
Proof. [Sketch] We construct a reduction from the SAT problem. A boolean formula is in conjunctive normal form (CNF) if it is expressed as an AND of clauses, each of which is the OR of one or more literals. The SAT problem is defined as follows: given a CNF formula , determine whether has a satisfying assignment. Let be a CNF formula with clauses , over literals . Without loss of generality, we set .
We construct a social graph as follows:
The graph is a tripartite graph (see the sketch in Figure 1). In the first layer (), there are two nodes and corresponding to each , dummy nodes labeled as and dummy nodes labeled as . In the second layer (), there are two nodes and corresponding to each , one node for each and a dummy node . The third layer () contains only node . Edges exist only between adjacent layers. For each , sends an edge to every node in clause contains literal . Similarly, for each , sends an edge to every node in clause contains literal . The probabilities on edges sent from and are 1. There is an edge with influence probability 1 from to for any and edges from to . There is an edge from to with influence probability for any . There also exists an edge from to with influence probability for any . The remaining edges are from to all nodes in the second layer. The influence probability on edge is and all other probabilities on edges sent from are 1. The influence cooperative game defined on is . For convenience, we set .
Suppose is the optimal value of the relative least-core value problem of . We can prove that if is satisfiable and if is unsatisfiable. The proof of this part is given in the appendix.
5 Computing Absolute Least-Core Value
5.1 Hardness of ALCV
Theorem 5. The ALCV problem for the influence cooperative game cannot be approximated within 1.139 under the unique games conjecture.
Proof. [Sketch] We construct a reduction from the MAX-CUT problem. Under our construction, for any instance of the MAX-CUT problem, we can construct an instance of the ALCV problem such that the optimal values of the two instances are equal. The detailed proof is given in the appendix.
In this section, we approximate $ALCV(\Gamma)$ by approximating the following linear program (LP-PRIME):
The intractability of LP-PRIME lies in the exponential number of constraints and the hardness of identifying all successful coalitions. We use a relaxed version, LP-RE, and a strengthened version, LP-STR, of LP-PRIME to design an approximation algorithm for $ALCV(\Gamma)$. LP-RE and LP-STR are formally defined in (5) and (6), respectively.
Intuitively, LP-RE and LP-STR denote the absolute least-core values of two cooperative games with new profit functions. Specifically, LP-RE relaxes the constraints in LP-PRIME by reducing the profits of all successful coalitions except $V$ to $\eta$. Formally, the profit function in LP-RE assigns $f(V)$ to $V$ and, for any $S\subset V$, assigns $\eta$ to $S$ if $f(S)\ge\eta$ and $0$ otherwise. The profit function in LP-STR assigns $f(S)$ to every $S\subseteq V$. Clearly, LP-STR strengthens LP-PRIME by increasing the profits of all unsuccessful coalitions.
Our main result in this section is shown in Theorem 6.
Theorem 6. , there exists an approximation algorithm for the problem with running time in , which outputs such that ,.
Lemma 2. Suppose the optimal values of LP-PRIME, LP-RE and LP-STR are , and , respectively. Then, we have
Lemma 3. There exists a polynomial time approximation algorithm for LP-STR outputting such that .
Lemma 4. , there exists an algorithm for LP-RE outputting such that , with running time in .
6 Computing Least Average Dissatisfaction Value
Based on Definition 3, $LADV(\Gamma)$ equals the optimal value of the following linear program:
Here the dissatisfaction of a coalition $S$ under $x$ is $\gamma(S)-x(S)$ if $\gamma(S)\ge x(S)$ and $0$ otherwise. There are exponentially many terms in the objective; however, we can utilize the stochastic gradient algorithm to approximate the optimal solution of (9). This is because the objective function is convex (Lemma 5) and the feasible region of (9) is a convex set.
Lemma 5. The objective function of (9) is a convex function.
Theorem 7. , if and in Algorithm 1.
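The stochastic gradient idea can be sketched as follows. This is not a reproduction of Algorithm 1; it is a generic projected stochastic subgradient method under the assumptions that the objective is the average of $\max\{\gamma(S)-x(S),0\}$ over all $2^{|V|}$ subsets, that the feasible set is $\{x\ge 0,\ x(V)=\gamma(V)\}$, and that the step size decays as $1/\sqrt{t}$. All function names and parameters are hypothetical.

```python
import random


def project_to_scaled_simplex(y, total):
    """Euclidean projection of y onto {x >= 0, sum(x) = total}."""
    if total <= 0:
        return [0.0] * len(y)
    cumulative, theta = 0.0, 0.0
    for k, value in enumerate(sorted(y, reverse=True), start=1):
        cumulative += value
        candidate = (cumulative - total) / k
        if value - candidate > 0:       # holds exactly for the first rho entries
            theta = candidate
    return [max(v - theta, 0.0) for v in y]


def ladv_sgd(players, gamma, steps=2000, step_size=0.05, seed=0):
    """Approximate a minimizer of the average dissatisfaction."""
    rng = random.Random(seed)
    players = list(players)
    n = len(players)
    total = gamma(frozenset(players))
    x = [total / n] * n                 # start from the equal split
    averaged = [0.0] * n
    for t in range(1, steps + 1):
        # Sample a coalition uniformly from all 2^n subsets.
        mask = [rng.random() < 0.5 for _ in players]
        S = frozenset(p for p, m in zip(players, mask) if m)
        x_S = sum(xi for xi, m in zip(x, mask) if m)
        if gamma(S) > x_S:
            # A subgradient of max(gamma(S) - x(S), 0) is -1 on members of S,
            # so the descent step raises the payoff of the sampled members.
            eta_t = step_size / t ** 0.5
            x = project_to_scaled_simplex(
                [xi + eta_t * m for xi, m in zip(x, mask)], total
            )
        averaged = [a + (xi - a) / t for a, xi in zip(averaged, x)]
    return dict(zip(players, averaged))
```

The averaged iterate is returned because that is the quantity for which standard convergence guarantees of subgradient methods are usually stated; since the feasible set is convex, the average of feasible iterates is itself feasible.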
7 Conclusion and future work
In this paper, we study the core-related solution concepts of the truncated submodular profit cooperative game. One possible direction for future work is to change the way of truncating a function. For example, we can set if and otherwise. This setting is a special case of the setting in our paper, and it may be possible to design algorithms tailored to it. In this paper, we prove that computing the relative least-core value is NP-hard. We also prove that the relative least-core value can be computed in polynomial time in a special case. A direct direction for future work is to design an approximation algorithm for RLCV in the general case.
Appendix A Appendix of Section 3
Proof. [Proof of Theorem 1]
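We can design the following polynomial time process to check the emptiness of $\mathcal{C}(\Gamma)$ and find an allocation in $\mathcal{C}(\Gamma)$ when $\mathcal{C}(\Gamma)\neq\emptyset$.
Step 1: Query $f(V\setminus\{i\})$ for all $i\in V$ from the value oracle. If there exists $i$ such that $\gamma(V\setminus\{i\})=0$, go to Step 2; otherwise, go to Step 3.
Step 2: Return $x$ with $x_i=\gamma(V)$ and $x_j=0$ for all $j\neq i$.
Step 3: Query $f(V)$ and $f(\{i\})$ for all $i\in V$ from the value oracle. If $f(V)=\sum_{i\in V}f(\{i\})$, go to Step 4; otherwise, go to Step 5.
Step 4: Return $x$ with $x_i=f(\{i\})$ for all $i\in V$.
Step 5: Assert that $\mathcal{C}(\Gamma)=\emptyset$.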
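The steps above can be sketched in code as follows. The sketch encodes one reading of the procedure: a player $i$ is treated as a veto player when $\gamma(V\setminus\{i\})=0$ (which, by monotonicity of $f$, means every coalition without $i$ has zero profit), and otherwise the core is non-empty exactly when $f(V)=\sum_{i}f(\{i\})$. The function name, the tolerance and the return convention (`None` for an empty core) are hypothetical.

```python
def find_core_allocation(players, f, eta, tol=1e-9):
    """Return an allocation in the core as a dict, or None if the core is empty.

    `f` is a value oracle for a monotone submodular function on frozensets
    with f(empty set) = 0, and `eta` is the truncation threshold.
    """
    players = list(players)
    grand = frozenset(players)

    def gamma(S):
        value = f(S)
        return value if value >= eta else 0.0

    # Steps 1-2: look for a veto player and give that player everything.
    for i in players:
        if gamma(grand - {i}) == 0:
            allocation = {j: 0.0 for j in players}
            allocation[i] = gamma(grand)
            return allocation

    # Steps 3-4: without veto players the core is non-empty only if f is additive.
    singles = {i: f(frozenset([i])) for i in players}
    if abs(sum(singles.values()) - f(grand)) <= tol:
        return singles

    # Step 5: the core is empty.
    return None
```

The procedure uses $2n+1$ value queries in the worst case, which matches the polynomial time claim of Theorem 1.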
Appendix B Appendix of Section 4
Proof. [Proof of Theorem 3] We construct a reduction from the NP-complete problem DOMINANT-SET. Given an undirected graph $G=(V,E)$ and an integer $k$, the DOMINANT-SET problem asks whether there exists a dominant set of $G$ with size no more than $k$. A dominant set is a subset $D\subseteq V$ such that each vertex in $V\setminus D$ is adjacent to at least one vertex in $D$.
Given any instance $(G,k)$ of the DOMINANT-SET problem, we construct a social graph $G'$ as follows: The vertex set of $G'$ is $V\cup V'$, where $V'=\{v' : v\in V\}$ is a copy of $V$. For each node $u\in V$ and $v'\in V'$, there is a directed edge $(u,v')$ in $G'$ if and only if either $(u,v)\in E$ or $u=v$. The influence probability on each edge is 1.
The influence cooperative game defined on is . Thus, the linear programming corresponding to the relative least-core value of is:
Now we prove that the DOMINANT-SET problem can be solved in polynomial time if there exists a polynomial time separation oracle of (10). Given a candidate solution , where for any and . Suppose there exists a polynomial time separation oracle of (10). Then , we can decide whether in polynomial time. Note that is the set of all dominant sets of . Thus, for any ’s dominant set , is a feasible solution if and only if . In other words, having , we can decide whether there exists a dominant set with size no more than .
In Remark 1, we introduce a class of combinatorial optimization problems inspired by the proof of Theorem 3.
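Remark 1. We define an adversarial version of the classical weighted set cover problem: we are given a ground set $U$, a collection $\mathcal{S}$ of subsets of $U$, and a weight budget $B$. The objective of the adversarial weighted set cover problem is to allocate the weight budget among the subsets in $\mathcal{S}$ such that the minimum weight over all set covers is maximized.
Formally, the objective of the adversarial weighted set cover problem is:
$$\max_{w:\ \sum_{T\in\mathcal{S}}w_T=B}\ \ \min_{\mathcal{C}\subseteq\mathcal{S}:\ \bigcup_{T\in\mathcal{C}}T=U}\ \sum_{T\in\mathcal{C}}w_T,$$
where $w$ is a nonnegative allocation vector.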
Similarly, we can define adversarial weighted vertex cover problem, adversarial weighted dominant set problem, and so on.
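As a small illustration of the adversarial objective, the following brute-force sketch evaluates the inner minimization, i.e., the weight of the cheapest set cover under a given weight allocation. It enumerates all sub-collections and is therefore exponential; the function name and the toy instance are hypothetical.

```python
from itertools import combinations


def min_cover_weight(ground, subsets, weights):
    """Return the minimum total weight of a set cover, or None if none exists.

    `subsets` maps a subset name to its elements and `weights` maps the
    same names to nonnegative weights.
    """
    names = list(subsets)
    target = set(ground)
    best = None
    for size in range(1, len(names) + 1):
        for chosen in combinations(names, size):
            covered = set().union(*(subsets[name] for name in chosen))
            if covered >= target:
                weight = sum(weights[name] for name in chosen)
                if best is None or weight < best:
                    best = weight
    return best


# Hypothetical instance with budget 3: every two subsets form a cover.
ground = {1, 2, 3}
subsets = {"A": {1, 2}, "B": {2, 3}, "C": {1, 3}}
uniform = {"A": 1, "B": 1, "C": 1}
concentrated = {"A": 3, "B": 0, "C": 0}
```

On this instance the uniform allocation forces every cover to weigh at least 2, whereas the concentrated allocation leaves the cover `{B, C}` with weight 0, so spreading the budget is better for the allocator.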
The following argument shows that the adversarial weighted set cover (dominant set, vertex cover, etc.) problem is a special instance of RLCV. When , can be denoted more compactly:
Thus, it is enough to compute
Given any instance of set cover problem, similar to the construction in the proof of Theorem 3, it is not difficult to construct a social graph such that equals the number of elements covered by , for any collection . Thus, the adversarial weighted set cover problem is a special instance of RLCV problem.
Proof. [Proof of Theorem 4] We construct a reduction from the SAT problem. A boolean formula is in conjunctive normal form (CNF) if it is expressed as an AND of clauses, each of which is the OR of one or more literals. The SAT problem is defined as follows: given a CNF formula , determine whether has a satisfying assignment. Let be a CNF formula with clauses , over literals . Without loss of generality, we set .
We construct a social graph as follows:
The graph is a tripartite graph (see the sketch in Figure 2). In the first layer (), there are two nodes and corresponding to each , dummy nodes labelled as and dummy nodes labelled as . In the second layer (), there are two nodes and corresponding to each , one node for each and a dummy node . The third layer () contains only node . Edges exist only between adjacent layers. For each , sends an edge to every node in clause contains literal . Similarly, for each , sends an edge to every node in clause contains literal . The probabilities on edges sent from and are 1. There is an edge with influence probability 1 from to for any and edges from to . There is an edge from to with influence probability for any . There also exists an edge from to with influence probability for any . The remaining edges are from to all nodes in the second layer. The influence probability on edge is and all other probabilities on edges sent from are 1. The influence cooperative game defined on is . For convenience, we set .
Under the above construction, suppose is satisfiable and the corresponding assignment is . Let , . Then can activate all nodes in the second layer except and can activate all nodes in . Therefore, there are three disjoint successful coalitions , and , which means , and . Suppose is the optimal value of the relative least-core value problem of and is an optimal allocation. We can prove . This conclusion can be derived by separately considering the cases , and .
When is unsatisfiable, it is sufficient for our proof to find a solution such that . Note that when is unsatisfiable, for any , if . Otherwise, we can construct an assignment such that is satisfied. Let and for any , .
It remains to prove that there exists a positive such that for any satisfying .
We prove the above inequality by considering the following two cases:
(i) : In this case,