Covert Networks: How Hard is It to Hide?

03/14/2019 ∙ by Palash Dey, et al. ∙ IIT Kharagpur ∙ The Regents of the University of California

Covert networks are social networks that often consist of harmful users. Social Network Analysis (SNA) has played an important role in reducing criminal activities (e.g., counter terrorism) via detecting the influential users in such networks. There are various popular measures to quantify how influential or central any vertex is in a network. As expected, strategic and influential miscreants in covert networks would try to hide herself and her partners (called leaders) from being detected via these measures by introducing new edges. Waniek et al. show that the corresponding computational problem, called Hiding Leader, is NP-Complete for the degree and closeness centrality measures. We study the popular core centrality measure and show that the problem is NP-Complete even when the core centrality of every leader is only 3. On the contrary, we prove that the problem becomes polynomial time solvable for the degree centrality measure if the degree of every leader is bounded above by any constant. We then focus on the optimization version of the problem and show that the Hiding Leader problem admits a 2 factor approximation algorithm for the degree centrality measure. We complement it by proving that one cannot hope to have any (2-ε) factor approximation algorithm for any constant ε>0 unless there is a ε/2 factor polynomial time algorithm for the Densest k-Subgraph problem which would be considered a significant breakthrough.

1 Introduction

Social network analysis (SNA) has played a pivotal role in many applications in multi-agent systems and artificial intelligence [SS02, OR02, WCZM07, CSW05]. One of the most successful applications of SNA is in counter-terrorism via analyzing covert networks [CAX05, Res06, XC05, LLPC10]. A covert network loosely refers to a network of criminals, terrorists, or others engaged in illegal activities. Security personnel regularly use various SNA tools to understand criminal behavior, catch the leaders, and effectively dismantle such networks [Eis18, FKB17, Kno15].

A centrality measure is one of the most useful tools that SNA provides to analyze covert networks. It assigns scores to the vertices based on their relative influence or importance in the network [Bav48]; depending on the centrality measure, higher scores typically correspond to more important vertices, which are expected to be more central. One of the simplest and oldest such centrality measures is the degree centrality, which ranks vertices according to their degree [Sha54]. Other important examples include closeness centrality and betweenness centrality, which are based on shortest paths [Bea65]. Another centrality measure is the core centrality [Sei83], which ranks the vertices based on their core number. Intuitively speaking, if a vertex has a high core number, then it is part of some dense cohesive community within the network. Formally, a $k$-core is an induced subgraph of the network in which the degree of every vertex is at least $k$. The core number of a vertex is the highest integer $k$ such that the vertex is part of some $k$-core. Therefore, the core centrality can be more revealing about the position of a node than its degree centrality: while degree centrality considers only the degree of a vertex, the core centrality also takes into consideration the degrees of its neighbors. These two measures are also related in the sense that the core centrality of any vertex is at most its degree centrality. Due to its sophisticated nature, the core centrality has been extensively used in the study of covert networks [MGP07, SJ08, Mem12] as well as in other important tasks such as viral marketing and social engagement [KGH10, BKL15] in social networks.

In this paper, our goal is to study centrality-based secrecy in covert networks. Indeed, understanding covert networks remains a challenging task, mainly due to the incompleteness and dynamic evolution of the data as well as the strategic nature of the users [Kre02, Spa91, Sag04, ABD15, RE16, BF93]. Since the criminals often possess technical expertise [JZV16, SC14, CEHS12, Cal17, AADF17], we are interested in the evolution of terrorist networks under a framework of strategic users [WMRW17]: How is the network designed to hide the central or influential users, that is, the leaders?

Waniek et al. [WMRW17] first proposed the Hiding Leader problem, which incorporates the viewpoint of the leaders of a criminal organization. It also explicitly models the criminals' knowledge of the SNA tools that are used to detect them and thus help in dismantling their organization. Intuitively, the input in the Hiding Leader problem is a network with a subset of vertices marked as leaders. The goal is to add the fewest edges needed to ensure that various SNA tools do not rank any leader high based on centrality measures, thereby capturing the efficiency vs secrecy dilemma that the criminals are believed to face [MGP07, CLWU13, VL15]. Waniek et al. show promising results that the Hiding Leader problem is computationally intractable even for the simplest degree centrality measure.

1.1 Contribution

In this paper, we study the Hiding Leader problem for the core centrality measure and show that the degree centrality measure is much more computationally vulnerable than the core centrality measure, although the Hiding Leader problem is NP-complete for both of them. We reinforce this claim through extensive empirical evaluations. Our specific contributions in this paper are as follows.

  • We show that the Hiding Leader problem for degree centrality is polynomial time solvable if the degree of every leader is bounded by some constant [Theorem 2].

  • We present a $2$ factor approximation algorithm for the Hiding Leader problem for degree centrality which optimizes the number of edges added [Theorem 3]. We complement this by proving that, if there exists a $(2-\varepsilon)$ factor approximation algorithm for the above problem for any constant $\varepsilon > 0$, then there exists an $\varepsilon/2$ factor approximation algorithm for the Densest $k$-Subgraph problem [Theorem 4], which would be considered a substantial breakthrough. To the best of our knowledge, the state of the art algorithm for the Densest $k$-Subgraph problem achieves an approximation ratio of only [BCV12].

  • For the core centrality measure, we show that the Hiding Leader problem is NP-complete even if the core centrality of every leader is exactly $3$ [Theorem 5]. We prove that our result is almost tight in the sense that the Hiding Leader problem is polynomial time solvable if the core centrality of every leader is at most $1$ [Proposition 7]. Moreover, we also prove that there does not exist any factor approximation algorithm for any constant which optimizes the number of edges that one needs to add even when the core centrality of every leader is $3$ [Corollary 6].

  • We show that a construction of a network by Waniek et al. [WMRW17], called “captain network” there, hides the leaders with respect to the core centrality measure also.

  • We empirically evaluate our $2$-approximation algorithm for the degree centrality measure in synthetic networks. We observe that our algorithm almost always produces near optimal results in practice. In the experimental results, we also show the extent to which a leader can hide in the captain network with respect to core centrality.

1.2 Related Work

Waniek et al. first proposed and studied the Hiding Leader problem [WMRW17, WMWR18]. They proved that the problem is NP-complete for both the degree and closeness centrality measures. They also proposed a procedure to design a captain (covert) network from scratch which not only hides the leaders based on the degree, closeness, and betweenness centrality measures, but also keeps the influence of the leaders high in the network. In this paper, we provide two approximability results, for degree centrality and core centrality respectively. We also show that the problem is harder in the case of core centrality. Liu et al. [LT08] studied the related problem of raising the degree of each node in the network above a given constant by adding a minimum number of edges.

Other problems that align with privacy issues in social networks were studied before [ZG09, AAE13]. In [ZG09], the authors showed how an adversary exploits online social networks to find the private information about users. Altshuler et al. [AAE13] discussed the threat of malware targeted at extracting information in a real-world social network.

Computing centrality and related problems. A significant amount of related work studies the computational complexity of various centrality measures. Brandes [Bra01] first proposed an efficient algorithm to compute the betweenness centrality of a vertex in a network. More recently, Riondato et al. [RK14] introduced an approach to compute the top-$k$ vertices according to the betweenness centrality using VC-dimension theory. Yoshida [Yos14] studied similar problems for both the betweenness and coverage centrality measures in a group setting. Mahmoody et al. subsequently improved the performance of the above algorithms using a novel sampling scheme [MCU16]. There is an active line of research on optimizing the centrality of one node as well as of a set of nodes [CDSV15, IETB12, DSV16, MSS18]. Nikos et al. proposed a novel procedure to maximize the expected decrease in shortest path distances from a given node to the remaining nodes via edge addition [PPT16]. Crescenzi et al. [CDSV15] proposed greedy algorithms to increase the centrality of certain vertices and showed the effectiveness of their approach through extensive simulation. Kilberg [Kil12] and others studied behavioral models to understand why certain network topologies are common in covert networks [CEHS12, DK12, BFCB15]. Enders and Su [ES07] and others developed models to explain various properties of covert networks, such as the efficiency vs secrecy dilemma [JM12, LBH09, DKS14, EJ10]. Another important direction is quantifying the influence of vertices; the most prominent models include the Independent Cascade model [GLM01], the Linear Threshold model [KKT03], and the Bass model [Bas69, MI06].

Other network design problems. We also provide a few details about previous work on other network modification (design) problems. A set of design problems was introduced in [PS95]. Lin et al. [LM15] addressed a shortest path optimization problem via improving edge weights on undirected graphs. The node version of this problem was also studied [DLG11, MVRS18, MBS18]. Meyerson et al. [MT09] proposed approximation algorithms for single-source and all-pair shortest paths minimization. Faster algorithms for some of these problems were also presented in [PBG11, PPT15]. Demaine et al. [DZ10] minimized the diameter of a network by adding shortcut edges. Dey et al. [DKN19] studied the effect of social networks on surprise in elections.

2 Preliminaries

For a positive integer $n$, we denote the set $\{1, 2, \ldots, n\}$ by $[n]$. A network or graph $G = (V, E)$ is a tuple consisting of a finite set $V$ (or $V[G]$) of vertices and a set $E \subseteq V \times V$ of edges (also denoted by $E[G]$). A network is called undirected if we have $(v, u) \in E$ whenever we have $(u, v) \in E$ for any $u, v \in V$ with $u \neq v$. A self loop is an edge of the form $(u, u)$ for some $u \in V$. In this paper, we focus on undirected networks without any self loop. The degree of a vertex $u$ is the number of edges incident on it, which is $|\{v \in V : (u, v) \in E\}|$. A subgraph of a network $G = (V, E)$ is a network $H = (V', E')$ such that $V' \subseteq V$ and $E' \subseteq E$. For a positive integer $k$, a subgraph $H$ of a network $G$ is called a $k$-core if the degree of every vertex in $H$ is at least $k$. The core number of a vertex $v$ in a network is the largest integer $k$ such that $v$ belongs to a $k$-core.

2.1 Network Centrality

Let $G$ be any network. Bavelas [Bav48] introduced the notion of centrality of vertices. Intuitively, centrality measures try to capture the importance of a vertex in a network. Shaw [Sha54] proposed the degree centrality measure, which has turned out to be one of the most useful measures. The degree centrality of a vertex $v$ in $G$ is the degree of $v$ in the network, that is, the number of edges incident on $v$.

Seidman [Sei83] introduced the idea of core centrality, which is particularly useful for finding network cohesion. For an integer $k$, a $k$-core is a subgraph $H$ of $G$ such that the degree of every vertex in $H$ is at least $k$. The core number of a vertex $v$ in $G$ is the largest $k$ such that $v$ belongs to a $k$-core. The core centrality of a vertex in $G$ is its core number in the network. Other popular network centrality measures include closeness centrality [Bea65], betweenness centrality [Ant71, Fre77], etc.
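To make the two measures concrete, the following is a minimal Python sketch that computes degree centrality and core numbers by repeatedly peeling off a vertex of minimum remaining degree. The function names (`build_adjacency`, `degree_centrality`, `core_numbers`) and the edge-list input format are illustrative assumptions, not notation from the paper, and the simple quadratic-time peeling loop is chosen for readability rather than efficiency.

```python
from collections import defaultdict


def build_adjacency(edges, vertices=()):
    """Build an undirected adjacency map; self loops are ignored."""
    adj = defaultdict(set)
    for v in vertices:          # keep isolated vertices in the map
        adj[v]
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def degree_centrality(adj):
    """Degree centrality: number of edges incident on each vertex."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def core_numbers(adj):
    """Core number of every vertex via min-degree peeling."""
    remaining = {v: len(nbrs) for v, nbrs in adj.items()}
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=remaining.get)   # vertex of minimum remaining degree
        k = max(k, remaining[v])                # the core level never decreases
        core[v] = k
        del remaining[v]
        for u in adj[v]:
            if u in remaining:
                remaining[u] -= 1
    return core
```

On a triangle with one pendant vertex attached, the pendant vertex has degree and core number 1, while the triangle vertex it hangs from has degree 3 but core number only 2, illustrating that the core centrality of a vertex is at most its degree centrality.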

2.2 Problem Definition

Intuitively, the input in the Hiding Leader problem is a network with a subset of vertices marked as leaders (the other vertices are followers), a budget, which is the maximum number of edges that we can add to the network, and a target, which is the minimum number of followers whose centrality must be at least as high as the centrality of any leader in the resulting network (after addition of the new edges). We now define our problem formally. In Definition 1, the centrality measure can be either degree centrality or core centrality.

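Definition 1 (Hiding Leader (HL)).
Given a graph $G = (V, E)$, a subset $L \subseteq V$ of leader vertices, an integer $b$ denoting the maximum number of edges that we are allowed to add in $G$, and an integer $d$ denoting the number of follower vertices in $F = V \setminus L$ whose final centrality should be at least as high as that of any leader, the goal is to decide whether there exist a set $W$ of at most $b$ new edges between followers and a subset $F' \subseteq F$ with $|F'| \ge d$ such that $c(G', f) \ge c(G', \ell)$ for every $f \in F'$ and every $\ell \in L$,
where $G' = (V, E \cup W)$ and $c(G', v)$ denotes the centrality of the vertex $v$ in $G'$.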
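As an illustration of the definition, the sketch below checks whether a proposed set of new edges is a valid solution for degree centrality. The function name `is_valid_hiding`, its parameter names, and the pluggable `scores` argument are assumptions made for this example; another centrality (such as core centrality) could be supplied through `scores`.

```python
def degree_scores(adj):
    """Degree centrality of every vertex in an adjacency map."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def is_valid_hiding(vertices, edges, leaders, added, budget, target,
                    scores=degree_scores):
    """Check a candidate solution of the Hiding Leader problem.

    `added` is the proposed list of new edges; `budget` bounds its size and
    `target` is the number of followers that must score at least as high as
    every leader after the edges are added.
    """
    leaders = set(leaders)
    followers = set(vertices) - leaders
    if len(added) > budget:
        return False
    existing = {frozenset(e) for e in edges}
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for u, v in added:
        # New edges must join two distinct followers and must not already exist.
        if u == v or u not in followers or v not in followers:
            return False
        if frozenset((u, v)) in existing:
            return False
        adj[u].add(v)
        adj[v].add(u)
    s = scores(adj)
    top_leader = max((s[l] for l in leaders), default=0)
    return sum(1 for f in followers if s[f] >= top_leader) >= target
```

Since such a check runs in polynomial time whenever the centrality itself can be computed in polynomial time, it also illustrates why the decision problem lies in NP.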

3 Results for Degree Centrality

We present our algorithmic and hardness results for the Hiding Leader problem for the degree centrality measure in this section. We begin with presenting our polynomial time algorithm for the Hiding Leader problem for the degree centrality measure when the degree of every leader in the network is bounded above by any constant. On a high level, our algorithm makes greedy choices as long as it can and uses local search technique when “stuck.”

Theorem 2.

There exists a polynomial time algorithm for the Hiding Leader problem for degree centrality if the degree of every leader is bounded by any constant.

Proof.

Let be the input graph and the highest degree of any leader; that is . We are given that is a constant. If the number of followers is at most , then there are at most (which is a constant) new edges that we can add and we try all possible subsets of it of cardinality at most . The number of such subsets is at most which is a constant and thus we can output correctly in polynomial time. So let us assume that the number of followers is at least . Similarly we can also assume without loss of generality that the budget is at least since otherwise we will try to add all possible new edges (there are only possibilities since ) and thus we can output correctly in polynomial time.

Suppose there are already number of followers in whose degrees are at least . If , we output yes. Otherwise let us assume without loss of generality that . Let with be the set of top highest degree followers in the network whose degrees are less than . Intuitively, our algorithm greedily adds new edges between two vertices in whose degrees are less than until it is stuck and removes some edge it had added before to make progress. Concretely, our algorithm works as follows. To distinguish existing (old) edges from newly added edges (by the algorithm) in , we color the existing (old) edges as red and whenever we add a new edge, we color it green. To begin with, all the edges in are colored red and there is no green edge. We apply the following step () as long as we can. If the number of green edges in is less than and there exist two vertices such that the degrees of both the vertices are less than and there is no edge between them, then we add an edge in and color it green. Such a pair of vertices (if exists) can be found in . If such a pair of vertices does not exist in , then one of the following four cases must hold.

Case 1: The degree of every vertex in is at least . In this case, we output yes.

Case 2: The number of green edges in is . In this case, we output yes if the degree of every vertex in is at least ; otherwise we output no.

Case 3: There exists exactly one vertex with degree less than . If the degree of is , then we add an edge between and any vertex such that there is no edge between and in and color it green. If the number of green edges in is at most , then we output yes; otherwise we output no. Otherwise we assume that the degree of is less than . If the number of green edges in is at most , then we can add new green edges on and answer yes since we have already assumed that . So let us assume without loss of generality that the number of green edges in is more than . Let be a green edge such that there is no edge between and and between and in . Such a green edge always exists in since the degree of is less than in and there are more than green edges in . Moreover such an edge can be found in polynomial time by simply checking all the green edges. We now remove the green edge from , add two edges and in , color both of them green, and continue (return to the step ()).

Case 4: For every pair of vertices with degree less than for both the vertices, there is an edge between them. Let be the set of vertices in with degree less than . Since there exists an edge between every pair of vertices in in this case and the degree of every vertex in is less than , we have . If the number of green edges in is at most , then we can add new green edges on every vertex in and answer yes since we have and . So let us assume that the number of green edges in is more than . Let be any two vertices. Let and denote the set of neighbors of and in . Since the degrees of both and are less than , we have and . Since the degree of every vertex in is at most and the number of green edges in is at least , there exists at least one green edge in which is not incident on any vertex in (there can be at most green edges incident on any vertex in this set). Moreover such an edge can be found in polynomial time by simply checking all the green edges. We now remove the green edge from , add two edges and in , color both of them green, and continue (return to the step ()).

The algorithm always terminates in polynomial time since in every iteration, it adds a green edge and at most green edges could be added; in cases 3 and 4, we have added two green edges and removed only one green edge, and the algorithm terminates in cases 1 and 2. Also, whenever the algorithm outputs yes, adding green edges to the graph makes the degree of at least followers in the network at least . Hence, if the algorithm outputs yes, the instance is indeed a yes instance. So, let us assume that the algorithm outputs no. Except (in case 3) when there exists exactly one vertex in the network with degree less than and the degree of is , whenever we add one green edge in total (which is the same as adding two green edges and removing one green edge in cases 3 and 4), the sum of the degrees of all the vertices in increases by $2$. Hence the number of green edges added in the graph is at most . Since the algorithm outputs no, we have . We observe that since any edge increases the degree of at most $2$ vertices, when the algorithm outputs no, the instance is indeed a no instance. Hence the algorithm is correct. ∎

We now present a simple $2$ factor polynomial time approximation algorithm for the Hiding Leader problem for degree centrality.

Theorem 3.

There exists a polynomial time algorithm (HLDA) for approximating the budget in Hiding Leader within a factor of $2$ for degree centrality.

Proof.

Let $F'$ be the set of followers in the input graph $G$ whose degree centrality is at least the degree centrality of every vertex in the leader set $L$, and let $\Delta$ denote the highest degree of any leader. Let $\ell = |F'|$. If $\ell \ge d$, where $d$ is the target number of followers, then we output an empty set of edges. Otherwise, let $F''$ be the $d - \ell$ followers with highest degree centrality among the followers not in $F'$. We keep on adding edges with at least one end point in $F''$ until the degree centrality of every $f \in F''$ is at least the degree of every vertex in $L$ in the resulting graph. When we have $d$ followers with degree at least the degree of every vertex in $L$, we output the set of edges that we have added. The algorithm adds at most $\lambda$ edges, where $\lambda = \sum_{f \in F''} (\Delta - \deg(f))$ and $\deg(f)$ is the degree of the vertex $f$ in the input graph. Since any new edge can increase the sum of the degrees of the followers by at most $2$, we have $\mathrm{OPT} \ge \lambda/2$ by the choice of $F''$. Hence our algorithm approximates $\mathrm{OPT}$ by a factor of $2$. ∎
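The greedy idea in the proof above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' Java implementation: the function name `hlda`, the tie-breaking rule (prefer a partner that is itself still below the leader degree), and the convention of returning `None` when a chosen follower has no remaining non-adjacent follower are all assumptions made for this example. The second return value is half of the total degree deficit (rounded up), mirroring the lower bound used in the proof.

```python
def hlda(vertices, edges, leaders, target):
    """Greedy sketch: return (added_edges, lower_bound), or None if stuck."""
    leaders = set(leaders)
    followers = [v for v in vertices if v not in leaders]
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Highest leader degree; followers at or above it already "hide" the leaders.
    delta = max((len(adj[l]) for l in leaders), default=0)
    already = sum(1 for f in followers if len(adj[f]) >= delta)
    need = target - already
    if need <= 0:
        return [], 0

    # Pick the `need` highest-degree followers that are still below delta.
    below = sorted((f for f in followers if len(adj[f]) < delta),
                   key=lambda f: len(adj[f]), reverse=True)
    if len(below) < need:
        return None
    chosen = below[:need]
    chosen_set = set(chosen)

    deficit = sum(delta - len(adj[f]) for f in chosen)
    lower_bound = (deficit + 1) // 2   # each new edge fixes at most 2 units of deficit

    added = []
    for f in chosen:
        while len(adj[f]) < delta:
            candidates = [g for g in followers if g != f and g not in adj[f]]
            if not candidates:
                return None
            # Assumption: prefer a partner that is also chosen and still deficient.
            partner = max(candidates,
                          key=lambda g: g in chosen_set and len(adj[g]) < delta)
            adj[f].add(partner)
            adj[partner].add(f)
            added.append((f, partner))
    return added, lower_bound
```

Every added edge reduces the deficit of a chosen follower by at least one, so the number of added edges never exceeds the total deficit, which is at most twice the returned lower bound.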

We now complement our approximation algorithm in Theorem 3 by proving that if there exists a polynomial time approximation algorithm for the Hiding Leader problem for degree centrality with approximation factor $(2-\varepsilon)$ for any constant $\varepsilon > 0$, then there exists a constant factor polynomial time approximation algorithm for the Densest $k$-Subgraph problem. In the Densest $k$-Subgraph problem, the input is a graph $G$ and an integer $k$, and we need to find a subgraph of $G$ on $k$ vertices with highest density. The density of a graph on vertices is the number of edges in it divided by . To the best of our knowledge, we do not know whether there exists any polynomial time algorithm which can distinguish a graph containing a clique of size $k$ from a graph where the density of every sub-graph of size $k$ is at most (any factor approximation algorithm for the Densest $k$-Subgraph problem would be able to distinguish). In fact, none of the known algorithms can distinguish even for some sub-constant values for (see [BKRW17] and references therein). We now show that if there exists a $(2-\varepsilon)$ factor approximation algorithm for the Hiding Leader problem for any constant $\varepsilon > 0$, then there exists an $\varepsilon/2$ factor approximation algorithm for the Densest $k$-Subgraph problem with the same running time (as the Hiding Leader algorithm).

Theorem 4.

Suppose there exists a $(2-\varepsilon)$ factor polynomial time approximation algorithm for the Hiding Leader problem for degree centrality for some constant $\varepsilon > 0$. Then there exists a polynomial time algorithm for distinguishing a graph containing a clique of size $k$ from a graph where the density of every sub-graph of size $k$ is at most .

Proof.

Let be any graph which satisfies either (exactly) one of the following properties.

  1. Completeness: There exists a clique of size in .

  2. Soundness: The density of any subgraph of of size is at most .

From we construct an instance of Hiding Leader. Intuitively, we introduce a vertex corresponding to every vertex in and the edge set among those vertices is the complement of the corresponding edge set in . To ensure that, in the resulting graph, the degree of every vertex for is , we add appropriate number of edges between and some auxiliary vertices for every ; we ensure that the degree of any such auxiliary vertex is at most which will guarantee that these auxiliary vertices are never part of any optimal solution. Finally we add a clique on a set of leader vertices so that the degree centrality of every leader is . Formally, the instance of Hiding Leader is defined as follows.

We now use the factor approximation algorithm for the Hiding Leader problem for degree centrality which outputs that there is a way to add number of edges so that there exist at least followers in whose degree is at least the degree of any leader. Let denotes the minimum number of edges that one needs to add to ensure that there exist at least followers in whose degrees are at least the degree of every leader. Then we have . We output that the graph contains a -clique if . Otherwise, we output that the density of any subgraph of on vertices is at most . We now prove correctness of our algorithm. We first observe that the degree of every leader in is , the degree of for every is , and the degree of every other vertex is at most .

  1. Completeness: Let with be a clique in . Let us consider the subset . By construction, forms an independent set in and the degree of every vertex in is . Since, adding all the edges in in makes the degree of every vertex in in the resulting graph , we have . Hence, we have .

  2. Soundness: In this case, the density of any subgraph of on vertices is at most . Let with be a set of any followers. By the construction of , we have . Hence, the minimum number of edges one needs to add to make the degree of every vertex in at least is at least . In particular, we have .

This concludes the proof of the statement. ∎

4 Results for Core Centrality

We present our results for the Hiding Leader problem for the core centrality measure in this section. Unlike in the degree centrality case, the problem becomes NP-complete even when the core centrality of every leader is only $3$. This is almost tight, as we prove that the problem is polynomial time solvable if the core centrality of every leader is at most $1$.

In Theorem 5 below, we prove that the Hiding Leader problem is NP-complete even when the core centrality of every leader is $3$. We reduce the Set Cover problem to the Hiding Leader problem there. In the Set Cover problem, the input is a universe $U$, a collection $\mathcal{S}$ of subsets of $U$, and an integer $k$, and we need to decide whether there exist at most $k$ sets in $\mathcal{S}$ whose union is $U$. It is well known that the Set Cover problem is NP-complete [GJ79].

Theorem 5.

The Hiding Leader problem for the core centrality measure is NP-complete even when the core centrality of every leader is $3$.

Proof.

The Hiding Leader problem for the core centrality measure is clearly in NP, since the core centrality of a node can be computed in polynomial time [BKL15]. To prove NP-hardness, we reduce from the Set Cover problem. Consider any instance of the Set Cover problem. To define a corresponding Hiding Leader instance, we construct a graph as follows.

Intuitively, for each subset , we create a path of vertices in ; are the edges of the above path. We also add vertices to with eight edges where the four vertices in form a clique with six edges; the other two edges are and . For each , we add a set of vertices with eight edges where the four vertices (leaders) form a clique with six edges; the other two edges are and . We also have an edge for every . We allow to add new edges and demand that the core centrality of at least followers should be at least as high as the core centrality of every leader. Figure 1 illustrates the structure of our construction for sets . We now formally describe our Hiding Leader instance.

We now claim that these two instances are equivalent. In one direction, let us assume that the Set Cover instance is a yes instance. By renaming, let us assume that the collection forms a valid set cover of the instance. We add the edges in the set in the graph . Let the resulting graph be . We claim that the core centrality of every vertex in is in . We first observe that the core centrality of every leader remains even after adding the edges in . Also, for any , if the edge is added in the graph, the core centrality of the vertices and become . Hence after addition of the edges in in , the core centrality of every vertex in becomes . Since forms a set cover for , the core centrality of every vertex in becomes . Lastly, the core centrality of every vertex in was already in and since addition of edges never decreases the core centrality of any vertex, the core centrality of these vertices are at least in . Hence the Hiding Leader instance is a yes instance.

For the other direction, let us assume that there exists a set of edges such that in the graph , the core centrality of at least vertices in is at least ; let the set of followers with core centrality at least in be . Since adding edges in the graph never decreases the core centrality of any vertex, we have . Let us consider the following subset defined as: . Since and , we have . We claim that forms a set cover for . Suppose not, then at most vertices in can belong to since does not belong to if is uncovered. Also, any vertex in does not belong to . Hence, we have which contradicts our assumption that forms a valid solution for the Hiding Leader instance. Hence the Set Cover instance is a yes instance. ∎

Figure 1: Example construction for hardness from Set Cover where . The red nodes are the leaders and the blue nodes are the followers.

Theorem 5, along with the well known inapproximability result for the Set Cover problem, immediately gives us the following result.

Corollary 6.

There does not exist any polynomial time algorithm for approximating the number of edges one needs to add in the Hiding Leader problem for core centrality within an approximation ratio of for any constant assuming even when the core centrality of every leader is $3$.

Proof.

The result follows from the observation that the reduction in the proof of Theorem 5 is approximation preserving and the inapproximability result for the Set Cover problem for any constant assuming  [Mos15]. ∎

We now show that the hardness result in Theorem 5 is almost tight in the sense that if the core centrality of every leader is at most $1$ in the network, then the corresponding Hiding Leader problem is polynomial time solvable.

Proposition 7.

There exists a polynomial time algorithm for the Hiding Leader problem for core centrality if the core centrality of every leader in the network is at most $1$.

Proof.

We observe that if the degree of any vertex is at least $1$, then its core centrality is at least $1$. Let $F'$ be the subset of followers whose core centrality is at least $1$; say $|F'| = \ell$. Hence the degree of every follower outside $F'$ is $0$. We add new edges such that the degree of at least $d - \ell$ followers outside $F'$ becomes at least $1$ in the resulting graph, where $d$ is the target number of followers. We output yes if the budget $b$ satisfies $b \ge \lceil (d - \ell)/2 \rceil$; otherwise we output no. Since any optimal solution must add at least $\lceil (d - \ell)/2 \rceil$ edges, our algorithm is correct. ∎
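The counting argument in this proof can be sketched as a short decision procedure. The function name and parameters below are hypothetical, and the sketch assumes without checking that the input really has leader core centrality at most 1, that the edge list has no duplicates, and that new edges may only join followers; the handling of the corner cases (leaders with no edges, fewer than two followers) is an assumption of this example rather than something stated in the paper.

```python
import math


def can_hide_when_leader_core_at_most_one(vertices, edges, leaders, budget, target):
    """Decide Hiding Leader for core centrality when every leader has core <= 1."""
    leaders = set(leaders)
    degree = {v: 0 for v in vertices}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    followers = [v for v in vertices if v not in leaders]

    # A vertex has core number >= 1 exactly when it has at least one edge.
    threshold = 1 if any(degree[l] > 0 for l in leaders) else 0
    already = sum(1 for f in followers if degree[f] >= threshold)
    missing = target - already
    if missing <= 0:
        return True
    if target > len(followers) or len(followers) < 2:
        return False
    # One new edge can lift at most two isolated followers to degree 1.
    return budget >= math.ceil(missing / 2)
```

For example, with two adjacent leaders and three isolated followers, hiding behind all three followers needs two new edges, so a budget of one is not enough.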

5 Captain Networks

In this section, we show the “captain network”, originally proposed by Waniek et al. [WMRW17], also ensures that the core centrality of any leader is at most the core centrality of any captain. They propose two constructions; one for single leader and another for multiple leaders.

5.1 For Multiple Leaders

We first describe the construction in [WMRW17]. The set of leaders forms a clique. Each leader has a corresponding group of captains and is connected to all vertices in . Assuming that , there are such sets of captains . All vertices in the captain sets are connected as a complete -partite graph. A captain serves two purposes: 1) it helps to hide the leader by having centrality at least as high as that of the leader with maximum centrality; 2) it spreads the influence from the leader to the rest of the network. The remaining vertices are each connected to one captain from each group . The follower set in the network is . Let us call the resulting graph . We now show that the core centrality of every leader in is at most the core centrality of every captain.

Theorem 8.

Given a captain network , let denote the minimum number of connections that a captain has with vertices from . Assuming we have at least leaders and , the core centrality of every captain is at least that of every leader.

Proof.

In , the vertices in do not contribute in the core centrality of either the leaders or the captains. We observe that the degree of any vertex in is . So their core centrality can be at most . We claim that the captains and the leaders are in higher core than . Consider a induced subgraph that includes only the leaders, the captains and the edges between them. In , the degree of any captain is ; on the other hand, the degree of any leader is . So, in , all the captains and the leaders are at least in -core. This comes from the fact that all the nodes (captains and leaders) have a minimum degree of and thus they are at least in -core. Note that, as and . This implies the captains have higher degree than the leaders in . So, the captains have at least the same core-centrality as the leaders. Our claim is proved. Additionally, any vertex in is in -core and assuming . ∎

Corollary 9.

Given the same captain network , assuming and , the core centrality of every captain is strictly larger than that of every leader.

Proof.

The key idea is that the captains form a core among themselves, and that this core is higher than that of the leaders. For any captain, the degree among the captains themselves is . Now that, or is possible when and . So, the captains have larger core centrality than the leaders. ∎

5.2 For Single Leader

We start with the construction in [WMRW17] when and show that the core centralities of the leader and the captains remain the same in this case.

A single leader () has two corresponding sets of captains and similarly it has . All vertices in the captain sets are connected as a complete bipartite graph. Each of the remaining vertices in is connected to one captain from each group and . The follower set in the network is .

Corollary 10.

Given the captain network described above, let denote the minimal number of connections that a captain has with vertices from . Assuming , the core centralities of all the captains are the same as that of the leader.

Proof.

The proof follows from that of Theorem 8. The leader has degree , whereas the captains have degree . But the vertices in have degree only . So the leader and the captains will be in the higher core and it will be . Assuming , all the captain vertices and the leader will be in -core. If , all the vertices in the network will be in -core. ∎

6 Simulation results

In this section, we evaluate the performance of our approximation algorithm in Theorem 3 using synthetic networks. For brevity, we refer to our algorithm in Theorem 3 as HLDA and to the lower bound used in Theorem 3 as LB. We also show how well the leaders can be hidden in the captain network with respect to the core centrality measure. Solutions were implemented in Java, and experiments were conducted on GHz Intel cores with GB RAM.

6.1 Evaluation of 2-Approximation Algorithm in Theorem 3

Figure 2: Number of edges added () by different algorithms. Panels: (a) BA and (b) WS with average degree 2; (c) BA and (d) WS with average degree 4; (e) BA and (f) WS with average degree 10. LB denotes a loose lower bound, HLDA is our algorithm that gives a 2-approximation, and Random denotes a random edge addition algorithm. In both network models and across edge densities (average degree of nodes), the number of edges added by our algorithm HLDA is almost the same as that of LB.

Settings: We generate synthetic network structures from two well-studied models: (a) Barabasi-Albert (BA) [BA99] and (b) Watts-Strogatz (WS) [WS98]. While both have the “small-world” property, WS networks do not have a scale-free degree distribution. We generate both the datasets of thousands vertices for three different edge densities: average vertex degree of 2, 4, and 10. In the experiments we choose leaders () randomly from the top high degree vertices.
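To make the BA setting concrete, the following is a minimal pure-Python sketch of preferential attachment. It is not the Java generator used in our experiments; the function name, the star-shaped seed graph, and the parameter names are assumptions made for illustration. With m edges per new vertex, the average degree is roughly 2m, so m = 1, 2, 5 would correspond approximately to average degrees 2, 4, and 10.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Grow an undirected graph on vertices 0..n-1 by preferential attachment.

    Each new vertex attaches to m distinct existing vertices, chosen with
    probability proportional to their current degree. The seed graph is a
    star on vertices 0..m (an assumption of this sketch).
    """
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}

    # 'endpoints' lists every vertex once per incident edge, so a uniform
    # draw from it is a degree-proportional draw over vertices.
    endpoints = []
    for v in range(1, m + 1):
        adj[0].add(v)
        adj[v].add(0)
        endpoints += [0, v]

    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints += [new, t]
    return adj
```

The resulting graph has exactly m(n - m) edges, since the seed star contributes m edges and each of the remaining n - m - 1 vertices contributes m more.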

Baselines: We compare our algorithm (HLDA) with two baselines. Our first baseline is the lower bound used in Theorem 3 which we call LB. Our second baseline is Random which denotes the number of random edges one needs to add to achieve the goal. The performance metric of the algorithms is the number of edges being added to satisfy the degree centrality requirement for followers. Hence, the quality is better when the number of edges is lower.
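A Random baseline of this kind can be sketched as follows. The details are assumptions for illustration rather than a description of our implementation: we assume that new edges may only join two followers, that the goal is for at least d followers to have degree strictly larger than the maximum leader degree, and that candidate edges are tried in a uniformly random order. The names `random_baseline` and `d` are hypothetical.

```python
import random
from itertools import combinations


def random_baseline(adj, leaders, d, seed=None):
    """Count random follower-follower edges added until the goal is met.

    Goal (assumed): at least d followers have degree strictly greater than
    the largest leader degree. Returns the number of edges added, or None if
    the goal is unreachable even after adding every possible edge.
    """
    rng = random.Random(seed)
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    leaders = set(leaders)
    followers = [v for v in adj if v not in leaders]
    # Leader degrees never change because only followers gain edges.
    target = max(len(adj[l]) for l in leaders)

    def goal_met():
        return sum(len(adj[f]) > target for f in followers) >= d

    # All currently missing follower-follower edges, in random order.
    candidates = [(u, v) for u, v in combinations(followers, 2)
                  if v not in adj[u]]
    rng.shuffle(candidates)

    added = 0
    for u, v in candidates:
        if goal_met():
            return added
        adj[u].add(v)
        adj[v].add(u)
        added += 1
    return added if goal_met() else None
```

Enumerating all missing follower pairs takes quadratic space, which is acceptable for a sketch but would need sampling on networks with thousands of vertices.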

Results: Theorem 3 shows that our algorithm (HLDA) proposed for degree centrality gives a 2-approximation. However, in practice it gives near-optimal results. Figure 2 shows the results varying on six datasets. Note that the axes are in logarithmic scale. In all six datasets, the number of solution edges of HLDA is similar to LB. However, Random cannot produce high quality results. Comparing the datasets (BA and WS), the algorithms (HLDA and LB) need a higher number of edges in BA, as the chosen leaders (randomly chosen from top degree nodes) have much higher degree than the followers due to the scale-free degree distribution.

Figure 3: Summary of the difference in core centralities between a leader and a captain in a given captain network (with vertices) by varying number of captains in each group () and leaders ().

6.2 Captain Networks and Core Centrality

In Section 5, we proved that the core centrality of the leaders can be hidden by the captains in the captain networks [WMRW17] (Theorem 8 and Corollary 9). We empirically evaluate the core centralities of the leaders and the captains by varying two parameters: the number of captains () in each group and the number of leaders () for a network with multiple leaders.

Figure 3 presents the results for a captain network with vertices. For every pair of the two parameters ( and ), we compute the maximum difference in core centrality between any leader and any captain. A more intense color signifies a smaller difference, and a smaller difference implies less disguise for the leaders. Low values of result in less disguise for a leader. On the other hand, a high value of (large number of captains in each group) with low values of produces the maximum amount of disguise. But for a high value of , if the number of leaders is also very high, i.e., high , the amount of disguise for the leaders decreases.

We summarize our experimental findings as follows.

  • HLDA produces near-optimal results in practice, whereas Random cannot produce high quality results. HLDA and LB need more edges to satisfy the degree requirements for followers in BA due to the scale-free degree distribution.

  • A captain network with small number of leaders (low ) and a large number of captains in each group (high value of ) produces the maximum amount of disguise.

  • A low value of , i.e., a small number of captains in each group, yields less disguise for core centrality, which is not the case for other centralities such as degree, closeness, and betweenness [WMRW17].

7 Conclusion and Future Work

We have shown that the Hiding Leader problem for the core centrality measure is -hard to approximate with a factor of for any constant for optimizing the number of edges one needs to add, even when the core centrality of every leader is only 3. On the other hand, we prove that the Hiding Leader problem for degree centrality is polynomial time solvable if the degree of every leader is bounded above by a constant. Moreover, we also provide a 2 factor polynomial time approximation algorithm for the Hiding Leader problem for optimizing the number of edges one needs to add to hide all the leaders. Hence, our results show that, although the classical complexity theoretic framework fails to compare the relative difficulty of hiding leaders with respect to various centrality measures [WMRW17], hiding leaders may be significantly harder for the core centrality than for the degree centrality. We complement our 2 factor approximation algorithm for the Hiding Leader problem for degree centrality by proving that if there exists a (2-ε) factor approximation algorithm for the Hiding Leader problem for degree centrality for any constant ε > 0, then there exists an ε/2 factor approximation algorithm for the famous Densest k-Subgraph problem, which would be considered a major breakthrough. The current best polynomial time algorithm for the Densest k-Subgraph problem achieves an approximation ratio of only  [BCV12]. We have also empirically evaluated our approximation algorithm, which produces a near-optimal solution in most cases. We have also shown that the captain networks proposed in [WMRW17] can hide the leaders with respect to core centrality.

An important future direction is to explore the average case computational complexity of the Hiding Leader problem for popular network centrality measures. Since the results of Waniek et al. [WMRW17] and ours establish that the Hiding Leader problem is intractable only in the worst case, it could very well be possible that there exist heuristics that efficiently solve most randomly generated instances. If this is true, then the apparent complexity shield against manipulating various centrality measures will become substantially weak. Another immediate future work is to resolve the computational complexity of the Hiding Leader problem for the core centrality measure when the core centrality of every leader is at most .

Acknowledgement

Dey is funded by DST INSPIRE grant no. 04/2016/001479 and IIT Kharagpur grant no. IIT/SRIC/CS/VTS/2018-19/247.

References

  • [AADF17] Scott Atran, Robert Axelrod, Richard Davis, and Baruch Fischhoff. Challenges in researching terrorism from the field. Science, 355(6323):352–354, 2017.
  • [AAE13] Yaniv Altshuler, Nadav Aharony, Yuval Elovici, Alex Pentland, and Manuel Cebrian. Stealing reality: when criminals become data scientists (or vice versa). In Security and Privacy in Social Networks, pages 133–151. Springer, 2013.
  • [ABD15] Eitan Y Alimi, Lorenzo Bosi, and Chares Demetriou. The dynamics of radicalization: a relational and comparative perspective. Oxford University Press, 2015.
  • [Ant71] Jac M Anthonisse. The rush in a graph. Amsterdam: Mathematische Centrum, 1971.
  • [BA99] Albert-László Barabási and Réka Albert. Emergence of scaling in random networks. Science, 286(5439):509–512, 1999.
  • [Bas69] Frank M Bass. A new product growth for model consumer durables. Manag. Sci., 15(5):215–227, 1969.
  • [Bav48] Alex Bavelas. A mathematical model for group structures. Applied anthropology, 7(3):16–30, 1948.
  • [BCV12] Aditya Bhaskara, Moses Charikar, Aravindan Vijayaraghavan, Venkatesan Guruswami, and Yuan Zhou. Polynomial integrality gaps for strong SDP relaxations of densest k-subgraph. In Proc. 23-rd Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 388–405, 2012.
  • [Bea65] Murray A Beauchamp. An improved index of centrality. Behavioral science, 10(2):161–163, 1965.
  • [BF93] Wayne E Baker and Robert R Faulkner. The social organization of conspiracy: Illegal networks in the heavy electrical equipment industry. Am. Sociol. Rev, pages 837–860, 1993.
  • [BFCB15] Roberta Belli, Joshua D Freilich, Steven M Chermak, and Katharine A Boyd. Exploring the crime–terror nexus in the united states: a social network analysis of a hezbollah network involved in trade diversion. Dynamics of Asymmetric Conflict, 8(3):263–281, 2015.
  • [BKL15] Kshipra Bhawalkar, Jon Kleinberg, Kevin Lewi, Tim Roughgarden, and Aneesh Sharma. Preventing unraveling in social networks: the anchored k-core problem. SIAM J. Discrete Math., 29(3):1452–1475, 2015.
  • [BKRW17] Mark Braverman, Young Kun-Ko, Aviad Rubinstein, and Omri Weinstein. ETH hardness for densest-k-subgraph with perfect completeness. In Proc. 28-th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1326–1341, 2017.
  • [Bra01] Ulrik Brandes. A faster algorithm for betweenness centrality. Journal of mathematical sociology, pages 163–177, 2001.
  • [Cal17] David Calvey. Covert research: The art, politics and ethics of undercover fieldwork. Sage, 2017.
  • [CAX05] Hsinchun Chen, Homa Atabakhsh, Jennifer Jie Xu, Alan Gang Wang, Byron Marshall, Siddharth Kaza, Lu Chunju Tseng, Shauna Eggers, Hemanth Gowda, Tim Petersen, et al. Coplink center: social network analysis and identity deception detection for law enforcement and homeland security intelligence and security informatics: a crime data mining approach to developing border safe research. In Proc. 2005 National Conference on Digital government research, pages 112–113. Digital Government Society of North America, 2005.
  • [CDSV15] Pierluigi Crescenzi, Gianlorenzo D’Angelo, Lorenzo Severini, and Yllka Velaj. Greedily improving our own centrality in a network. In Proc. 14th Symposium on Experimental Algorithms, pages 43–55, 2015.
  • [CEHS12] Nick Crossley, Gemma Edwards, Ellen Harries, and Rachel Stevenson. Covert social movement networks and the secrecy-efficiency trade off: The case of the uk suffragettes (1906–1914). Soc. Networks, 34(4):634–644, 2012.
  • [CLWU13] Peter Csermely, András London, Ling-Yun Wu, and Brian Uzzi. Structure and dynamics of core/periphery networks. Journal of Complex Networks, 1(2):93–123, 2013.
  • [CSW05] Peter J Carrington, John Scott, and Stanley Wasserman. Models and methods in social network analysis, volume 28. Cambridge university press, 2005.
  • [DK12] Fatih Demiroz and Naim Kapucu. Anatomy of a dark network: the case of the turkish ergenekon terrorist organization. Trends in organized crime, 15(4):271–295, 2012.
  • [DKN19] Palash Dey, Pravesh K. Kothari, and Swaprava Nath. The social network effect on surprise in elections. In Proc. ACM India Joint International Conference on Data Science and Management of Data, 24th COMAD, 6th CODS, Kolkata, India, January 3-5, 2019, pages 1–9, 2019.
  • [DKS14] Paul AC Duijn, Victor Kashirin, and Peter MA Sloot. The relative ineffectiveness of criminal network disruption. Scientific reports, 4:4238, 2014.
  • [DLG11] Bistra Dilkina, Katherine J. Lai, and Carla P. Gomes. Upgrading shortest paths in networks. In Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems, pages 76–91. Springer, 2011.
  • [DSV16] Gianlorenzo D’Angelo, Lorenzo Severini, and Yllka Velaj. On the maximum betweenness improvement problem. Electronic Notes in TCS, 322:153 – 168, 2016.
  • [DZ10] E. D. Demaine and M. Zadimoghaddam. Minimizing the diameter of a network using shortcut edges. In SWAT, ser. Lecture Notes in Computer Science, H. Kaplan, Ed., pages 420–431, 2010.
  • [Eis18] HA Eiselt. Destabilization of terrorist networks. Chaos, Solitons & Fractals, 108:111–118, 2018.
  • [EJ10] Walter Enders and Paan Jindapon. Network externalities and the structure of terror networks. J. Confl. Resolut., 54(2):262–280, 2010.
  • [ES07] Walter Enders and Xuejuan Su. Rational terrorists and optimal network structure. J. Confl. Resolut., 51(1):33–57, 2007.
  • [FKB17] Ejaz Farooq, Shoab A Khan, and Wasi Haider Butt. Covert network analysis to detect key players using correlation and social network analysis. In Proc. 2nd International Conference on Internet of things and Cloud Computing, pages 94:1–94:6. ACM, 2017.
  • [Fre77] Linton C Freeman. A set of measures of centrality based on betweenness. Sociometry, pages 35–41, 1977.
  • [GJ79] Michael R Garey and David S Johnson. Computers and Intractability, volume 174. freeman New York, 1979.
  • [GLM01] Jacob Goldenberg, Barak Libai, and Eitan Muller. Using complex systems analysis to advance marketing theory development: Modeling heterogeneity effects on new product growth through stochastic cellular automata. J. Acad. Mark. Sci., 9(3):1–18, 2001.
  • [IETB12] Vatche Ishakian, Dóra Erdos, Evimaria Terzi, and Azer Bestavros. A framework for the evaluation and management of network centrality. In Proc. SIAM International Conference on Data Mining, pages 427–438, 2012.
  • [JM12] RHP Janssen and Herman Monsuur. Stable network topologies using the notion of covering. Eur J Oper Res., 218(3):755–763, 2012.
  • [JZV16] Neil F Johnson, M Zheng, Yulia Vorobyeva, Andrew Gabriel, Hong Qi, Nicolás Velásquez, Pedro Manrique, Daniela Johnson, E Restrepo, C Song, et al. New online ecology of adversarial aggregates: Isis and beyond. Science, 352(6292):1459–1463, 2016.
  • [KGH10] Maksim Kitsak, Lazaros K Gallos, Shlomo Havlin, Fredrik Liljeros, Lev Muchnik, H Eugene Stanley, and Hernán A Makse. Identification of influential spreaders in complex networks. Nature physics, 6(11):888, 2010.
  • [Kil12] Joshua Kilberg. A basic model explaining terrorist group organizational structure. Studies in Conflict & Terrorism, 35(11):810–830, 2012.
  • [KKT03] David Kempe, Jon Kleinberg, and Éva Tardos. Maximizing the spread of influence through a social network. In Proc. 9th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 137–146. ACM, 2003.
  • [Kno15] David Knoke. Emerging trends in social network analysis of terrorism and counterterrorism. Emerging Trends in the Social and Behavioral Sciences: An Interdisciplinary, Searchable, and Linkable Resource, pages 1–15, 2015.
  • [Kre02] Valdis E Krebs. Mapping networks of terrorist cells. Connections, 24(3):43–52, 2002.
  • [LBH09] Roy Lindelauf, Peter Borm, and Herbert Hamers. The influence of secrecy on the communication structure of covert networks. Soc. Networks, 31(2):126–137, 2009.
  • [LLPC10] Yong Lu, Xin Luo, Michael Polgar, and Yuanyuan Cao. Social network analysis of a criminal hacker community. J. Comp. Inf. Sys, 51(2):31–41, 2010.
  • [LM15] Yimin Lin and Kyriakos Mouratidis. Best upgrade plans for single and multiple source-destination pairs. GeoInformatica, 19(2):365–404, 2015.
  • [LT08] Kun Liu and Evimaria Terzi. Towards identity anonymization on graphs. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pages 93–106. ACM, 2008.
  • [MBS18] Sourav Medya, Petko Bogdanov, and Ambuj Singh. Making a small world smaller: Path optimization in networks. IEEE Transactions on Knowledge and Data Engineering, 30(8):1533–1546, 2018.
  • [MCU16] Ahmad Mahmoody, Charalampos E Tsourakakis, and Eli Upfal. Scalable betweenness centrality maximization via sampling. In Proc. 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1765–1773, 2016.
  • [Mem12] Bisharat Rasool Memon. Identifying important nodes in weighted covert networks using generalized centrality measures. In Proc. European Intelligence and Security Informatics Conference, pages 131–140. IEEE, 2012.
  • [MGP07] Carlo Morselli, Cynthia Giguère, and Katia Petit. The efficiency/security trade-off in criminal networks. Soc. Networks, 29(1):143–153, 2007.
  • [MI06] Nigel Meade and Towhidul Islam. Modelling and forecasting the diffusion of innovation–a 25-year review. Int. J. Forecast, 22(3):519–545, 2006.
  • [Mos15] Dana Moshkovitz. The projection games conjecture and the np-hardness of ln n-approximating set-cover. Theory Comput., 11:221–235, 2015.
  • [MSS18] Sourav Medya, Arlei Silva, Ambuj Singh, Prithwish Basu, and Ananthram Swami. Group centrality maximization via network design. In Proc. 24th SIAM International Conference on Data Mining, pages 126–134. SIAM, 2018.
  • [MT09] Adam Meyerson and Brian Tagiku. Minimizing average shortest path distances via shortcut edge addition. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX-RANDOM), pages 272–285. Springer, 2009.
  • [MVRS18] Sourav Medya, Jithin Vachery, Sayan Ranu, and Ambuj Singh. Noticeable network delay minimization via node upgrades. Proceedings of the VLDB Endowment, 11(9):988–1001, 2018.
  • [OR02] Evelien Otte and Ronald Rousseau. Social network analysis: a powerful strategy, also for the information sciences. J. Inf. Sci., 28(6):441–453, 2002.
  • [PBG11] Manos Papagelis, Francesco Bonchi, and Aristides Gionis. Suggesting ghost edges for a smaller world. In International conference on Information and knowledge management (CIKM), pages 2305–2308, 2011.
  • [PPT15] Nikos Parotsidis, Evaggelia Pitoura, and Panayiotis Tsaparas. Selecting shortcuts for a smaller world. In SIAM International Conference on Data Mining (SDM), pages 28–36. SIAM, 2015.
  • [PPT16] Nikos Parotsidis, Evaggelia Pitoura, and Panayiotis Tsaparas. Centrality-aware link recommendations. In Proc. 9th International ACM Conference on Web Search and Data Mining, pages 503–512, 2016.
  • [PS95] D. Paik and S. Sahni. Network upgrading problems. Networks, pages 45–58, 1995.
  • [RE16] Nancy Roberts and Sean Everton. Monitoring and disrupting dark networks: A bias toward the center and what it costs us. In Eradicating Terrorism from the Middle East, pages 29–42. Springer, 2016.
  • [Res06] Steve Ressler. Social network analysis as an approach to combat terrorism: Past, present, and future research. Homeland Security Affairs, 2(2), 2006.
  • [RK14] Matteo Riondato and Evgenios M Kornaropoulos. Fast approximation of betweenness centrality through sampling. In Proc. 7th International ACM Conference on Web Search and Data Mining, pages 413–422, 2014.
  • [Sag04] Marc Sageman. Understanding terror networks. University of Pennsylvania Press, 2004.
  • [SC14] Rachel Stevenson and Nick Crossley. Change in covert social movement networks: The ‘inner circle’ of the provisional irish republican army. Soc. Mov. Stud., 13(1):70–91, 2014.
  • [Sei83] Stephen B Seidman. Network structure and minimum degree. Soc. networks, 5(3):269–287, 1983.
  • [Sha54] Marvin E Shaw. Group structure and the behavior of individuals in small groups. The Journal of psychology, 38(1):139–149, 1954.
  • [SJ08] Muhammad Akram Shaikh and Wang Jiaxin. Network structure mining: locating and isolating core members in covert terrorist networks. WSEAS Transactions on Information Science and Applications, 5(6):1011–1020, 2008.
  • [Spa91] Malcolm K Sparrow. The application of network analysis to criminal intelligence: An assessment of the prospects. Soc. Networks, 13(3):251–274, 1991.
  • [SS02] Jordi Sabater and Carles Sierra. Reputation and social network analysis in multi-agent systems. In Proc. 1st International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, pages 475–482, 2002.
  • [VL15] Klaus Von Lampe. Organized crime: analyzing illegal activities, criminal structures, and extra-legal governance. Sage Publications, 2015.
  • [WCZM07] Fei-Yue Wang, Kathleen M Carley, Daniel Zeng, and Wenji Mao. Social computing: From social informatics to social intelligence. IEEE Intelligent systems, 22(2), 2007.
  • [WMRW17] Marcin Waniek, Tomasz P. Michalak, Talal Rahwan, and Michael Wooldridge. On the construction of covert networks. In Proc. 16th Conference on Autonomous Agents and MultiAgent Systems, AAMAS, pages 1341–1349, 2017.
  • [WMWR18] Marcin Waniek, Tomasz P Michalak, Michael J Wooldridge, and Talal Rahwan. Hiding individuals and communities in a social network. Nature Human Behaviour, 2(2):139, 2018.
  • [WS98] Duncan J Watts and Steven H Strogatz. Collective dynamics of ‘small-world’ networks. Nature, 393(6684):440, 1998.
  • [XC05] Jennifer Xu and Hsinchun Chen. Criminal network analysis and visualization. Commun. ACM, 48(6):100–107, 2005.
  • [Yos14] Yuichi Yoshida. Almost linear-time algorithms for adaptive betweenness centrality using hypergraph sketches. In Proc. 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1416–1425, 2014.
  • [ZG09] Elena Zheleva and Lise Getoor. To join or not to join: The illusion of privacy in social networks with mixed public and private user profiles. In Proceedings of the 18th International Conference on World Wide Web, pages 531–540. ACM, 2009.