Novel Dense Subgraph Discovery Primitives: Risk Aversion and Exclusion Queries

In the densest subgraph problem, given a weighted undirected graph G(V,E,w), with non-negative edge weights, we are asked to find a subset of nodes S⊆ V that maximizes the degree density w(S)/|S|, where w(S) is the sum of the edge weights induced by S. This problem is a well studied problem, known as the densest subgraph problem, and is solvable in polynomial time. But what happens when the edge weights are negative? Is the problem still solvable in polynomial time? Also, why should we care about the densest subgraph problem in the presence of negative weights? In this work we answer the aforementioned question. Specifically, we provide two novel graph mining primitives that are applicable to a wide variety of applications. Our primitives can be used to answer questions such as "how can we find a dense subgraph in Twitter with lots of replies and mentions but no follows?", "how do we extract a dense subgraph with high expected reward and low risk from an uncertain graph"? We formulate both problems mathematically as special instances of dense subgraph discovery in graphs with negative weights. We study the hardness of the problem, and we prove that the problem in general is NP-hard. We design an efficient approximation algorithm that works well in the presence of small negative weights, and also an effective heuristic for the more general case. Finally, we perform experiments on various real-world uncertain graphs, and a crawled Twitter multilayer graph that verify the value of the proposed primitives, and the practical value of our proposed algorithms. The code and the data are available at <https://github.com/negativedsd>.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset