Exploring Partially Observed Networks with Nonparametric Bandits

04/19/2018 ∙ by Kaushalya Madhawa, et al.

Real-world networks such as social and communication networks are too large to be observed entirely. Such networks are often only partially observed, so that the network size, the network topology, and the nodes of the original network are unknown. In this paper we formalize the Adaptive Graph Exploring problem. We assume that we are given an incomplete snapshot of a large network and that additional nodes can be discovered by querying nodes in the currently observed network. The goal is to maximize the number of observed nodes within a given query budget: querying which set of nodes maximizes the size of the observed network? We formulate this as an exploration-exploitation problem and propose a novel nonparametric multi-armed bandit (MAB) algorithm for identifying which nodes to query. Our contributions are: (1) iKNN-UCB, a novel nonparametric MAB algorithm that applies k-nearest neighbor UCB to the setting where the arms are represented in a vector space; (2) a theoretical guarantee that the iKNN-UCB algorithm has sublinear regret; and (3) experiments on synthetic networks and real-world networks from different domains, which show that our method discovers up to 40% more nodes than existing baselines.




1 Introduction

Interactions among entities in many real-world complex systems are often represented by networks, where the entities are represented by nodes and the interactions among them by links. For example, the information contained in online social networks has proved valuable in advertising applications such as finding influential users for targeted marketing. Data acquisition is done using Application Programming Interfaces (APIs) offered by the respective social networking services. Using these APIs is often time consuming, and the number of nodes (e.g., profiles) that can be queried within a given time is restricted. A poorly constructed incomplete network will lead to inaccurate findings. This highlights the importance of acquiring as much information as possible using a limited number of queries.

Here, we provide an overview of the Adaptive Graph Exploration problem; we define it formally in section 3. Suppose we are given a partially observed network, for instance a sample of a social network collected by a researcher. Since we do not know how this sample was obtained, the only way to enhance it is by acquiring data belonging to the unseen portion of the network. We use the term probing to refer to querying a node to retrieve information about it and its neighborhood. As an example, probing a node of a social network corresponds to obtaining information about a profile and its friends (or followers) using an API or a web service. Several rounds of probing update the sample with new nodes and links found in the neighborhood of the queried nodes. The number of times the network can be probed is restricted by a probing budget. Thus, the goal is to enhance the observed graph as much as possible within the probing budget.

Two approaches have been proposed to reduce the incompleteness of partially observed networks. The first approach infers properties of the unseen part of the network using knowledge of the sample. Such methods infer the missing information by fitting a model of network structure to the observed part [kim2011network]. However, this is not practical for real-world networks, as such methods require more structural information about the complete network. The second approach is to acquire more information by probing, as we propose in this paper. Existing heuristic algorithms such as maximum observed degree (MOD) probing and maxreach [soundarajan2016maxreach] require the sample to be obtained in a certain way (e.g., uniform edge sampling). In section 4 we show that existing probing algorithms cannot be generalized to incomplete networks obtained by different sampling techniques. Furthermore, many real-world networks consist of communities, i.e., densely connected regions of nodes. Heuristic probing algorithms get stuck inside communities, making them worse than probing a node at random.

Our Work.

A high level overview of the proposed adaptive probing algorithm is illustrated in Figure 1. The probing pipeline consists of two major steps, obtaining a feature representation of the observed network and a model which predicts the reward a node will reveal (e.g., the true degree of that node) based on its feature vector. The key assumption of using a learning model is that nodes with similar features in the observed network will result in similar rewards. Our choice of graph features is motivated by work on inferring structural role [henderson2012rolx] and social status [zhao2013inferring] of nodes in social networks.

Figure 1: prediction pipeline

One property that makes the estimation of rewards different from a normal prediction problem is that our training data is accumulated over the process of probing. Probing nodes with similar features all the time may lead to sub-optimal results. This situation is known in the reinforcement learning literature as the exploration-exploitation trade-off. Multi-armed bandits [robbins1952some] are a generic way to approach real-world exploration-exploitation problems. In this context, exploitation corresponds to selecting the node which has the largest expected reward, and exploration corresponds to selecting some other node for probing.

Our contributions are threefold:

  1. A generic approach for enhancing partially observed networks which does not require any prior knowledge about the network.

  2. A novel nonparametric UCB algorithm (KNN-UCB) for solving the multi-armed bandit (MAB) problem when the arms are represented in a vector space. Source code is available at https://bitbucket.org/kau_mad/bandits/src/pkdd2018/

  3. Using the KNN-UCB algorithm on synthetic networks and real-world networks from different domains, we demonstrate that our proposed method performs significantly better than existing methods. Source code is available at https://bitbucket.org/kau_mad/net_complete/src/pkdd2018

The rest of the paper is structured as follows. In section 2, we provide an extensive review of related work. Section 3 starts with the problem definition and describes our approach in detail. Section 4 explains the experimental setup and the data sets being used. Then, in section 5, we present empirical evaluations of our bandit algorithm using real-world networks as well as synthetic networks. Finally, section 6 concludes with a brief discussion of the bandit approach and a few promising directions for future work.

2 Related Work

2.1 Network Crawling and Sampling

Although this problem looks similar to network crawling and sampling, the objective of most sampling algorithms is to select a representative subset of the nodes (or edges) when the entire network is accessible [ahmed2014network]. In contrast, we are improving a given incomplete network and we have no knowledge of how the sample was obtained. In particular, snowball sampling [lee2006statistical] can be used when the information about the complete network is not accessible. But it suffers from the same drawback as the heuristic algorithms: it does not adapt as the observed information is updated. As another related problem, link prediction [liben2007link] can predict missing links in a network, but not missing regions of nodes. The only way to enhance the observed sample is by iteratively querying observed nodes and adding their neighboring nodes to the sample.

2.2 Active Search

Active search on graphs [wang2013active, bilgic2010active] is another related problem, with the objective of finding as many target nodes possessing a given property as possible. Most of the previous work on this problem assumes that the complete graph is observable and that any node can be queried to find its label [ma2015active]. If only an incomplete view is available, relying only on the observed information may not obtain the best possible reward. In addition to exploiting the best option according to the available information, other possible options are explored to achieve better rewards. A common approach to balancing the exploration-exploitation trade-off is to formulate it as a multi-armed bandit (MAB) problem [mahajan2008multi]. SN-UCB1 [bnaya2013social] and NETEXP [singla2015information] are such MAB-based active search algorithms proposed for partially observed networks. Probing a node in NETEXP reveals its 2-hop neighborhood, which does not hold for real-world social networks. SN-UCB1 does not provide a significant improvement over the existing heuristic methods. soundarajan2017varepsilon recently proposed ε-WGX, a multi-armed bandit approach to the Active Edge Probing (AEP) problem in incomplete networks. Though AEP looks similar, it is fundamentally different from our problem, as a node can be probed multiple times and only one neighboring edge is revealed in each probe.

3 Proposed Bandit Based Probing Method

We start this section with the formal definition of the problem. Then we describe the main components of this work and the multi-armed bandit algorithm in detail.

3.1 Problem Definition

Suppose there is a large unweighted undirected graph $G$ that cannot be observed fully; only a partially observed network is available. We denote the initial incomplete network as $G_0$. Our goal is to grow this network by probing one of the observed nodes at each time step $t$. Using this notation, we denote the observed network at time $t$ as $G_t$. Table 1 lists the notation used in this section.

Symbol   Definition
$G$      original network
$G_t$    observed network at time $t$
$C_t$    set of candidate nodes at time $t$
$b$      probing budget
Table 1: Table of notations
Definition 1.

Probing a node reveals all links incident to it and the identity of its neighboring nodes.

The number of times we are allowed to probe the network is constrained by the probing budget ($b$).

Figure 2: Example of an incomplete network. The black node is probed and the gray nodes are observed. The white nodes exist in the original network $G$ but are yet to be observed.
Definition 2.

At time $t$, a node in the original network $G$ belongs to one of the following three sets.

  1. unobserved: the existence of these nodes is not visible to the algorithm.

  2. observed: these nodes exist in both $G$ and $G_t$, but have not been probed.

  3. probed: the algorithm knows about these nodes and their neighboring nodes.

Figure 2 illustrates an example of an incomplete network. We use bold lines to denote observed links and dashed lines to denote links that are unobserved at the given moment. Even when two nodes are both observed by probing a common neighbor, a link between them is not observed, because neither of the two has been probed.

An observed node is either probed or not yet probed at a given moment. Any observed node which has not been probed is a candidate for probing; hence, we refer to such nodes as candidate nodes. At the beginning, all the nodes in the given sample are candidate nodes. Probing a candidate node reveals a reward (e.g., the true degree of the node). Our goal is to iteratively select $b$ candidate nodes that maximize the cumulative reward (i.e., the number of observed nodes).
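The probing operation and the candidate set defined above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the adjacency-set representation, the function names `probe` and `candidates`, and the choice of "number of newly observed nodes" as the returned reward are all assumptions made for the example.

```python
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]  # node -> set of neighbors (assumed representation)


def probe(node: Hashable, true_graph: Graph, observed: Graph, probed: Set[Hashable]) -> int:
    """Reveal all links incident to `node`; return the number of newly observed nodes."""
    if node not in observed:
        raise ValueError("only observed nodes can be probed")
    if node in probed:
        raise ValueError("a node cannot be probed a second time")
    new_nodes = 0
    for nbr in true_graph[node]:
        if nbr not in observed:
            observed[nbr] = set()  # neighbor becomes observed (and a candidate)
            new_nodes += 1
        observed[node].add(nbr)
        observed[nbr].add(node)
    probed.add(node)
    return new_nodes


def candidates(observed: Graph, probed: Set[Hashable]) -> Set[Hashable]:
    """Candidate nodes are observed nodes that have not been probed yet."""
    return set(observed) - probed
```

For example, probing a node whose two neighbors are themselves linked reveals both neighbors but not the link between them, which matches the situation described for Figure 2.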

3.2 Calculation of expected reward of candidate nodes

Instead of using a heuristic metric to choose a candidate node for probing in each time step, we treat this problem as a learning problem. Similar to an active exploration algorithm, our proposed solution consists of three high level steps [pfeiffer2014active]: probing, learning, and prediction. Probing a node results in additional information about the observed network. Information about the currently observed network is leveraged to learn a predictive model which predicts the expected reward of a given candidate node in future. Our approach assumes that candidate nodes with similar structural neighborhoods will result in similar rewards.

Suppose that the feature vector of a candidate node $i$ at time $t$ is $x_{i,t}$. The learner probes node $i$ at time $t$ and observes the reward

$$r_{i,t} = f(x_{i,t}) + \eta_t,$$

where $f$ gives the expected reward of a given node and $\eta_t$ is sub-Gaussian white noise with mean 0 and variance $\sigma^2$.

Assumption 1.

(Lipschitz condition): There exists a constant $L > 0$ such that for all $x, x'$, $|f(x) - f(x')| \le L \, D(x, x')$, where $D$ is a metric which defines the "distance" between two vectors $x$ and $x'$.

Assumption 1 expresses that nodes which are similar in terms of their feature vectors will have similar rewards. In the next section, we describe in detail how we formulate this problem as a multi-armed bandit problem.

3.3 Bandit Algorithm

3.3.1 Problem Setting

In the classical contextual multi-armed bandit problem, an agent selects one of the $K$ arms (or actions) at each time step and observes a reward depending on the chosen action. In this setting, the arms are assumed to be independent, and the rewards are drawn randomly from a probability distribution that is specific to each arm. The goal of the agent is to play a sequence of actions which maximizes the cumulative reward it receives within a given number of time steps.

Selecting a node from the set of candidate nodes at time step $t$ for probing is similar to pulling an arm in a multi-armed bandit problem. However, the classical notion of the K-armed bandit problem assumes that the set of arms does not change over time and requires each arm to be played several times. In contrast, the set of candidate nodes changes as probing proceeds. More importantly, a node cannot be probed a second time.

As the independence assumption does not hold in our problem setting, it is more suitable to express it as a structured bandit problem, in which the reward distributions of the arms are not independent but interrelated. In a structured bandit problem, the agent deduces the relationship between arms based on a $d$-dimensional feature vector $x_i$ assigned to each arm $i$.

3.3.2 KNN-UCB algorithm for structured bandits

Linear bandits [rusmevichientong2010linearly, Dani2008], the simplest among such models, assume that the reward depends linearly on the feature vector and compute the expected reward of an arm as the inner product of its feature vector $x_i$ and a parameter vector $\theta$. But real data often exhibit more complicated relationships than a linear one. Hence, we choose $k$-nearest neighbor (k-NN) regression to estimate the expected reward of arms. We adapt guan2018nonparametric's K-armed KNN-UCB algorithm to the structured setting. Upper confidence bound (UCB) algorithms [auer2002using] incorporate an exploration term by calculating a confidence bound for each arm and choose the action corresponding to the largest upper confidence bound.

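We define the $k$-nearest neighbor upper confidence bound (KNN-UCB) rule over the candidate set $C_t$ as

$$i_t = \arg\max_{i \in C_t} \left[ \hat{f}(x_i) + \alpha \, \bar{D}_k(x_i) \right], \tag{1}$$

where $\hat{f}(x_i)$ is the estimated reward of arm $i$, $\bar{D}_k(x_i)$ is its exploration term, and $\alpha$ is a constant determining the amount of exploration.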

Definition 3.

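Let $X$ be the set of feature vectors whose rewards have been observed. Let the $k$-NN radius of $x$ be $r_k(x) = \inf\{\epsilon : |B(x, \epsilon) \cap X| \ge k\}$, where $B(x, \epsilon) = \{x' : D(x, x') \le \epsilon\}$. Let the $k$-NN set of $x$ be $N_k(x) = B(x, r_k(x)) \cap X$. The expected reward of arm $i$, $f(x_i)$, is estimated with weighted $k$-NN regression as

$$\hat{f}(x_i) = \frac{\sum_{x_j \in N_k(x_i)} r_j / D(x_i, x_j)}{\sum_{x_j \in N_k(x_i)} 1 / D(x_i, x_j)}, \tag{2}$$

where $r_j$ is the observed reward for $x_j$ and $D(x_i, x_j)$ is the Euclidean distance between feature vectors $x_i$ and $x_j$.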

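We define $\bar{D}_k(x_i)$ as the average distance to the points in the $k$-neighborhood,

$$\bar{D}_k(x_i) = \frac{1}{k} \sum_{x_j \in N_k(x_i)} D(x_i, x_j). \tag{3}$$

The term $\bar{D}_k(x_i)$ is analogous to the term in classical UCB algorithms that accounts for the number of times an action has been chosen by time $t$: a large average distance means that few similar nodes have been probed so far. The way the network is probed using KNN-UCB is shown in algorithm 1.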
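The scoring step described above can be sketched as follows. This is only an illustration under stated assumptions: the weights are taken to be inverse Euclidean distances (the exact weighting is an assumption), a small constant `eps` is added to avoid division by zero, and the names `knn_estimate` and `knn_ucb_score` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

History = List[Tuple[Sequence[float], float]]  # (feature vector, observed reward) pairs


def knn_estimate(x: Sequence[float], history: History, k: int, eps: float = 1e-9):
    """Return (weighted k-NN reward estimate, average distance to the k nearest points)."""
    if len(history) < k:
        raise ValueError("need at least k observed rewards")
    # k nearest previously probed points, by Euclidean distance
    nearest = sorted((math.dist(x, xj), rj) for xj, rj in history)[:k]
    weights = [1.0 / (dist + eps) for dist, _ in nearest]  # assumed inverse-distance weights
    estimate = sum(w * r for w, (_, r) in zip(weights, nearest)) / sum(weights)
    avg_dist = sum(dist for dist, _ in nearest) / k
    return estimate, avg_dist


def knn_ucb_score(x: Sequence[float], history: History, k: int, alpha: float) -> float:
    """Estimated reward plus alpha times the exploration term (average k-NN distance)."""
    estimate, avg_dist = knn_estimate(x, history, k)
    return estimate + alpha * avg_dist
```

A candidate that lies far from every previously probed node receives a large exploration term, so it can be selected even when its estimated reward is modest.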

Input: incomplete network $G_0$, probing budget $b$, exploration parameter $\alpha$, number of neighbors $k$
Output: a sequence of nodes to probe
Initialize: candidate nodes = nodes of $G_0$
for $t = 1$ to $b$ do
    if fewer than $k$ nodes have been probed then
        sample a node $i$ uniformly from the candidate nodes
    else
        for each node $j$ in the candidate nodes do
            calculate the feature vector $x_j$
            calculate the estimated reward with eq. 2
            calculate the exploration term with eq. 3
        find the node $i$ corresponding to the largest UCB with eq. 1
    probe node $i$ in the original graph $G$ and observe the reward
    add the neighboring nodes of node $i$ to the incomplete network
    remove node $i$ from the candidate nodes
Algorithm 1 KNN-UCB.
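The probing loop can be sketched end to end as below. This is a simplified illustration rather than the authors' code: it uses only two features (observed degree and average observed degree of neighbors), assumes inverse-distance weights, assumes that links among the initially sampled nodes are known, and starts with random probes until `k` rewards are available. The function name `knn_ucb_explore` and all parameter names are hypothetical.

```python
import math
import random


def knn_ucb_explore(true_graph, sample_nodes, budget, k=3, alpha=1.0, seed=0):
    """Probe up to `budget` nodes; return the probing order and the observed graph."""
    rng = random.Random(seed)
    # Assumption: links among the initially sampled nodes are already known.
    observed = {u: {v for v in true_graph[u] if v in sample_nodes} for u in sample_nodes}
    probed, history, order = set(), [], []

    def features(u):
        # Simplified two-dimensional feature vector (assumption for this sketch).
        nbr_deg = [len(observed[v]) for v in observed[u]]
        avg = sum(nbr_deg) / len(nbr_deg) if nbr_deg else 0.0
        return (float(len(observed[u])), avg)

    def score(x):
        nearest = sorted((math.dist(x, xj), rj) for xj, rj in history)[:k]
        w = [1.0 / (dist + 1e-9) for dist, _ in nearest]
        estimate = sum(wi * r for wi, (_, r) in zip(w, nearest)) / sum(w)
        return estimate + alpha * sum(dist for dist, _ in nearest) / k

    for _ in range(budget):
        cand = sorted(set(observed) - probed)
        if not cand:
            break  # nothing left to probe
        if len(history) < k:
            node = rng.choice(cand)  # initial random probes
        else:
            node = max(cand, key=lambda u: score(features(u)))
        x = features(node)
        before = len(observed)
        for v in true_graph[node]:  # probing reveals all incident links
            observed.setdefault(v, set()).add(node)
            observed[node].add(v)
        probed.add(node)
        history.append((x, float(len(observed) - before)))  # reward: newly observed nodes
        order.append(node)
    return order, observed
```

Here the reward is the number of newly observed nodes; another reward, such as the true degree, could be recorded in `history` instead without changing the loop.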

3.3.3 Regret

The objective of a bandit algorithm is to select arms so as to maximize the cumulative reward over time. Minimizing the total regret is an equivalent way of expressing the maximization of cumulative reward. The regret at iteration $t$ equals the difference between the reward of the "optimal" arm and the reward of the arm that was actually pulled. In simple terms, regret is the loss incurred by the policy for not playing the optimal arm every time. In $T$ iterations, we pull arms $x_1, \dots, x_T$ and observe rewards $r_1, \dots, r_T$. We use the following notion of regret
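As a small numerical illustration of the verbal description of regret above, the sketch below accumulates the per-iteration gap between the best available expected reward and the expected reward of the arm that was pulled. The function name `cumulative_regret` and its inputs are assumptions for this example; it is not the formal definition used in the proof.

```python
from itertools import accumulate
from typing import List, Sequence


def cumulative_regret(best_rewards: Sequence[float], chosen_rewards: Sequence[float]) -> List[float]:
    """Running sum of (reward of the optimal arm - reward of the pulled arm)."""
    if len(best_rewards) != len(chosen_rewards):
        raise ValueError("both sequences must cover the same iterations")
    return list(accumulate(b - c for b, c in zip(best_rewards, chosen_rewards)))
```

Regret is sublinear when the last value divided by the number of iterations tends to zero as the number of iterations grows.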

Theorem 3.1.

Let be an arbitrary constant. Then the regret is sublinear with, .


The regret for bandits in a continuous feature space is


Let be

Using Lipschitz assumption


From jiang2017rates,


where is a constant. Using this in eq. 6 results in




Hence, the regret is sub-linear. ∎

Remark 1.

If we select , we can write eq. 5 as


4 Experiments

We construct the feature vector $x_i$ of candidate node $i$ from the following features. For each feature, the local neighborhood of node $i$ in the observed graph is considered.

  1. degree centrality

  2. average degree centrality of its neighbors

  3. median degree centrality of its neighbors

  4. the average percentage of probed neighbors found in the neighborhood

These features are chosen because their effectiveness is shown in previous work on finding structurally similar nodes [henderson2012rolx].
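The four features listed above could be computed from the observed graph roughly as follows. This is a sketch under assumptions: degree centrality is taken as the observed degree divided by (number of observed nodes - 1), and the fourth feature is interpreted as the fraction of a node's observed neighbors that have already been probed. The function name `node_features` is hypothetical.

```python
from statistics import mean, median
from typing import Dict, Hashable, Set, Tuple


def node_features(u: Hashable, observed: Dict[Hashable, Set[Hashable]],
                  probed: Set[Hashable]) -> Tuple[float, float, float, float]:
    """Feature vector of node `u`, computed from the observed graph only."""
    n = len(observed)

    def centrality(v):
        # Degree centrality within the observed graph (assumed normalization).
        return len(observed[v]) / (n - 1) if n > 1 else 0.0

    nbrs = observed[u]
    nbr_centrality = [centrality(v) for v in nbrs]
    return (
        centrality(u),
        mean(nbr_centrality) if nbr_centrality else 0.0,
        median(nbr_centrality) if nbr_centrality else 0.0,
        sum(v in probed for v in nbrs) / len(nbrs) if nbrs else 0.0,
    )
```

Because all four values are derived from the observed graph, the feature vector of a candidate changes as probing reveals more of its neighborhood.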

4.1 Data

We use simulated network data as well as publicly available (http://snap.stanford.edu/data/index.html) real-world data sets of social and information networks.

4.1.1 Synthetic data.

The aim of using synthetic networks is to investigate the behavior of the proposed method on networks with different configurations. We use two random network models, the Barabasi-Albert (BA) model [barabasi1999emergence] and the Lancichinetti-Fortunato-Radicchi (LFR) benchmark [lancichinetti2008benchmark], to create networks with different characteristics. All these networks have the same number of nodes ($n$ = 34,546, the number of nodes in the HepPh citation network). The BA model generates networks with power-law degree distributions. But real-world communication networks possess other properties, such as homophily [mcpherson2001birds], which cannot be represented by a BA model. We use the LFR model to generate networks with community structure. The mixing parameter $\mu$ of the LFR model determines the probability of a node linking to nodes belonging to different communities. Low values of $\mu$ result in dense communities, as the chance of having intra-community links ($1 - \mu$) is higher than the chance of inter-community links ($\mu$). We created LFR benchmark networks with $\mu$ varying in the range [0.1, 0.5] to investigate the impact of the underlying community structure of a network on our method.

4.1.2 Real-world data.

Table 2 gives a summary of the eight real-world network data sets we use. In citation networks, if a paper $i$ cites another paper $j$, the network contains an undirected edge connecting paper $i$ and paper $j$. Similarly, co-authorship networks represent authors as nodes, and two authors are connected if they have published at least one paper together. Nodes of the Enron-email network are email addresses of Enron employees. If user $i$ has sent at least one email to user $j$, nodes $i$ and $j$ are connected by an undirected edge. The Twitter data set is made of 1000 ego-networks consisting of 4,869 Twitter lists [leskovec2012learning]. Epinions and Slashdot can be considered web-of-trust networks. Even though the Epinions and Slashdot networks are often labeled as online social networks, they differ from the usual notion of social networks, as they represent who-trusts-whom data of users instead of the relationships or interactions among users. In these networks, a user tags another user as trustworthy or not. They are sparse compared to online social networks.

                HepPh     HepTh     Epinions  Twitter    Stanford   AstroPh  DBLP       Slashdot
Type            citation  citation  web       social     web        CA       CA         web
Nodes           34,546    27,770    75,789    81,306     281,903    18,772   317,080    82,168
Edges           421,578   352,807   508,837   1,768,149  2,312,497  198,110  1,049,866  549,202
Avg Clustering  0.2848    0.3120    0.1378    0.5653     0.5976     0.6306   0.6324     0.0603
Table 2: Description of data sets. (CA = co-authorship)

4.2 Impact of Initial Sampling Method

To investigate how the sampling method used to acquire the initial sample influences the probing methods, we generate graph samples using two sampling methods:

  1. Random node sampling (RN): At each step we choose one neighbor of a node already in the sample.

  2. Breadth-first search (BFS): Nodes are added to the sample in the order they are observed.
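As an illustration of how an initial BFS sample might be drawn, the sketch below collects nodes in the order a breadth-first search observes them until a target size is reached. The function name `bfs_sample`, the fixed start node, and the sorted neighbor order (used only to make the output deterministic) are assumptions for this example.

```python
from collections import deque
from typing import Dict, Hashable, List, Set


def bfs_sample(graph: Dict[Hashable, Set[Hashable]], start: Hashable, target_size: int) -> List[Hashable]:
    """Return up to `target_size` nodes in the order BFS observes them from `start`."""
    sample, seen, queue = [start], {start}, deque([start])
    while queue and len(sample) < target_size:
        u = queue.popleft()
        for v in sorted(graph[u]):  # sorted only for reproducibility
            if v not in seen:
                seen.add(v)
                sample.append(v)
                queue.append(v)
                if len(sample) == target_size:
                    break
    return sample
```

In the experiments the target size would correspond to a fraction of the original network, for example 5% of its nodes.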

4.3 Methods

We compare the performance of our algorithm against the following algorithms.

4.3.1 Algorithms that do not use node features

  • Random walk (RW). In this trivial baseline, we select one of the candidate nodes randomly for probing. This is equivalent to running our Bandit Explorer algorithm with only one cluster and using the random strategy for node selection.

  • Maximum observed degree (MOD). This greedy method proposed in [avrachenkov2014pay] is the current state-of-the-art algorithm for finding the network cover in an online manner.
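The two feature-free baselines can be sketched as simple selection rules over the candidate set. The function names `random_select` and `mod_select`, and the adjacency-set representation of the observed graph, are assumptions for this illustration.

```python
import random
from typing import Dict, Hashable, Set


def random_select(observed: Dict[Hashable, Set[Hashable]], probed: Set[Hashable],
                  rng: random.Random) -> Hashable:
    """Pick a candidate (observed but not probed) node uniformly at random."""
    cand = [u for u in observed if u not in probed]
    if not cand:
        raise ValueError("no candidate nodes left")
    return rng.choice(cand)


def mod_select(observed: Dict[Hashable, Set[Hashable]], probed: Set[Hashable]) -> Hashable:
    """Pick the candidate node with the maximum observed degree."""
    cand = [u for u in observed if u not in probed]
    if not cand:
        raise ValueError("no candidate nodes left")
    return max(cand, key=lambda u: len(observed[u]))
```

Neither rule learns from the rewards of earlier probes, which is the main difference from the feature-based methods.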

4.3.2 Algorithms that use node features

  • Lin-UCB. This applies the UCB algorithm by Dani2008 assuming that the reward of an arm is linearly dependent on its feature vector.

  • KNN-greedy. This algorithm chooses the arm corresponding to the largest expected reward calculated by the k-NN model.

  • KNN-ε-greedy. This algorithm chooses a random arm with probability ε and otherwise selects the arm with the largest expected reward under k-NN regression.

  • KNN-UCB. This is our proposed algorithm, algorithm 1.
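The ε-greedy selection rule used by the KNN-ε-greedy baseline can be sketched as below. The sketch assumes the k-NN reward estimates have already been computed and are passed in as a dictionary; the function name `epsilon_greedy_select` is hypothetical.

```python
import random
from typing import Dict, Hashable


def epsilon_greedy_select(estimates: Dict[Hashable, float], epsilon: float,
                          rng: random.Random) -> Hashable:
    """With probability epsilon pick a random arm, otherwise the arm with the largest estimate."""
    arms = list(estimates)
    if not arms:
        raise ValueError("no arms to choose from")
    if rng.random() < epsilon:
        return rng.choice(arms)  # explore
    return max(arms, key=estimates.get)  # exploit
```

Setting `epsilon` to 0 recovers the KNN-greedy baseline, since the arm with the largest estimate is then always chosen.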

5 Results

5.1 Analysis on Synthetic Networks

We probe incomplete BA and LFR networks obtained by RN and BFS sampling for 1,000 iterations ($b$ = 1,000). The number of nodes observed in the BA network is shown in Figure 3. For all networks generated by the Barabasi-Albert (BA) model, MOD observes more nodes than the bandit algorithms. This confirms avrachenkov2014pay's claim that MOD probing can achieve the best connected network cover for networks generated by preferential attachment processes.

Figure 3: Scale-free network created by Barabasi-Albert model. (nodes=50,000, m = 20) (a) random node (RN) sample (b) BFS sample
Figure 4: Performance on synthetic networks generated by LFR benchmark (a) RN sample (b) BFS sample

To understand how the existence of community structure impacts probing, we evaluate the performance of all algorithms on synthetic networks generated by different configurations of the LFR benchmark model [lancichinetti2008benchmark]. We vary the mixing parameter $\mu$ from 0.1 to 0.5, keeping all other parameters of the model constant (including average degree = 25). KNN-UCB significantly outperforms the baselines for networks with smaller $\mu$. When the initial sample is obtained by BFS sampling, KNN-UCB outperforms all baselines by a significant margin. The gap between KNN-UCB and the baselines is larger when the mixing parameter $\mu$ is small, i.e., when the network has significant community structure. The experimental results on synthetic networks suggest that the KNN-UCB algorithm can adapt to incomplete networks obtained by different sampling techniques and to networks with structural properties such as community structure.

5.2 Results on Real World Networks

We use the 8 real-world networks listed in Table 2 and generate RN and BFS samples containing 5% of the nodes of the original network $G$. Then 1,000 probing steps are performed. We perform each experiment five times, initialized with different random seeds, and report the average number of additional nodes observed in Figure 5 and Figure 6.

Figure 5: Comparison against baselines: 1000 probes run on 5% nodes of each network. Each sample is created by performing a random walk on the original network
Figure 6: Comparison against baselines: 1000 probes run on 5% nodes of each network. Each sample is created by performing a breadth first walk on the original network

The KNN-UCB and Lin-UCB bandit algorithms outperform all baseline methods in networks generated by both RN and BFS sampling. Even though the Lin-UCB bandit algorithm observes as many nodes as KNN-UCB for RN samples, its performance is worse for BFS samples. This shows that the linear model in Lin-UCB is not capable of learning the relationship between observed node features and the true degree of a node if the sample is constructed by BFS.

6 Conclusions

In this paper, we introduced a bandit-based exploration algorithm for partially observed, incomplete networks. We proposed a novel nonparametric multi-armed bandit algorithm, KNN-UCB, with sublinear regret. Compared to existing solutions for the Adaptive Graph Exploring problem, the proposed method does not depend on a specific heuristic. Additionally, the KNN-UCB bandit algorithm outperforms the baseline methods irrespective of how the initial incomplete network is obtained. We provided experimental evidence for our approach using synthetic networks and a variety of real-world networks. Using different configurations of LFR benchmark networks, we observed that our algorithm outperforms all other baselines significantly when the network exhibits prominent community structure. Since the reward function is independent of the probing procedure, it is easy to define a new reward function to solve a different graph exploration problem (e.g., finding a particular type of node).

In this problem, we assumed that probing a node would reveal all its neighboring nodes. However, in some real-world scenarios, only a certain number of neighbors is revealed (e.g., the follower limit in the Twitter API, https://dev.twitter.com/rest/reference/get/followers/ids). As future work, we would explore how the current approach can be adapted to such different settings of the same problem.