Understanding the Limitations of Network Online Learning

01/09/2020
by Timothy LaRock, et al.

Studies of networked phenomena, such as interactions in online social media, often rely on incomplete data, either because these phenomena are only partially observed or because the data is too large or expensive to acquire all at once. Analysis of incomplete data can lead to skewed or misleading results. In this paper, we investigate the limitations of learning to complete partially observed networks via node querying. Concretely, we study the following problem: given (i) a partially observed network, (ii) the ability to query nodes for their connections (e.g., by accessing an API), and (iii) a budget on the number of such queries, sequentially learn which nodes to query in order to maximally increase observability. We call this querying process Network Online Learning and present a family of algorithms called NOL*. These algorithms learn to choose which partially observed node to query next based on a parameterized model that is trained online through a process of exploration and exploitation. Extensive experiments on both synthetic and real-world networks show that (i) it is possible to sequentially learn to choose which nodes are best to query in a network and (ii) some macroscopic properties of networks, such as the degree distribution and modular structure, impact the potential for learning and the optimal amount of random exploration.
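
The querying process described in the abstract can be illustrated with a minimal sketch. It is not the NOL* implementation: it assumes an epsilon-greedy policy for exploration, a linear model updated by one online gradient step per query, and a reward equal to the number of newly observed nodes. The names `query_loop`, `features`, `predict`, the two node features, and the toy graph are all hypothetical choices made for illustration.

```python
import random


def features(node, observed):
    """Hypothetical features: a bias term and the normalized observed degree."""
    max_deg = max(len(nbrs) for nbrs in observed.values()) or 1
    return [1.0, len(observed[node]) / max_deg]


def predict(theta, x):
    """Linear prediction of the reward for querying a node."""
    return sum(t * xi for t, xi in zip(theta, x))


def query_loop(true_graph, seed_nodes, budget, epsilon=0.1, lr=0.05, rng=None):
    """Spend up to `budget` queries growing a partially observed network.

    `true_graph` stands in for the API: looking up true_graph[node] plays
    the role of querying that node for its connections (an assumption).
    """
    rng = rng or random.Random(0)
    observed = {}   # known node -> set of known neighbors
    queried = set()

    def reveal(node):
        observed.setdefault(node, set())
        for nbr in true_graph[node]:
            observed[node].add(nbr)
            observed.setdefault(nbr, set()).add(node)
        queried.add(node)

    # Initial partial observation: the seed nodes are fully revealed.
    for seed in seed_nodes:
        reveal(seed)

    theta = [0.0, 0.0]  # model parameters, learned online
    rewards = []
    for _ in range(budget):
        candidates = sorted(n for n in observed if n not in queried)
        if not candidates:
            break
        if rng.random() < epsilon:
            node = rng.choice(candidates)  # explore: random candidate
        else:
            # exploit: candidate with the highest predicted reward
            node = max(candidates, key=lambda n: predict(theta, features(n, observed)))
        x = features(node, observed)
        before = len(observed)
        reveal(node)
        reward = len(observed) - before  # newly observed nodes
        error = reward - predict(theta, x)
        # One gradient step on the squared prediction error.
        theta = [t + lr * error * xi for t, xi in zip(theta, x)]
        rewards.append(reward)
    return observed, queried, theta, rewards


# Toy example: node 0 is the seed, and three queries are allowed.
toy_graph = {0: {1, 2, 3}, 1: {0, 4}, 2: {0}, 3: {0, 5}, 4: {1}, 5: {3}}
observed, queried, theta, rewards = query_loop(toy_graph, [0], budget=3)
```

In this sketch, `epsilon` controls the amount of random exploration, which the abstract identifies as a quantity whose optimal value depends on network structure.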

