Near-Optimal Data Source Selection for Bayesian Learning

11/21/2020
by Lintao Ye, et al.

We study a fundamental problem in Bayesian learning: selecting a minimum-cost set of data sources such that the data streams they provide achieve a prescribed learning performance. First, we show that this data source selection problem is NP-hard. We then show that it can be transformed into an instance of the submodular set covering problem studied in the literature, and we provide a standard greedy algorithm that solves it with provable performance guarantees. Next, we propose a fast greedy algorithm that reduces the running time of the standard greedy algorithm while achieving comparable performance guarantees. We provide insights into the performance guarantees of both greedy algorithms by analyzing special classes of the problem. Finally, we validate the theoretical results using numerical examples, and show that the greedy algorithms work well in practice.
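To make the submodular set covering view concrete, here is a minimal Python sketch of the standard cost-benefit greedy rule for that problem class. It is an illustration under stated assumptions, not the paper's algorithm: the function name `greedy_submodular_cover`, the toy coverage objective `f`, and the source names and costs are all hypothetical, and the coverage count only stands in for the Bayesian learning performance measure, which the abstract does not specify.

```python
from typing import Callable, Dict, FrozenSet, Set


def greedy_submodular_cover(
    costs: Dict[str, float],
    f: Callable[[FrozenSet[str]], float],
    target: float,
    tol: float = 1e-12,
) -> Set[str]:
    """Greedy heuristic for submodular set cover (illustrative sketch).

    Repeatedly adds the source with the largest truncated marginal gain
    per unit cost until f(selected) reaches `target`.
    Assumes f is monotone and submodular and all costs are positive.
    """
    selected: Set[str] = set()
    current = min(f(frozenset(selected)), target)
    while current < target - tol:
        best, best_ratio = None, 0.0
        for source, cost in costs.items():
            if source in selected:
                continue
            # Truncate at the target so overshooting earns no extra credit.
            gain = min(f(frozenset(selected | {source})), target) - current
            ratio = gain / cost
            if ratio > best_ratio:
                best, best_ratio = source, ratio
        if best is None:
            raise ValueError("target cannot be reached with the given sources")
        selected.add(best)
        current = min(f(frozenset(selected)), target)
    return selected


# Hypothetical toy instance: each source "covers" some items, and the
# objective counts covered items (a monotone submodular function).
coverage = {"A": {1, 2, 3}, "B": {1, 2}, "C": {3, 4}, "D": {4}}
costs = {"A": 3.0, "B": 1.0, "C": 1.0, "D": 1.0}


def f(subset: FrozenSet[str]) -> float:
    covered = set()
    for source in subset:
        covered |= coverage[source]
    return float(len(covered))


chosen = greedy_submodular_cover(costs, f, target=4.0)
total_cost = sum(costs[s] for s in chosen)
```

On this toy instance the rule picks the two cheap sources `B` and `C`, which together cover all four items at total cost 2. Each round of the sketch re-evaluates every remaining source, which is the kind of per-iteration cost a faster greedy variant would aim to reduce.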

research
03/08/2017

Scalable Greedy Feature Selection via Weak Submodularity

Greedy algorithms are widely used for problems in machine learning such ...
research
03/09/2020

Distributed Submodular Maximization with Parallel Execution

The submodular maximization problem is widely applicable in many enginee...
research
02/19/2011

Submodular meets Spectral: Greedy Algorithms for Subset Selection, Sparse Approximation and Dictionary Selection

We study the problem of selecting a subset of k random variables from a ...
research
06/08/2019

apricot: Submodular selection for data summarization in Python

We present apricot, an open source Python package for selecting represen...
research
04/07/2020

The Impact of Message Passing in Agent-Based Submodular Maximization

Submodular maximization problems are a relevant model set for many real-...
research
11/29/2017

Near-optimal irrevocable sample selection for periodic data streams with applications to marine robotics

We consider the task of monitoring spatiotemporal phenomena in real-time...
research
02/06/2018

How to select the best set of ads: Can we do better than Greedy Algorithm?

Selecting the best set of ads is critical for advertisers for a given se...
