Submodular Optimization Over Streams with Inhomogeneous Decays

11/14/2018
by Junzhou Zhao, et al.

Cardinality-constrained submodular function maximization, which aims to select a subset of size at most k that maximizes a monotone submodular utility function, is central to many data mining and machine learning applications such as data summarization and maximum coverage problems. When data is given as a stream, streaming submodular optimization (SSO) techniques are desired. Existing SSO techniques apply only to insertion-only streams, where each element has an infinite lifespan, and to sliding-window streams, where each element has the same lifespan (i.e., the window size). However, elements in some data streams may have arbitrarily different lifespans, and this requires addressing SSO over streams with inhomogeneous decays (SSO-ID). This work formulates the SSO-ID problem and presents three algorithms: BasicStreaming is a basic streaming algorithm that achieves a (1/2-ϵ) approximation factor; HistApprox improves the efficiency significantly and achieves a (1/3-ϵ) approximation factor; HistStreaming is a streaming version of HistApprox and uses heuristics to further improve the efficiency. Experiments conducted on real data demonstrate that HistStreaming can find high-quality solutions and is up to two orders of magnitude faster than the naive Greedy algorithm.
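
To make the problem setting concrete, the following is a minimal Python sketch of the naive baseline mentioned above: a greedy selection, recomputed from scratch at a query time, over the elements that are still alive. It is not BasicStreaming, HistApprox, or HistStreaming, whose details are not given in this abstract. The utility is maximum coverage, and the lifespan model (an element arriving at time a with lifespan l is alive for a <= t < a + l) as well as the names `greedy_max_coverage` and `live_elements` are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def greedy_max_coverage(sets: Dict[str, Set[int]], k: int) -> Tuple[List[str], int]:
    """Pick at most k sets greedily by marginal coverage gain.

    Coverage is a monotone submodular utility, so this is the classic
    greedy heuristic for the cardinality-constrained problem.
    """
    selected: List[str] = []
    covered: Set[int] = set()
    for _ in range(k):
        best, best_gain = None, 0
        # Sorted iteration gives a deterministic tie-break (first name wins).
        for name in sorted(sets):
            if name in selected:
                continue
            gain = len(sets[name] - covered)
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:  # no remaining set adds any new item
            break
        selected.append(best)
        covered |= sets[best]
    return selected, len(covered)


def live_elements(
    sets: Dict[str, Set[int]],
    arrivals: Dict[str, Tuple[int, int]],
    t: int,
) -> Dict[str, Set[int]]:
    """Keep only elements alive at time t.

    Assumed model: arrivals[name] = (arrival_time, lifespan), and the
    element is alive while arrival_time <= t < arrival_time + lifespan.
    """
    return {
        name: items
        for name, items in sets.items()
        if arrivals[name][0] <= t < arrivals[name][0] + arrivals[name][1]
    }


# Hypothetical toy data: four sets with different (inhomogeneous) lifespans.
SETS = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}
ARRIVALS = {"a": (0, 2), "b": (0, 10), "c": (1, 1), "d": (1, 5)}

if __name__ == "__main__":
    # Ignoring lifespans: greedy over all four sets.
    print(greedy_max_coverage(SETS, k=2))
    # At time t=2, "a" and "c" have expired, so only "b" and "d" are candidates.
    print(greedy_max_coverage(live_elements(SETS, ARRIVALS, t=2), k=2))
```

Rerunning the greedy selection at every time step is what makes this baseline expensive on long streams, which is the cost the streaming algorithms in the abstract aim to avoid.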


research
02/20/2018

Do Less, Get More: Streaming Submodular Maximization with Subsampling

In this paper, we develop the first one-pass streaming algorithm for sub...
research
08/06/2018

Beyond 1/2-Approximation for Submodular Maximization on Massive Data Streams

Many tasks in machine learning and data mining, such as data diversifica...
research
10/09/2020

Streaming Submodular Maximization with Fairness Constraints

We study the problem of extracting a small subset of representative item...
research
11/07/2017

Streaming Robust Submodular Maximization: A Partitioned Thresholding Approach

We study the classical problem of maximizing a monotone submodular funct...
research
03/20/2019

Distributed Maximization of "Submodular plus Diversity" Functions for Multi-label Feature Selection on Huge Datasets

There are many problems in machine learning and data mining which are eq...
research
02/13/2023

Maximum Coverage in Sublinear Space, Faster

Given a collection of m sets from a universe 𝒰, the Maximum Set Coverage...
research
07/21/2018

Streaming Methods for Restricted Strongly Convex Functions with Applications to Prototype Selection

In this paper, we show that if the optimization function is restricted-s...
