Data Summarization at Scale: A Two-Stage Submodular Approach

06/07/2018
by Marko Mitrovic, et al.

The sheer scale of modern datasets has created a pressing need for summarization techniques that identify representative elements in a dataset. Fortunately, the vast majority of data summarization tasks satisfy an intuitive diminishing-returns condition known as submodularity, which allows us to find near-optimal solutions in linear time. We focus on a two-stage submodular framework, where the goal is to use given training functions to reduce the ground set so that optimizing new functions (drawn from the same distribution) over the reduced set provides almost as much value as optimizing them over the entire ground set. In this paper, we develop the first streaming and distributed solutions to this problem. In addition to providing strong theoretical guarantees, we demonstrate both the utility and efficiency of our algorithms on real-world tasks, including image summarization and ride-share optimization.
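To make the two ideas in the abstract concrete, here is a minimal Python sketch. It is not the paper's streaming or distributed algorithm. It only illustrates (a) the diminishing-returns property on a toy coverage function and (b) how a reduced ground set can be scored against training functions. The helper names (`coverage`, `marginal_gain`, `best_value`, `two_stage_value`), the toy data, and the cardinality budget `k` are all assumptions made for illustration. The scoring averages, over the training functions, the best value achievable with at most `k` items from the reduced set, which is how the two-stage objective is commonly formulated. The inner maximization is done by brute force, which is only feasible at toy scale.

```python
from itertools import combinations


def coverage(sets_by_item):
    """Build a coverage function: f(S) = number of distinct labels covered by S.

    Coverage functions are a standard example of monotone submodular functions.
    """
    def f(selected):
        covered = set()
        for item in selected:
            covered |= sets_by_item[item]
        return len(covered)
    return f


def marginal_gain(f, item, base):
    """Gain from adding `item` to the already-selected list `base`."""
    return f(list(base) + [item]) - f(list(base))


def best_value(f, candidates, k):
    """Brute-force maximum of f over subsets of `candidates` with at most k items."""
    candidates = list(candidates)
    best = f([])
    for size in range(1, min(k, len(candidates)) + 1):
        for subset in combinations(candidates, size):
            best = max(best, f(subset))
    return best


def two_stage_value(train_fns, reduced_set, k):
    """Average, over the training functions, of the best size-k value inside reduced_set."""
    return sum(best_value(f, reduced_set, k) for f in train_fns) / len(train_fns)


# Hypothetical toy data: four items, two training functions.
f1 = coverage({0: {"a", "b"}, 1: {"b", "c"}, 2: {"d"}, 3: {"a"}})
f2 = coverage({0: {"x"}, 1: {"x", "y"}, 2: {"y", "z"}, 3: {"z"}})
ground_set = [0, 1, 2, 3]
reduced_set = [1, 2]  # a candidate reduced ground set of size 2

# Diminishing returns: item 1 helps less once item 0 is already selected.
gain_on_empty = marginal_gain(f1, 1, [])
gain_after_0 = marginal_gain(f1, 1, [0])

# Compare optimizing over the full ground set with optimizing over the reduced set.
full_value = two_stage_value([f1, f2], ground_set, k=1)
reduced_value = two_stage_value([f1, f2], reduced_set, k=1)
```

In this toy instance `gain_on_empty` is at least `gain_after_0`, and the reduced set `[1, 2]` attains the same average value as the full ground set for `k=1`, which is the kind of outcome the two-stage framework aims for.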

research
09/11/2023

Data Summarization beyond Monotonicity: Non-monotone Two-Stage Submodular Maximization

The objective of a two-stage submodular maximization problem is to reduc...
research
02/26/2019

A Memoization Framework for Scaling Submodular Optimization to Large Scale Problems

We are motivated by large scale submodular optimization problems, where ...
research
10/16/2020

Deep Submodular Networks for Extractive Data Summarization

Deep Models are increasingly becoming prevalent in summarization problem...
research
10/16/2012

Learning Mixtures of Submodular Shells with Application to Document Summarization

We introduce a method to learn a mixture of submodular "shells" in a lar...
research
02/14/2018

Distributionally Robust Submodular Maximization

Submodular functions have applications throughout machine learning, but ...
research
02/26/2018

Submodularity on Hypergraphs: From Sets to Sequences

In a nutshell, submodular functions encode an intuitive notion of dimini...
research
05/11/2013

Learning Policies for Contextual Submodular Prediction

Many prediction domains, such as ad placement, recommendation, trajector...
