Tight Bound of Incremental Cover Trees for Dynamic Diversification

06/15/2018
by Hannah Marienwald, et al.

Dynamic diversification, the task of finding a set of data points with maximum diversity from a time-dependent sample pool, is important in recommender systems, web search, database search, and notification services, where it keeps users from being shown duplicate or very similar items. The incremental cover tree (ICT), which offers high computational efficiency and flexibility, has been applied to this task and has shown good performance. Specifically, it was empirically observed that ICT typically returns a set whose diversity is only marginally worse (∼1/1.2 times) than that of the greedy max-min (GMM) algorithm, the state-of-the-art method for static diversification, whose performance bound is optimal for any polynomial-time algorithm. Nevertheless, the known performance bound for ICT is 4 times worse than this optimal bound. In this paper, we aim to close this gap between theory and empirical observations. To this end, we first analyze variants of ICT methods and derive tighter performance bounds. We then investigate the gap between the obtained bound and empirical observations using specially designed artificial data for which the optimal diversity is known. Finally, we analyze the tightness of the bound and show that it cannot be further improved, i.e., this paper provides the tightest possible bound for ICT methods. In addition, we demonstrate a new use of dynamic diversification for generative image samplers, where prototypes are incrementally collected from a stream of artificial images generated by an image sampler.
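For readers unfamiliar with the static baseline named in the abstract, the following is a minimal sketch of greedy max-min selection as it is commonly described: repeatedly add the pool point that lies farthest from the points already chosen. It is not taken from the paper. The function names, the Euclidean metric, and the choice to seed with the first pool element are all assumptions made for illustration, and it does not implement ICT.

```python
import math


def min_pairwise_distance(points):
    """Max-min diversity of a set: the smallest distance between any two members."""
    return min(
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    )


def greedy_max_min(pool, k):
    """Greedily pick k points, each time adding the point farthest from the chosen set.

    Assumptions (not from the paper): Euclidean distance, and the first
    element of the pool is used as an arbitrary seed.
    """
    if not 1 <= k <= len(pool):
        raise ValueError("k must be between 1 and the pool size")

    selected = [pool[0]]
    # nearest[i] holds the distance from pool[i] to its closest selected point.
    nearest = [math.dist(p, pool[0]) for p in pool]

    while len(selected) < k:
        # The next pick is the point whose closest selected neighbor is farthest away.
        idx = max(range(len(pool)), key=nearest.__getitem__)
        selected.append(pool[idx])
        # Refresh the nearest-selected distances with the newly added point.
        nearest = [min(d, math.dist(p, pool[idx])) for d, p in zip(nearest, pool)]

    return selected


pool = [(0, 0), (1, 0), (10, 0), (0, 10), (0.5, 0.5)]
chosen = greedy_max_min(pool, 3)
print(chosen, min_pairwise_distance(chosen))
```

Each round costs one pass over the pool, so the sketch needs the whole pool up front. That is the static setting; the dynamic setting discussed in the abstract instead has to cope with a pool that changes over time.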

