Generalization in portfolio-based algorithm selection

12/24/2020
by Maria-Florina Balcan, et al.

Portfolio-based algorithm selection has seen tremendous practical success over the past two decades. This algorithm configuration procedure works by first selecting a portfolio of diverse algorithm parameter settings, and then, on a given problem instance, using an algorithm selector to choose a parameter setting from the portfolio with strong predicted performance. Oftentimes, both the portfolio and the algorithm selector are chosen using a training set of typical problem instances from the application domain at hand. In this paper, we provide the first provable guarantees for portfolio-based algorithm selection. We analyze how large the training set should be to ensure that the resulting algorithm selector's average performance over the training set is close to its future (expected) performance. This involves analyzing three key reasons why these two quantities may diverge: 1) the learning-theoretic complexity of the algorithm selector, 2) the size of the portfolio, and 3) the learning-theoretic complexity of the algorithm's performance as a function of its parameters. We introduce an end-to-end learning-theoretic analysis of the portfolio construction and algorithm selection together. We prove that if the portfolio is large, overfitting is inevitable, even with an extremely simple algorithm selector. With experiments, we illustrate a tradeoff exposed by our theoretical analysis: as we increase the portfolio size, we can hope to include a well-suited parameter setting for every possible problem instance, but it becomes impossible to avoid overfitting.
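The tradeoff described above can be illustrated with a toy sketch. This is not the paper's construction or experimental setup; it is a minimal example under stated assumptions. Each problem instance is a hypothetical pair `(feature, target)`, a parameter setting `rho` earns utility `1 - |rho - target|`, and the target is drawn independently of the feature, so no selector can truly predict the best setting. The names `performance`, `fit`, and `train_and_test_average` are invented for illustration. The selector is a simple threshold rule on the feature, and the portfolio holds one parameter setting per threshold interval.

```python
import bisect
import random
import statistics


def performance(rho, instance):
    # Hypothetical utility in [0, 1]; higher is better.
    _, target = instance
    return 1.0 - abs(rho - target)


def sample_instances(n, rng):
    # Assumption: the target is independent of the feature, so any
    # training gain from a larger portfolio is pure overfitting.
    return [(rng.random(), rng.random()) for _ in range(n)]


def fit(train, k):
    """Build a size-k portfolio and a threshold selector from the training set."""
    ordered = sorted(train)  # sort by feature
    n = len(ordered)
    portfolio, thresholds = [], []
    for i in range(k):
        chunk = ordered[i * n // k:(i + 1) * n // k]
        # Best single setting for this chunk under absolute-error utility.
        portfolio.append(statistics.median(t for _, t in chunk))
        thresholds.append(chunk[-1][0])
    thresholds.pop()  # the last interval is open-ended

    def selector(instance):
        # Map the instance's feature to its interval, then to a setting.
        return portfolio[bisect.bisect_left(thresholds, instance[0])]

    return portfolio, selector


def average_performance(selector, instances):
    return sum(performance(selector(z), z) for z in instances) / len(instances)


def train_and_test_average(k, n_train=40, n_test=2000, seed=0):
    rng = random.Random(seed)
    train = sample_instances(n_train, rng)
    test = sample_instances(n_test, rng)
    _, selector = fit(train, k)
    return average_performance(selector, train), average_performance(selector, test)


if __name__ == "__main__":
    for k in (1, 5, 40):
        train_avg, test_avg = train_and_test_average(k)
        print(f"portfolio size {k:2d}: train={train_avg:.3f} test={test_avg:.3f}")
```

In this sketch, a portfolio as large as the training set fits every training instance exactly, yet its performance on fresh instances does not improve, which mirrors the gap between training and expected performance that the analysis aims to bound.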


