Learning to branch with Tree MDPs

05/23/2022
by Lara Scavuzzo et al.

State-of-the-art Mixed Integer Linear Program (MILP) solvers combine systematic tree search with a plethora of hard-coded heuristics, such as the branching rule. The idea of learning branching rules from data has received increasing attention recently, and promising results have been obtained by learning fast approximations of the strong branching expert. In this work, we instead propose to learn branching rules from scratch via Reinforcement Learning (RL). We revisit the work of Etheve et al. (2020) and propose tree Markov Decision Processes, or tree MDPs, a generalization of temporal MDPs that provides a more suitable framework for learning to branch. We derive a tree policy gradient theorem, which exhibits a better credit assignment compared to its temporal counterpart. We demonstrate through computational experiments that tree MDPs improve the learning convergence, and offer a promising framework for tackling the learning-to-branch problem in MILPs.
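
The credit-assignment claim can be illustrated with a small sketch. It is not the authors' implementation: the `Node` class, the two helper functions, and the reward of -1 per processed node are assumptions chosen for illustration. Under those assumptions, a temporal return charges a node for every node processed after it, while a tree return charges it only for the nodes in its own subtree.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical branch-and-bound node; `children` are created by branching."""
    name: str
    children: List["Node"] = field(default_factory=list)


def temporal_returns(visit_order: List[Node], reward: float = -1.0) -> Dict[str, float]:
    """Temporal credit: a node is charged for itself and every node processed after it."""
    n = len(visit_order)
    return {node.name: reward * (n - i) for i, node in enumerate(visit_order)}


def tree_returns(root: Node, reward: float = -1.0) -> Dict[str, float]:
    """Tree credit: a node is charged only for the nodes in its own subtree."""
    out: Dict[str, float] = {}

    def visit(node: Node) -> float:
        total = reward + sum(visit(child) for child in node.children)
        out[node.name] = total
        return total

    visit(root)
    return out


# Toy tree: root branches into a and b; a branches into a1 and a2.
a1, a2, b = Node("a1"), Node("a2"), Node("b")
a = Node("a", [a1, a2])
root = Node("root", [a, b])

# Assumed depth-first processing order of the solver.
temporal = temporal_returns([root, a, a1, a2, b])
tree = tree_returns(root)

print(temporal["a"], tree["a"])
```

In this toy tree, node `a` receives a temporal return that also counts node `b`, which the branching decision at `a` cannot influence, whereas its tree return counts only `a`, `a1`, and `a2`.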
