An Improved Reinforcement Learning Algorithm for Learning to Branch

01/17/2022
by Qingyu Qu, et al.

Most combinatorial optimization problems can be formulated as mixed-integer linear programs (MILPs), for which branch-and-bound (B&B) is a general and widely used solution method. Recently, learning to branch has become an active research topic at the intersection of machine learning and combinatorial optimization. In this paper, we propose a novel reinforcement learning-based B&B algorithm. As in offline reinforcement learning, we initially train on demonstration data to substantially accelerate learning. As training progresses, the agent gradually begins to interact with the environment using its learned policy. The mixing ratio between demonstration data and self-generated data is critical to the performance of the algorithm, so we propose a prioritized storage mechanism that controls this ratio automatically. To improve the robustness of the training process, we additionally introduce a superior network on top of Double DQN, which always serves as a Q-network with competitive performance. We evaluate the proposed algorithm on three public research benchmarks and compare it against strong baselines, including three classical heuristics and one state-of-the-art imitation learning-based branching algorithm. The results show that the proposed algorithm achieves the best performance among the compared algorithms and has the potential to continuously improve the performance of B&B.
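For readers unfamiliar with Double DQN, which the abstract names as the basis for the superior network, the following is a minimal sketch of the standard Double DQN bootstrap target for a single transition. It is not the authors' implementation: the function name `double_dqn_target`, its parameters, and the example numbers are assumptions made for illustration, and neither the prioritized storage mechanism nor the superior network is modeled here.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Return the standard Double DQN target for one transition.

    All names are hypothetical. next_q_online and next_q_target hold the
    Q-values of every candidate action in the next state, as estimated by
    the online network and the target network respectively.
    """
    if done:
        # Terminal transition: nothing to bootstrap from.
        return reward
    # The online network selects the greedy action ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... and the target network evaluates that action.
    return reward + gamma * next_q_target[best_action]


# Illustrative numbers only (not taken from the paper).
target = double_dqn_target(
    reward=-1.0,
    next_q_online=[0.2, 0.9, 0.5],
    next_q_target=[0.25, 0.5, 1.0],
    gamma=0.5,
)
print(target)
```

Separating action selection (online network) from action evaluation (target network) is what distinguishes Double DQN from plain DQN, which takes the maximum over the target network's own estimates and tends to overestimate Q-values.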

