Co-training for Policy Learning

07/03/2019
by Jialin Song, et al.

We study the problem of learning sequential decision-making policies in settings with multiple state-action representations. Such settings naturally arise in many domains, such as planning (e.g., multiple integer programming formulations) and various combinatorial optimization problems (e.g., those with both integer programming and graph-based formulations). Inspired by the classical co-training framework for classification, we study the problem of co-training for policy learning. We present sufficient conditions under which learning from two views can improve upon learning from a single view alone. Motivated by these theoretical insights, we present a meta-algorithm for co-training for sequential decision making. Our framework is compatible with both reinforcement learning and imitation learning. We validate the effectiveness of our approach across a wide range of tasks, including discrete/continuous control and combinatorial optimization.
