Reward Potentials for Planning with Learned Neural Network Transition Models

04/19/2019
by Buser Say, et al.

Optimal planning with respect to learned neural network (NN) models in continuous action and state spaces using mixed-integer linear programming (MILP) is a challenging task for branch-and-bound solvers due to the poor linear relaxation of the underlying MILP model. For a given set of features, potential heuristics provide an efficient framework for computing bounds on cost (reward) functions. In this paper, we introduce a finite-time algorithm for computing an optimal potential heuristic for learned NN models. We then strengthen the linear relaxation of the underlying MILP model by introducing constraints to bound the reward function based on the precomputed reward potentials. Experimentally, we show that our algorithm efficiently computes reward potentials for learned NN models, and the overhead of computing reward potentials is justified by the overall strengthening of the underlying MILP model for the task of planning over long-term horizons.
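To make the remark about the poor linear relaxation concrete, here is a minimal sketch for a single neuron. It assumes a ReLU activation and a standard big-M encoding with one binary variable per unit; the abstract specifies neither, and the function names are hypothetical. The sketch computes the largest output that the relaxed encoding allows and compares it with the true activation, showing that tighter bounds on the pre-activation shrink the gap. It only illustrates why bounding constraints can help; it is not the reward potential algorithm itself.

```python
def relu(x: float) -> float:
    """Exact ReLU activation."""
    return max(0.0, x)


def relaxed_relu_upper(x: float, lower: float, upper: float) -> float:
    """Largest y allowed by the LP relaxation of an assumed big-M encoding.

    Encoding for y = max(0, x) with lower <= x <= upper:
        y >= 0,  y >= x,
        y <= upper * z,
        y <= x - lower * (1 - z),
        z in {0, 1}   (relaxed here to 0 <= z <= 1)
    """
    if lower >= 0.0:
        return x        # unit is always active, relaxation is exact
    if upper <= 0.0:
        return 0.0      # unit is always inactive, relaxation is exact
    # The two upper-bounding constraints meet at z = (x - lower) / (upper - lower).
    return upper * (x - lower) / (upper - lower)


def relaxation_gap(x: float, lower: float, upper: float) -> float:
    """Slack between the relaxed upper bound and the true ReLU output."""
    return relaxed_relu_upper(x, lower, upper) - relu(x)


if __name__ == "__main__":
    # Same input, loose versus tight bounds on the pre-activation.
    print(relaxation_gap(0.0, -10.0, 10.0))  # loose bounds
    print(relaxation_gap(0.0, -1.0, 1.0))    # tight bounds
```

With loose bounds of [-10, 10] the relaxation admits an output of 5 at input 0, while bounds of [-1, 1] admit only 0.5. The same effect, accumulated over many units and time steps, is one reason precomputed bounds can strengthen a relaxation.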

