Dealing with Categorical and Integer-valued Variables in Bayesian Optimization with Gaussian Processes

Bayesian optimization (BO) methods are useful for optimizing functions that are expensive to evaluate, that lack an analytical expression, and whose evaluations can be contaminated by noise. These methods rely on a probabilistic model of the objective function, typically a Gaussian process (GP), upon which an acquisition function is built. The acquisition function guides the optimization process and measures the expected utility of evaluating the objective at a new point. GPs assume continuous input variables. When this is not the case, for example when some of the input variables take categorical or integer values, extra approximations have to be introduced. Consider a suggested input location that takes values on the real line. Before evaluating the objective, a common approach is to use a one-hot encoding approximation for categorical variables, or to round to the closest integer in the case of integer-valued variables. We show that this can lead to problems in the optimization process, and we describe a more principled approach to account for input variables that are categorical or integer-valued. We illustrate the utility of our approach in both synthetic and real experiments, where it significantly improves the results of standard BO methods using Gaussian processes on problems with categorical or integer-valued variables.
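The common approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper name `snap_to_valid`, its arguments, and the four-dimensional layout (one integer variable followed by one three-category variable in one-hot form) are all hypothetical. It shows only the baseline step of mapping a real-valued suggestion to a valid input before the objective is evaluated.

```python
import numpy as np


def snap_to_valid(x, integer_dims, onehot_groups):
    """Map a real-valued suggestion to a valid input (hypothetical helper).

    integer_dims:  indices of integer-valued variables.
    onehot_groups: one list of indices per categorical variable.
    """
    x = np.asarray(x, dtype=float).copy()
    # Integer-valued variables: round to the closest integer.
    # Note: np.round sends exact .5 ties to the nearest even value.
    x[integer_dims] = np.round(x[integer_dims])
    # Categorical variables: keep the largest entry of each one-hot
    # group as 1 and set the remaining entries to 0.
    for group in onehot_groups:
        winner = group[int(np.argmax(x[group]))]
        x[group] = 0.0
        x[winner] = 1.0
    return x


# Assumed layout: dimension 0 is an integer, dimensions 1-3 encode a category.
a = snap_to_valid([2.4, 0.2, 0.7, 0.1], integer_dims=[0], onehot_groups=[[1, 2, 3]])
b = snap_to_valid([1.6, 0.3, 0.9, 0.0], integer_dims=[0], onehot_groups=[[1, 2, 3]])
print(a)                     # [2. 0. 1. 0.]
print(np.array_equal(a, b))  # True
```

In this sketch, two different real-valued suggestions are mapped to the same evaluated input. A model that treats them as distinct locations therefore does not match where the objective is actually evaluated, which is one way to see why this step can cause trouble.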

research
06/12/2017

Dealing with Integer-valued Variables in Bayesian Optimization with Gaussian Processes

Bayesian optimization (BO) methods are useful for optimizing functions t...
research
02/01/2019

Combinatorial Bayesian Optimization using Graph Representations

This paper focuses on Bayesian Optimization - typically considered with ...
research
02/07/2020

Noisy-Input Entropy Search for Efficient Robust Bayesian Optimization

We consider the problem of robust optimization within the well-establish...
research
08/26/2019

Sufficient Representations for Categorical Variables

Many learning algorithms require categorical data to be transformed into...
research
12/31/2021

Bayesian Optimization of Function Networks

We consider Bayesian optimization of the output of a network of function...
research
12/17/2015

Probabilistic Programming with Gaussian Process Memoization

Gaussian Processes (GPs) are widely used tools in statistics, machine le...
research
03/14/2018

SUSTain: Scalable Unsupervised Scoring for Tensors and its Application to Phenotyping

This paper presents a new method, which we call SUSTain, that extends re...
