Scalable^3-BO: Big Data meets HPC - A scalable asynchronous parallel high-dimensional Bayesian optimization framework on supercomputers

08/12/2021
by   Anh Tran, et al.
Bayesian optimization (BO) is a flexible and powerful framework that is well suited to computationally expensive simulation-based applications and guarantees statistical convergence to the global optimum. Although it remains one of the most popular optimization methods, its capability is limited by the size of the data, the dimensionality of the problem, and the sequential nature of the optimization. These scalability issues are intertwined and must be tackled simultaneously. In this work, we propose the Scalable^3-BO framework, which employs a sparse Gaussian process (GP) as the underlying surrogate model to cope with Big Data and is equipped with a random embedding to efficiently optimize high-dimensional problems with low effective dimensionality. The Scalable^3-BO framework is further equipped with an asynchronous parallelization feature, which fully exploits the computational resources on HPC within a computational budget. As a result, the proposed Scalable^3-BO framework is scalable in three independent respects: data size, dimensionality, and computational resources on HPC. The goal of this work is to push the frontiers of BO beyond its well-known scalability issues and to minimize the wall-clock waiting time for optimizing high-dimensional, computationally expensive applications. We demonstrate the capability of Scalable^3-BO with 1 million data points, 10,000-dimensional problems, and 20 concurrent workers in an HPC environment.
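The random-embedding idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the embedding is constructed, so this sketch assumes the commonly used approach of a fixed Gaussian random matrix with clipping to a box-shaped search domain. The function names, the [-1, 1] bounds, the embedding dimension of 4, and the toy objective are all hypothetical choices made for illustration, not details taken from the paper.

```python
import numpy as np

def make_random_embedding(high_dim, low_dim, seed=0):
    """Draw a fixed Gaussian random matrix A of shape (high_dim, low_dim)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((high_dim, low_dim))

def lift(z, A, lower=-1.0, upper=1.0):
    """Map a low-dimensional point z to the high-dimensional box [lower, upper]^D."""
    return np.clip(A @ z, lower, upper)

def objective(x):
    """Hypothetical expensive function that depends on only two coordinates,
    i.e. it has low effective dimensionality."""
    return (x[0] - 0.3) ** 2 + (x[1] + 0.2) ** 2

# Assumed sizes: 10,000 ambient dimensions, 4 embedding dimensions.
A = make_random_embedding(high_dim=10_000, low_dim=4)

# An optimizer would propose z in the low-dimensional space;
# the objective is always evaluated on the lifted point x.
z = np.array([0.1, -0.2, 0.05, 0.0])
x = lift(z, A)
value = objective(x)
```

In such a setup the surrogate model and the acquisition function operate on z only, so the cost of the optimizer depends on the embedding dimension rather than on the 10,000 ambient dimensions.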
