An LP-based hyperparameter optimization model for language modeling

To find hyperparameters for a machine learning model, algorithms such as grid search or random search are run over the space of possible values of the model's hyperparameters. These search algorithms select the solution that minimizes a specific cost function. In language modeling, perplexity is one of the most popular cost functions. In this study, we propose a fractional nonlinear programming model that finds the optimal perplexity value. The special structure of the model allows us to approximate it by a linear programming model that can be solved using the well-known simplex algorithm. To the best of our knowledge, this is the first attempt in the language modeling literature to use optimization techniques to find perplexity values. We apply our model to find the hyperparameters of a language model, compare it to the grid search algorithm, and illustrate that it results in lower perplexity values. We perform this experiment on a real-world dataset from SwiftKey to validate our proposed approach.
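For readers unfamiliar with the grid search baseline mentioned above, the following is a minimal sketch of using perplexity as the cost function. It is not the authors' model: the single interpolation weight `lam`, the two component models, and the per-token probabilities are all hypothetical values chosen for illustration.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability per token."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)

def grid_search_lambda(probs_a, probs_b, grid):
    """Return (best_lam, best_ppl) for the mixture lam*a + (1-lam)*b."""
    best = None
    for lam in grid:
        # Interpolate the two models' probabilities token by token.
        mixed = [lam * a + (1.0 - lam) * b for a, b in zip(probs_a, probs_b)]
        ppl = perplexity(mixed)
        if best is None or ppl < best[1]:
            best = (lam, ppl)
    return best

# Hypothetical per-token probabilities assigned by two component models
# to the same held-out text (assumed values, not from the paper).
probs_a = [0.20, 0.05, 0.40, 0.10]
probs_b = [0.10, 0.30, 0.05, 0.25]

grid = [i / 10 for i in range(11)]  # candidate weights 0.0, 0.1, ..., 1.0
best_lam, best_ppl = grid_search_lambda(probs_a, probs_b, grid)
print(best_lam, best_ppl)
```

Grid search only evaluates the candidate values it is given, so its result depends on how finely the grid is spaced. That limitation is the kind of issue an optimization-based formulation, such as the one proposed here, aims to avoid.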


