Optimizing Neural Network Hyperparameters with Gaussian Processes for Dialog Act Classification

by Franck Dernoncourt, et al.

Systems based on artificial neural networks (ANNs) have achieved state-of-the-art results in many natural language processing tasks. Although ANNs do not require manually engineered features, they have many hyperparameters to be optimized. The choice of hyperparameters significantly impacts model performance. However, ANN hyperparameters are typically chosen by manual, grid, or random search, which either requires expert experience or is computationally expensive. Recent approaches based on Bayesian optimization using Gaussian processes (GPs) offer a more systematic way to automatically pinpoint optimal or near-optimal machine learning hyperparameters. Using a previously published ANN model yielding state-of-the-art results for dialog act classification, we demonstrate that optimizing hyperparameters using GPs further improves the results and reduces the computational time by a factor of 4 compared to a random search. Therefore, it is a useful technique for tuning ANN models to yield the best performance on natural language processing tasks.
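
To make the idea of GP-based Bayesian optimization concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single hyperparameter scaled to [0, 1], a squared-exponential kernel, and the expected-improvement acquisition function; the abstract specifies none of these. The function `toy_validation_score` is a hypothetical stand-in for training a model and measuring validation accuracy, and all helper names are illustrative.

```python
import math
import random

def rbf(a, b, length=0.3):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            # Clamp the pivot to guard against round-off on the diagonal.
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_lower_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_fit(xs, ys, noise=1e-4):
    """Factor the kernel matrix once; the prior mean is the average observed score."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    mean_y = sum(ys) / len(ys)
    alpha = solve_lower_t(L, solve_lower(L, [y - mean_y for y in ys]))
    return xs, L, alpha, mean_y

def gp_predict(model, x_new):
    """Posterior mean and standard deviation of the score at x_new."""
    xs, L, alpha, mean_y = model
    k = [rbf(x_new, a) for a in xs]
    mu = mean_y + sum(ki * ai for ki, ai in zip(k, alpha))
    v = solve_lower(L, k)
    var = max(rbf(x_new, x_new) - sum(vi * vi for vi in v), 1e-12)
    return mu, math.sqrt(var)

def expected_improvement(mu, sigma, best):
    """Expected improvement over the best score so far (maximization)."""
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf

def toy_validation_score(x):
    """Hypothetical objective standing in for an expensive train-and-evaluate run."""
    return math.exp(-((x - 0.7) ** 2) / 0.02) + 0.5 * math.exp(-((x - 0.2) ** 2) / 0.05)

def bayes_opt(objective, n_init=3, n_iter=10, seed=0):
    """Evaluate n_init random points, then n_iter points chosen by maximizing EI."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_init)]
    ys = [objective(x) for x in xs]
    grid = [i / 100 for i in range(101)]  # candidate hyperparameter values
    for _ in range(n_iter):
        model = gp_fit(xs, ys)
        best = max(ys)
        cand = max(grid, key=lambda x: expected_improvement(*gp_predict(model, x), best))
        xs.append(cand)
        ys.append(objective(cand))
    i = max(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i]

if __name__ == "__main__":
    best_x, best_y = bayes_opt(toy_validation_score)
    print(f"best hyperparameter: {best_x:.2f}, score: {best_y:.3f}")
```

Each iteration spends one objective evaluation where the surrogate predicts either a high score or high uncertainty, which is the mechanism by which such a search can need fewer evaluations than random sampling. In practice an established library would replace the hand-written linear algebra used here.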



