Training algorithms, broadly construed, are an essential part of every d...
Very little is known about the training dynamics of adaptive gradient me...
Bayesian optimization (BO) has become a popular strategy for global
opti...
Despite considerable progress in maternal healthcare, maternal and perin...
Black box optimization requires specifying a search space to explore for...
The performance of deep neural networks can be highly sensitive to the c...
Evaluation for many natural language understanding (NLU) tasks is broken...
Recently the LARS and LAMB optimizers have been proposed for training ne...
Selecting an optimizer is a central step in the contemporary deep learni...
In the twilight of Moore's law, GPUs and other specialized hardware
acce...
Increasing the batch size is a popular way to speed up neural network
tr...
Recent hardware developments have made unprecedented amounts of data
par...
Neural language models are a critical component of state-of-the-art syst...
Advances in machine learning have led to broad deployment of systems wit...
Natural language text exhibits hierarchical structure in a variety of
re...
We present a simple and powerful algorithm for parallel black box
optimi...
Techniques such as ensembling and distillation promise model quality
imp...
Each year, the treatment decisions for more than 230,000 breast cancer
p...
Probabilistic matrix factorization (PMF) is a powerful method for modeli...
Although artificial neural networks have occasionally been used for
Quan...
Deep Convolutional Neural Networks (CNNs) are more powerful than Deep Ne...
The restricted Boltzmann machine (RBM) is a flexible tool for modeling
c...