Obtaining versions of deep neural networks that are both highly accurate...
Knowledge distillation is a popular approach for enhancing the performan...
Pruning, that is, setting a significant subset of the parameters of a n...
We examine the question of whether SGD-based optimization of deep neural...
Transfer learning is a classic paradigm by which models pretrained on la...
The availability of large amounts of user-provided data has been key to ...
The increasing computational requirements of deep neural networks (DNNs)...
The growing energy and performance costs of deep learning have driven th...
In this paper we propose two novel bounds for the log-likelihood based o...