
Robustness to Pruning Predicts Generalization in Deep Neural Networks
Existing generalization measures that aim to capture a model's simplicit...

Interlocking Backpropagation: Improving depthwise model-parallelism
The number of parameters in state of the art neural networks has drastic...

SliceOut: Training Transformers and CNNs faster while using less memory
We demonstrate 10-40% speedups and memory reduction with Wide ResNets, EfficientNets, and Transformer models, with minimal...

Wat zei je? Detecting Out-of-Distribution Translations with Variational Transformers
We detect out-of-training-distribution sentences in Neural Machine Trans...

A Systematic Comparison of Bayesian Deep Learning Robustness in Diabetic Retinopathy Tasks
Evaluation of Bayesian deep learning (BDL) methods is challenging. We of...

Learning Sparse Networks Using Targeted Dropout
Neural networks are easier to optimise when they have many more weights ...

Tensor2Tensor for Neural Machine Translation
Tensor2Tensor is a library for deep learning models that is well-suited ...

Unsupervised Cipher Cracking Using Discrete GANs
This work details CipherGAN, an architecture inspired by CycleGAN used f...

The Reversible Residual Network: Backpropagation Without Storing Activations
Deep residual networks (ResNets) have significantly pushed forward the s...

One Model To Learn Them All
Deep learning yields great results across many fields, from speech recog...

Attention Is All You Need
The dominant sequence transduction models are based on complex recurrent...

Depthwise Separable Convolutions for Neural Machine Translation
Depthwise separable convolutions reduce the number of parameters and com...
Aidan N. Gomez