Training deep neural networks in low rank, i.e. with factorised layers, ...
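A minimal sketch of the kind of factorised layer this abstract refers to: one weight matrix replaced by two thin factors of inner rank r. The class name, shapes, and initialisation below are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """A linear layer parameterised in low rank, W ~= V @ U."""

    def __init__(self, d_in: int, d_out: int, rank: int):
        super().__init__()
        # Two thin factors replace one d_out x d_in matrix: the parameter
        # count drops from d_in * d_out to rank * (d_in + d_out).
        self.U = nn.Linear(d_in, rank, bias=False)
        self.V = nn.Linear(rank, d_out)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.V(self.U(x))

layer = LowRankLinear(d_in=1024, d_out=1024, rank=64)
print(layer(torch.randn(8, 1024)).shape)  # torch.Size([8, 1024])
```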
Training on web-scale data can take months. But most computation and time...
We introduce Goldilocks Selection, a technique for faster model training...
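The two entries above concern online batch selection: score a large candidate batch and train only on the points that are learnable, worth learning, and not yet learnt. The sketch below is one plausible reading of that idea, scoring points by model loss minus the loss of a small holdout-trained model; `irreducible_model` and `keep` are illustrative names, not the papers' API.

```python
import torch
import torch.nn.functional as F

def select_batch(model, irreducible_model, xb, yb, keep: int):
    """Keep the `keep` candidate points with the highest reducible loss."""
    with torch.no_grad():
        loss = F.cross_entropy(model(xb), yb, reduction="none")
        # A small model trained on holdout data estimates the loss that
        # cannot be reduced; noisy or unlearnable points score low.
        irreducible = F.cross_entropy(irreducible_model(xb), yb, reduction="none")
        score = loss - irreducible  # high = learnable but not yet learnt
    idx = score.topk(keep).indices
    return xb[idx], yb[idx]
```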
We challenge a common assumption underlying most supervised deep learning...
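The challenged assumption is that a prediction depends only on the model's parameters and a single input's features. A minimal sketch of the alternative: let datapoints attend to each other by treating the batch as the sequence axis of a standard attention layer. The paper's actual architecture is more elaborate than this.

```python
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
datapoints = torch.randn(1, 128, 64)  # 128 embedded datapoints as one "sequence"
out, _ = attn(datapoints, datapoints, datapoints)
print(out.shape)  # torch.Size([1, 128, 64]); each output mixes information across datapoints
```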
Existing generalization measures that aim to capture a model's simplicity...
The number of parameters in state of the art neural networks has drastically...
We demonstrate 10-40% speedups and memory reduction with Wide ResNets, EfficientNets, and Transformer models, with minimal...
We detect out-of-training-distribution sentences in Neural Machine Translation...
Evaluation of Bayesian deep learning (BDL) methods is challenging. We of...
Neural networks are easier to optimise when they have many more weights ...
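One way to keep the optimisation benefits of overparameterisation while preparing a network for later pruning is to apply dropout preferentially to the lowest-magnitude weights. The sketch below is a rough illustration of that idea; the targeting fraction and drop rate are arbitrary choices, not values from the paper.

```python
import torch

def targeted_weight_dropout(w: torch.Tensor, targ_frac: float = 0.5,
                            drop_rate: float = 0.5) -> torch.Tensor:
    """Stochastically zero a random subset of the smallest-magnitude weights."""
    k = int(targ_frac * w.numel())
    if k == 0:
        return w
    thresh = w.abs().flatten().kthvalue(k).values
    candidates = w.abs() <= thresh                    # lowest-magnitude weights
    drop = candidates & (torch.rand_like(w) < drop_rate)
    return w * (~drop)

w = torch.randn(128, 128)
print(targeted_weight_dropout(w).eq(0).float().mean())  # ~ targ_frac * drop_rate
```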
Tensor2Tensor is a library for deep learning models that is well-suited ...
This work details CipherGAN, an architecture inspired by CycleGAN used for...
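The CycleGAN ingredient alluded to here is cycle consistency: two learned mappings between plaintext and ciphertext spaces are trained so that each approximately inverts the other. A toy sketch with placeholder linear maps over continuous embeddings; the adversarial losses and the discrete-data handling the paper relies on are omitted.

```python
import torch
import torch.nn as nn

F_map = nn.Linear(16, 16)  # plaintext embedding -> ciphertext embedding (placeholder)
G_map = nn.Linear(16, 16)  # ciphertext embedding -> plaintext embedding (placeholder)

x = torch.randn(8, 16)     # embedded plaintext batch
y = torch.randn(8, 16)     # embedded ciphertext batch
cycle_loss = ((G_map(F_map(x)) - x).abs().mean()
              + (F_map(G_map(y)) - y).abs().mean())
cycle_loss.backward()      # in practice combined with adversarial losses
```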
Deep residual networks (ResNets) have significantly pushed forward the state-of-the-art...
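One ResNet variant relevant here is the reversible residual block: its inputs can be reconstructed exactly from its outputs, so activations need not be stored for backpropagation. A minimal sketch with placeholder residual functions `f` and `g`:

```python
import torch
import torch.nn as nn

class ReversibleBlock(nn.Module):
    def __init__(self, f: nn.Module, g: nn.Module):
        super().__init__()
        self.f, self.g = f, g

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        # Exact reconstruction of the inputs from the outputs.
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2

blk = ReversibleBlock(nn.Linear(16, 16), nn.Linear(16, 16))
x1, x2 = torch.randn(4, 16), torch.randn(4, 16)
r1, r2 = blk.inverse(*blk(x1, x2))
print(torch.allclose(x1, r1, atol=1e-6), torch.allclose(x2, r2, atol=1e-6))  # True True
```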
Deep learning yields great results across many fields, from speech recognition...
The dominant sequence transduction models are based on complex recurrent...
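The architecture this abstract introduces replaces recurrence and convolution with attention; its core operation is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A single-head sketch; the full model adds multi-head projections, positional encodings, and feed-forward sublayers.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (batch, seq, seq)
    return torch.softmax(scores, dim=-1) @ v           # weighted sum of values

q = k = v = torch.randn(2, 10, 64)  # (batch, sequence, d_k)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([2, 10, 64])
```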
Depthwise separable convolutions reduce the number of parameters and computation...
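A sketch of where those savings come from: a depthwise separable convolution factorises a standard convolution into a per-channel spatial convolution followed by a 1x1 pointwise convolution.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, c_in: int, c_out: int, kernel_size: int = 3):
        super().__init__()
        # groups=c_in applies one spatial filter per input channel.
        self.depthwise = nn.Conv2d(c_in, c_in, kernel_size,
                                   padding=kernel_size // 2, groups=c_in)
        # 1x1 convolution mixes information across channels.
        self.pointwise = nn.Conv2d(c_in, c_out, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

conv = DepthwiseSeparableConv(64, 128)
print(conv(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 128, 32, 32])
# Parameters: 64*3*3 + 64 (depthwise) + 64*128 + 128 (pointwise) ~ 9.0k,
# versus 64*128*3*3 + 128 ~ 73.9k for a standard 3x3 convolution.
```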