We present the NeurIPS 2021 consistency experiment, a larger-scale varia...
The Sharpness Aware Minimization (SAM) optimization algorithm has been s...
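The fragment above only names SAM, so here is a minimal, hedged NumPy sketch of the two-step SAM update (ascend along the normalized gradient by a radius rho, then descend using the gradient taken at the perturbed weights). The loss, gradient function, and parameter names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05, eps=1e-12):
    """One Sharpness-Aware Minimization step (illustrative sketch).

    grad_fn(w) is assumed to return the loss gradient at parameters w.
    """
    g = grad_fn(w)
    # Move to the (approximate) worst-case point within an L2 ball of radius rho.
    e = rho * g / (np.linalg.norm(g) + eps)
    # Update with the gradient evaluated at the perturbed parameters.
    g_sharp = grad_fn(w + e)
    return w - lr * g_sharp

# Toy usage: minimize a simple quadratic loss (hypothetical example).
grad_fn = lambda w: 2.0 * w
w = np.array([3.0, -2.0])
for _ in range(50):
    w = sam_step(w, grad_fn)
```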
How do author perceptions match up to the outcomes of the peer-review pr...
Previous work on neural noisy channel modeling relied on latent variable...
Self-attention is a useful mechanism to build generative models for lang...
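Since the fragment above refers to self-attention without detail, the following is a minimal NumPy sketch of standard scaled dot-product self-attention, softmax(QK^T / sqrt(d)) V; the projection matrices and shapes are assumptions chosen for illustration, not the specific model in the work above.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over a sequence x of shape (n, d_model)."""
    q, k, v = x @ wq, x @ wk, x @ wv                 # (n, d_k) each
    scores = q @ k.T / np.sqrt(k.shape[-1])           # (n, n) pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ v                                # (n, d_k) context vectors

# Toy usage with random projections (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                           # 4 tokens, model dim 8
wq, wk, wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(x, wq, wk, wv)
```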
Normalization layers are a staple in state-of-the-art deep neural networ...
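As a hedged aside on what such a normalization layer computes, the snippet below sketches layer normalization (normalize each example's activations to zero mean and unit variance, then apply a learned scale and shift). It is a generic illustration, not the particular architecture studied in the work above.

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Layer normalization over the feature dimension of x with shape (batch, features)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)   # zero mean, unit variance per example
    return gamma * x_hat + beta               # learned affine transform

x = np.random.randn(2, 4)
out = layer_norm(x, gamma=np.ones(4), beta=np.zeros(4))
```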
Large deep neural networks are powerful, but exhibit undesirable behavio...
Much of human dialogue occurs in semi-cooperative settings, where agents...
The prevalent approach to sequence to sequence learning maps an input se...
The predominant approach to language modeling to date is based on recur...
The prevalent approach to neural machine translation relies on bi-direct...
Theano is a Python library that allows one to define, optimize, and evaluate...
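To make that description concrete, here is a small, standard Theano usage example: define a symbolic expression, compile it with theano.function, and take a gradient with T.grad. This is a generic illustration of the library's API rather than an excerpt from the paper.

```python
import theano
import theano.tensor as T

# Define a symbolic expression over a vector.
x = T.dvector('x')
y = (x ** 2).sum()

# Compile the expression and its gradient into callable functions.
f = theano.function([x], y)
g = theano.function([x], T.grad(y, x))

print(f([1.0, 2.0, 3.0]))   # 14.0
print(g([1.0, 2.0, 3.0]))   # [2. 4. 6.]
```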
Parameter-specific adaptive learning rate methods are computationally ef...
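As context for "parameter-specific adaptive learning rate methods", the sketch below shows an Adam-style update in which each parameter's step size is scaled by running estimates of its gradient moments. The hyperparameter values are conventional defaults assumed for illustration, and this is not presented as the method proposed in the paper.

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: per-parameter step sizes from moving averages of g and g**2."""
    m = beta1 * m + (1 - beta1) * g            # first-moment estimate
    v = beta2 * v + (1 - beta2) * g ** 2       # second-moment estimate
    m_hat = m / (1 - beta1 ** t)               # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```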
A central challenge to many fields of science and engineering involves m...
This article examines the failure of some large neural networks to leverage...