
Label Noise SGD Provably Prefers Flat Global Minimizers
In overparametrized models, the noise in stochastic gradient descent (SG...

Iterative Refinement in the Continuous Space for Non-Autoregressive Neural Machine Translation
We propose an efficient inference procedure for non-autoregressive machi...

On the Discrepancy between Density Estimation and Sequence Generation
Many sequence-to-sequence generation tasks, including machine translatio...

Deploying the NASA Valkyrie Humanoid for IED Response: An Initial Approach and Evaluation Summary
As part of a feasibility study, this paper shows the NASA Valkyrie human...

Countering Language Drift via Visual Grounding
Emergent multi-agent communication protocols are very different from nat...

Latent-Variable Non-Autoregressive Neural Machine Translation with Deterministic Inference using a Delta Posterior
Although neural machine translation models reached high translation qual...

Kernel and Deep Regimes in Overparametrized Models
A recent line of work studies overparametrized neural networks in the "k...

Multi-Turn Beam Search for Neural Dialogue Modeling
In neural dialogue modeling, a neural network is trained to predict the ...

Multi-Scale Distributed Representation for Deep Learning and its Application to b-Jet Tagging
Recently machine learning algorithms based on deep layered artificial ne...

Implicit Bias of Gradient Descent on Linear Convolutional Networks
We show that gradient descent on full-width linear convolutional network...

Convergence of Gradient Descent on Separable Data
The implicit bias of gradient descent is not fully understood even in si...

Characterizing Implicit Bias in Terms of Optimization Geometry
We study the bias of generic optimization methods, including Mirror Desc...

Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement
We propose a conditional non-autoregressive neural sequence model based ...

Using High-Speed WANs and Network Data Caches to Enable Remote and Distributed Visualization
Visapult is a prototype application and framework for remote visualizati...

Emergent Translation in Multi-Agent Communication
While most machine translation systems to date are trained on large para...

Fully Character-Level Neural Machine Translation without Explicit Segmentation
Most existing machine translation systems operate at the level of words,...

Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares
The matrix-completion problem has attracted a lot of attention, largely ...
Jason Lee