Just because some purely recurrent models suffer from being hard to opti...
Tail averaging improves on Polyak averaging's non-asymptotic behaviour b...
A common failure mode of density models trained as variational autoencod...
Neural machine translation (NMT) has arguably achieved human-level parit...
A series of recent papers has used a parsing algorithm due to Shen et al...
Many advances in Natural Language Processing have been based upon more e...
Recurrent neural network grammars (RNNG) are generative models of langua...
We present a new theoretical perspective of data noising in recurrent ne...
Natural language processing has made significant inroads into learning t...
We show that dropout training is best understood as performing MAP estim...
Reading comprehension (RC)---in contrast to information retrieval---requ...
Ongoing innovations in recurrent neural network architectures have provi...
We present a novel semi-supervised approach for sequence transduction an...