
Halting Time is Predictable for Large Models: A Universality Property and Average-case Analysis
Average-case analysis computes the complexity of an algorithm averaged o...
Fast Training of Sparse Graph Neural Networks on Dense Hardware
Graph neural networks have become increasingly popular in recent years d...
Information matrices and generalization
This work revisits the use of information criteria to characterize the g...
Automatic differentiation in ML: Where we are and where we should be going
We review the current state of automatic differentiation (AD) for array ...
Tangent: Automatic differentiation using source-code transformation for dynamically typed array programming
The need to efficiently calculate first- and higher-order derivatives of...
Tangent: Automatic Differentiation Using Source Code Transformation in Python
Automatic differentiation (AD) is an essential primitive for machine lea...
Multiscale sequence modeling with a learned dictionary
We propose a generalization of neural network sequence models. Instead o...
Theano: A Python framework for fast computation of mathematical expressions
Theano is a Python library that allows to define, optimize, and evaluate...
Blocks and Fuel: Frameworks for deep learning
We introduce two Python frameworks to train neural networks on large dat...
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
One long-term goal of machine learning research is to produce methods th...
Overcoming the Curse of Sentence Length for Neural Machine Translation using Automatic Segmentation
The authors of (Cho et al., 2014a) have shown that the recently introduc...
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
In this paper, we propose a novel neural network model called RNN Encode...
