
Learning and Generalization in Overparameterized Normalizing Flows
In supervised learning, it is known that overparameterized neural networ...

Learning and Generalization in RNNs
Simple recurrent neural networks (RNNs) and their more advanced cousins ...

Analyzing the Nuances of Transformers' Polynomial Simplification Abilities
Symbolic Mathematical tasks such as integration often require multiple w...

Are NLP Models really able to Solve Simple Math Word Problems?
The problem of designing NLP solvers for math word problems (MWP) has se...

On the Practical Ability of Recurrent Neural Networks to Recognize Hierarchical Languages
While recurrent models have been effective in NLP tasks, their performan...

On the Ability of Self-Attention Networks to Recognize Counter Languages
Transformers have supplanted recurrent models in a large number of NLP t...

Robust Identifiability in Linear Structural Equation Models of Causal Inference
In this work, we consider the problem of robust parameter estimation fro...

On the Computational Power of Transformers and Its Implications in Sequence Modeling
Transformers are being used extensively across several sequence modeling...

Non-Gaussianity of Stochastic Gradient Noise
What enables Stochastic Gradient Descent (SGD) to achieve better general...

Effect of Activation Functions on the Training of Overparametrized Neural Nets
It is well-known that overparametrized neural networks trained using gra...

Sampling and Optimization on Convex Sets in Riemannian Manifolds of Non-Negative Curvature
The Euclidean space notion of convex sets (and functions) generalizes to...

Stability of Linear Structural Equation Models of Causal Inference
We consider the numerical stability of the parameter recovery problem in...

Non-Gaussian Component Analysis using Entropy Methods
Non-Gaussian component analysis (NGCA) is a problem in multidimensional ...

Heavy-Tailed Analogues of the Covariance Matrix for ICA
Independent Component Analysis (ICA) is the problem of learning a square...

Heavy-tailed Independent Component Analysis
Independent component analysis (ICA) is the problem of efficiently recov...

The More, the Merrier: the Blessing of Dimensionality for Learning Large Gaussian Mixtures
In this paper we show that very large mixtures of Gaussians are efficien...

Fourier PCA and Robust Tensor Decomposition
Fourier PCA is Principal Component Analysis of a matrix obtained from hi...

Efficient learning of simplices
We show an efficient algorithm for the following problem: Given uniforml...

Further Optimal Regret Bounds for Thompson Sampling
Thompson Sampling is one of the oldest heuristics for multi-armed bandit...

Thompson Sampling for Contextual Bandits with Linear Payoffs
Thompson Sampling is one of the oldest heuristics for multi-armed bandit...

An Efficient Approximation Algorithm for Point Pattern Matching Under Noise
Point pattern matching problems are of fundamental importance in various...
Navin Goyal