Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach

07/02/2020
by Luofeng Liao et al.

Structural equation models (SEMs) are widely used in the sciences, ranging from economics to psychology, to uncover causal relationships underlying a complex system and to estimate structural parameters of interest. We study estimation in a class of generalized SEMs where the object of interest is defined as the solution to a linear operator equation. We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using stochastic gradient descent. We consider both 2-layer and multi-layer NNs with ReLU activation functions and prove global convergence in an overparametrized regime, where the number of neurons is diverging. The results are established using techniques from online learning and local linearization of NNs, and improve on the current state of the art in several respects. For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
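To make the min-max formulation concrete, here is a minimal sketch of the adversarial estimation idea on a toy linear instrumental-variable model, using linear function classes in place of the paper's neural networks. The model, variable names, and step sizes below are illustrative assumptions, not taken from the paper: the moment condition E[Z(Y - θX)] = 0 is rewritten as the game min_θ max_w { w·E[Z(Y - θX)] - w²/2 }, solved by stochastic gradient descent-ascent.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
# Toy linear IV model: Z instruments X; true structural parameter theta* = 2.0.
# A shared confounder makes X endogenous, so plain regression of Y on X is biased.
Z = rng.normal(size=n)
confounder = rng.normal(size=n)
X = Z + confounder + 0.1 * rng.normal(size=n)
Y = 2.0 * X + confounder

theta, w = 0.0, 0.0  # estimator and adversary, both scalar "networks" here
lr = 0.05
for t in range(2000):
    idx = rng.integers(0, n, size=256)  # mini-batch for the stochastic gradients
    x, y, z = X[idx], Y[idx], Z[idx]
    resid = y - theta * x
    # Game objective on this batch: w * mean(z * resid) - w**2 / 2.
    grad_theta = -w * np.mean(z * x)    # descend in theta
    grad_w = np.mean(z * resid) - w     # ascend in w (strongly concave in w)
    theta -= lr * grad_theta
    w += lr * grad_w

print(theta)  # converges near the true value 2.0
```

At the saddle point the adversary's best response forces the moment condition to hold, recovering θ ≈ 2 despite the confounding; the -w²/2 regularizer makes the game strongly concave in w, which is what lets simple gradient descent-ascent converge. The paper's contribution is proving an analogous global-convergence result when both players are overparametrized ReLU networks rather than scalars.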


