Robust and Resource Efficient Identification of Two Hidden Layer Neural Networks

06/30/2019
by   Massimo Fornasier, et al.

We address the structure identification and uniform approximation of fully nonlinear two-layer neural networks of the type f(x) = 1^T h(B^T g(A^T x)) on R^d from a small number of query samples. We approach the problem by actively sampling finite-difference approximations to Hessians of the network. Gathering several approximate Hessians allows us to reliably approximate the matrix subspace W spanned by the symmetric tensors a_1 ⊗ a_1, ..., a_{m_0} ⊗ a_{m_0} formed by the weights of the first layer, together with the entangled symmetric tensors v_1 ⊗ v_1, ..., v_{m_1} ⊗ v_{m_1} formed by suitable combinations of the weights of the first and second layer as v_ℓ = A G_0 b_ℓ / ‖A G_0 b_ℓ‖_2, ℓ ∈ [m_1], for a diagonal matrix G_0 depending on the activation functions of the first layer. The identification of the rank-1 symmetric tensors within W is then performed by solving a robust nonlinear program. We provide guarantees of stable recovery under a posteriori verifiable conditions. We further address the correct attribution of the approximate weights to the first or second layer. Using a suitably adapted gradient descent iteration, it is then possible to estimate, up to intrinsic symmetries, the shifts of the activation functions of the first layer and to compute the matrix G_0 exactly. Our method of identifying the network weights is fully constructive, with quantifiable sample complexity, and therefore helps to reduce the black-box nature of the network training phase. We corroborate our theoretical results by extensive numerical experiments.
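The first step of the method, approximating Hessians of the network by finite differences of queried function values, can be illustrated with a minimal sketch. This is not the paper's implementation: the tanh activations, the dimensions d, m_0, m_1, and the step size eps are illustrative assumptions. Since f depends on x only through A^T x, each Hessian is of the form A M A^T, i.e. it lies in a subspace determined by the first-layer weights, which is what makes collecting several of them informative.

```python
import numpy as np

# Hypothetical instance of f(x) = 1^T h(B^T g(A^T x)); the names A, B, g, h
# mirror the abstract, but the activations and dimensions are illustrative.
rng = np.random.default_rng(0)
d, m0, m1 = 5, 3, 2
A = rng.standard_normal((d, m0))   # first-layer weights
B = rng.standard_normal((m0, m1))  # second-layer weights
g, h = np.tanh, np.tanh            # assumed smooth activations

def f(x):
    return np.sum(h(B.T @ g(A.T @ x)))  # 1^T h(B^T g(A^T x))

def fd_hessian(f, x, eps=1e-4):
    """Approximate the Hessian of f at x by second-order central differences,
    using only function queries (4 per entry)."""
    d = x.size
    H = np.zeros((d, d))
    I = np.eye(d)
    for i in range(d):
        for j in range(d):
            ei, ej = eps * I[i], eps * I[j]
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps ** 2)
    return 0.5 * (H + H.T)  # symmetrize to suppress roundoff asymmetry

x0 = rng.standard_normal(d)
H = fd_hessian(f, x0)
```

Because the exact Hessian has the form A M A^T, projecting the estimate onto the orthogonal complement of the column space of A should yield a nearly zero matrix, which gives a quick sanity check of the approximation quality.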


research · 01/18/2021
Stable Recovery of Entangled Weights: Towards Robust Identification of Deep Neural Networks from Minimal Samples
In this paper we approach the problem of unique and stable identifiabili...

research · 09/21/2021
Neural networks with trainable matrix activation functions
The training process of neural networks usually optimize weights and bia...

research · 03/06/2023
Globally Optimal Training of Neural Networks with Threshold Activation Functions
Threshold activation functions are highly preferable in neural networks ...

research · 11/08/2022
Finite Sample Identification of Wide Shallow Neural Networks with Biases
Artificial neural networks are functions depending on a finite number of...

research · 11/02/2019
Jacobi-type algorithm for low rank orthogonal approximation of symmetric tensors and its convergence analysis
In this paper, we propose a Jacobi-type algorithm to solve the low rank ...

research · 04/04/2018
Identification of Shallow Neural Networks by Fewest Samples
We address the uniform approximation of sums of ridge functions ∑_i=1^m ...
