Statistical Mechanics of Node-perturbation Learning with Noisy Baseline

06/20/2017
by   Kazuyuki Hara, et al.

Node-perturbation learning is a type of statistical gradient descent algorithm that can be applied to problems where the objective function is not explicitly formulated, including reinforcement learning. It estimates the gradient of an objective function from the change in the objective function in response to a perturbation of the output. The value of the objective function for an unperturbed output is called the baseline. Cho et al. proposed node-perturbation learning with a noisy baseline. In this paper, we build the statistical mechanics of Cho's model and derive coupled differential equations of the order parameters that describe the learning dynamics. We also show how to derive the generalization error by solving these differential equations. On the basis of the results, we show that Cho's results also apply in general cases and present some general performance characteristics of Cho's model.
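
To make the gradient estimate described above concrete, here is a minimal sketch in Python. It is not the paper's model. It assumes a single-output linear student learning from a linear teacher with a squared-error objective, and it assumes that a "noisy baseline" means the baseline is evaluated with its own independent noise added to the output. All names (`node_perturbation_step`, `run`, `eta`, `sigma`) and the `eta / n` step-size scaling are illustrative choices.

```python
import random


def node_perturbation_step(w, teacher, x, eta, sigma, rng):
    """One online update of a linear student (illustrative sketch)."""
    n = len(w)
    y = sum(wj * xj for wj, xj in zip(w, x))        # student output
    d = sum(bj * xj for bj, xj in zip(teacher, x))  # teacher (target) output

    noise_pert = rng.gauss(0.0, sigma)  # perturbation added to the output
    noise_base = rng.gauss(0.0, sigma)  # ASSUMPTION: independent noise on the baseline

    e_pert = 0.5 * (d - (y + noise_pert)) ** 2  # objective for the perturbed output
    e_base = 0.5 * (d - (y + noise_base)) ** 2  # noisy baseline objective

    # Estimate of dE/dy from the change in the objective caused by the
    # perturbation; averaged over both noises it equals -(d - y).
    grad_y = (e_pert - e_base) * noise_pert / sigma ** 2

    # Chain rule for a linear unit: dE/dw_j = dE/dy * x_j.
    return [wj - (eta / n) * grad_y * xj for wj, xj in zip(w, x)]


def run(n=50, steps=5000, eta=0.1, sigma=0.1, seed=0):
    """Train on i.i.d. unit-variance Gaussian inputs; return (initial, final) error."""
    rng = random.Random(seed)
    teacher = [rng.gauss(0.0, 1.0) for _ in range(n)]
    w = [0.0] * n

    def gen_error(weights):
        # For unit-variance i.i.d. inputs, the expected squared error of a
        # linear student is half the squared distance to the teacher weights.
        return 0.5 * sum((bj - wj) ** 2 for bj, wj in zip(teacher, weights))

    start = gen_error(w)
    for _ in range(steps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        w = node_perturbation_step(w, teacher, x, eta, sigma, rng)
    return start, gen_error(w)


if __name__ == "__main__":
    start, end = run()
    print(f"generalization error: {start:.4f} -> {end:.6f}")
```

In this sketch the update never uses the analytic gradient of the objective, only two noisy evaluations of it, which is why the approach suits settings where the objective is not explicitly formulated. The baseline noise adds variance to the estimate without changing its mean.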


