Learning Discretized Neural Networks under Ricci Flow

02/07/2023
by Jun Chen, et al.

In this paper, we consider Discretized Neural Networks (DNNs) consisting of low-precision weights and activations, which suffer from either infinite or zero gradients caused by the non-differentiable discrete function used in training. In this setting, most training-based methods for DNNs use the standard Straight-Through Estimator (STE) to approximate the gradient w.r.t. discrete values. However, the STE causes a gradient mismatch problem: the approximated gradient carries perturbations relative to the true gradient. We show, through the lens of duality theory, that this mismatch can be viewed as a metric perturbation on a Riemannian manifold. To address this problem, building on information geometry, we construct a Linearly Nearly Euclidean (LNE) manifold for DNNs as a background space on which to handle such perturbations. By introducing the Ricci flow, a partial differential equation on metrics, we prove the dynamical stability and convergence of the LNE metric under L^2-norm perturbations. Moreover, unlike previous perturbation theory, which yields convergence rates that are fractional powers, we show that the metric perturbation under the Ricci flow decays exponentially on the LNE manifold. Experimental results on various datasets demonstrate that our method achieves better and more stable performance for DNNs than other representative training-based methods.
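The paper does not include code on this page, but the standard STE that the abstract critiques can be illustrated concretely. Below is a minimal sketch assuming a PyTorch-style autograd setup; the class name BinarizeSTE and the sign-based discretizer are illustrative choices, not the paper's exact formulation.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Sign-based weight discretization with a straight-through backward pass."""

    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        # Non-differentiable discrete step: exact gradient is zero a.e.,
        # infinite at the jump, hence the need for an estimator.
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        # STE: pretend the forward was the (clipped) identity, so the
        # gradient passes through wherever |w| <= 1. This approximation
        # is the source of the "gradient mismatch" discussed above.
        return grad_output * (w.abs() <= 1).to(grad_output.dtype)

# Usage: quantize weights in the forward pass of a layer.
w = torch.randn(4, requires_grad=True)
w_q = BinarizeSTE.apply(w)
w_q.sum().backward()
print(w.grad)  # clipped-identity gradient, not the true (zero) gradient
```

The mismatch between the returned surrogate gradient and the true (almost everywhere zero) gradient is precisely the perturbation the paper models geometrically.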
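For reference, the Ricci flow invoked in the abstract is Hamilton's evolution equation on the metric; the decomposition of the initial metric below, with h_ij denoting the perturbation, is our illustrative notation rather than the paper's.

```latex
% Hamilton's Ricci flow: the metric evolves by its Ricci curvature.
\partial_t\, g_{ij}(t) = -2\, R_{ij}\big(g(t)\big),
\qquad
g_{ij}(0) = g_{ij}^{\mathrm{LNE}} + h_{ij},
```

where R_ij is the Ricci curvature of g(t) and, per the abstract's claim, the L^2-norm of the perturbation h decays exponentially as the flow relaxes the metric toward the LNE background.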
