Chain Rules for Hessian and Higher Derivatives Made Easy by Tensor Calculus

11/29/2019
by Maciej Skorski, et al.

Computing multivariate derivatives of matrix-like expressions in a compact, coordinate-free fashion is very important for both theory and applied computations (e.g. optimization and machine learning). The critical components of such computations are the chain and product rules for derivatives. Although these rules are taught early in simple scenarios, practical applications involve high-dimensional arrays, and in that context it is very hard to find an easily accessible and compact explanation. This paper discusses how to carry out such derivations relatively simply, based on the concept of tensors (simplified, as adapted in applied computer science). Numerical examples in modern Python libraries are provided. This discussion simplifies and illustrates an earlier exposition by Manton (2012).
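To make the tensor chain rule for Hessians concrete, here is a minimal NumPy sketch (the particular functions `f` and `g` below are illustrative choices, not taken from the paper). For a composition h = f(g(x)), the second-order chain rule reads H_ij = Σ_k ∂f/∂y_k · ∂²g_k/∂x_i∂x_j + Σ_{k,l} ∂²f/∂y_k∂y_l · ∂g_k/∂x_i · ∂g_l/∂x_j, which `np.einsum` expresses almost verbatim; a finite-difference Hessian serves as a sanity check.

```python
import numpy as np

def g(x):                      # g: R^2 -> R^2
    return np.array([x[0]**2, x[0]*x[1]])

def Jg(x):                     # Jacobian of g, shape (2, 2)
    return np.array([[2*x[0], 0.0],
                     [x[1],   x[0]]])

def Hg(x):                     # Hessian tensor of g, shape (2, 2, 2)
    return np.array([[[2.0, 0.0], [0.0, 0.0]],
                     [[0.0, 1.0], [1.0, 0.0]]])

def f(y):                      # f: R^2 -> R
    return y[0]*y[1] + y[1]**2

def grad_f(y):                 # gradient of f, shape (2,)
    return np.array([y[1], y[0] + 2*y[1]])

def Hf(y):                     # Hessian of f, shape (2, 2)
    return np.array([[0.0, 1.0],
                     [1.0, 2.0]])

def hessian_chain(x):
    """Tensor chain rule for the Hessian of h = f o g:
    H_ij = sum_k df/dy_k * d2g_k/dx_i dx_j
         + sum_{k,l} d2f/dy_k dy_l * dg_k/dx_i * dg_l/dx_j
    """
    y = g(x)
    first  = np.einsum('k,kij->ij', grad_f(y), Hg(x))
    second = np.einsum('kl,ki,lj->ij', Hf(y), Jg(x), Jg(x))
    return first + second

def hessian_fd(x, eps=1e-5):
    """Central finite differences of h = f(g(x)), as a check."""
    h = lambda z: f(g(z))
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.eye(n)[i]*eps, np.eye(n)[j]*eps
            H[i, j] = (h(x + ei + ej) - h(x + ei - ej)
                       - h(x - ei + ej) + h(x - ei - ej)) / (4*eps**2)
    return H

x = np.array([1.5, -0.7])
print(np.allclose(hessian_chain(x), hessian_fd(x), atol=1e-4))  # prints True
```

The `einsum` subscripts mirror the tensor indices of the chain rule exactly: `'k,kij->ij'` contracts the outer gradient against the inner Hessian tensor, and `'kl,ki,lj->ij'` sandwiches the outer Hessian between two copies of the inner Jacobian.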


