Higher-order Derivatives of Weighted Finite-state Machines

06/01/2021
by Ran Zmigrod et al.

Weighted finite-state machines are a fundamental building block of NLP systems. They have withstood the test of time, from their early use in noisy-channel models in the 1990s to modern-day neurally parameterized conditional random fields. This work examines the computation of higher-order derivatives with respect to the normalization constant of weighted finite-state machines. We provide a general algorithm for evaluating derivatives of all orders, which has not previously been described in the literature. In the case of second-order derivatives, our scheme runs in the optimal 𝒪(A^2 N^4) time, where A is the alphabet size and N is the number of states, and is significantly faster than prior algorithms. Our approach additionally yields a faster algorithm for computing second-order expectations, such as covariance matrices and gradients of first-order expectations.
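To make the object of study concrete, here is a minimal sketch of the normalization constant of a toy weighted finite-state machine and its first- and second-order derivatives, computed here via automatic differentiation in JAX rather than the paper's specialized algorithm. The state count, alphabet size, initial/final weight vectors, and transition weights below are all hypothetical, chosen so that the path-sum converges (spectral radius of the one-step weight matrix below 1).

```python
import jax
import jax.numpy as jnp

N = 3  # number of states (hypothetical)
A = 2  # alphabet size (hypothetical)

def log_z(trans):
    # trans: (A, N, N) per-symbol transition weights.
    # Z sums the weight of every accepting path; with the Kleene-star
    # identity, Z = lam @ (I - W)^{-1} @ rho, where W is the total
    # one-step weight matrix (valid when W's spectral radius is < 1).
    W = trans.sum(axis=0)
    star = jnp.linalg.inv(jnp.eye(N) - W)
    lam = jnp.ones(N) / N    # initial weights (hypothetical)
    rho = jnp.ones(N) * 0.1  # final weights (hypothetical)
    return jnp.log(lam @ star @ rho)

key = jax.random.PRNGKey(0)
trans = jax.random.uniform(key, (A, N, N)) * 0.1  # small weights

grad = jax.grad(log_z)(trans)     # first-order: expected transition counts
hess = jax.hessian(log_z)(trans)  # second-order: covariance structure
print(grad.shape, hess.shape)     # (2, 3, 3) (2, 3, 3, 2, 3, 3)
```

Generic reverse-mode autodiff as used here does not achieve the paper's 𝒪(A^2 N^4) bound for the full second-order derivative; it only illustrates what quantity the specialized algorithm computes.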
