Adaptive control is an online learning problem concerned with controlling an unknown dynamical system. This task is accomplished by constructing an approximation to the true dynamics through the adjustment of a set of parameters. The fundamental goal of adaptive control is stable concurrent learning and control of dynamical systems.
An adaptive control algorithm typically consists of a parameter estimator coupled in feedback to the controlled system. While often strongly inspired by gradient-based optimization algorithms, an added complexity is that the estimator must not only be convergent, but must also be stable when connected with the system in feedback.
Despite the difficulty of the problem, significant success has been achieved even for nonlinear systems in the linearly parameterized setting, where the dynamics approximation is of the form $\mathbf{Y}(\mathbf{x}, t)\hat{\boldsymbol{\alpha}}$ for some known regressor $\mathbf{Y}(\mathbf{x}, t)$ and parameter estimates $\hat{\boldsymbol{\alpha}}$. Prominent examples include the adaptive robot trajectory controller of J.-J. E. Slotine & Li (1987) and the neural network-based controller of Sanner & Slotine (1992), which employs a mathematical expansion in physical basis functions to uniformly approximate the unknown dynamics.
Unlike its linear counterpart, solutions to the adaptive control problem in the general nonlinearly parameterized setting have remained elusive. Intuitively, this is unsurprising because gradient-based algorithms generally have guarantees only for convex loss functions; in the linearly parameterized setting, this requirement will be satisfied, but when the parameters appear nonlinearly, the problem is immediately in the difficult realm of non-convex optimization. Nevertheless, progress has been made in specific cases, such as with a convex or concave parameterization (Annaswamy et al., 1998; Ai-Poh Loh et al., 1999; Kojić & Annaswamy, 2002), with a monotonicity condition (Tyukin et al., 2007; I. Tyukin, 2011), through the Immersion and Invariance (I&I) approach (Astolfi & Ortega, 2003; Liu et al., 2010), and through the speed gradient methodology (Fradkov, 1979; A. L. Fradkov, 1999; Ortega, 1996).
Our work continues in a recent tradition that utilizes a continuous-time view to analyze optimization algorithms; see, for example, Wibisono et al. (2016); Betancourt et al. (2018); Boffi & Slotine (2020); Muehlebach & Jordan (2019); Diakonikolas & Jordan (2019); Maddison et al. (2018). While the continuous-time view of optimization has seen a resurgence after it was used by Su et al. (2016) to provide an intuitive justification for Nesterov’s accelerated gradient method (Nesterov, 1983), continuous-time differential equations were used as early as 1964 by Polyak to derive the classical momentum or “heavy ball” optimization method (B. Polyak, 1964). Given the gradient-based nature of many adaptive control algorithms, the continuous-time view of optimization provides a natural bridge from modern optimization to modern adaptive control.
Continuous time often affords simpler proofs, and it enables the application of physical intuition when reasoning about optimization algorithms, but finding the limiting differential equations may still be a daunting task. In a significant advance, Wibisono et al. (2016) showed that many accelerated methods in optimization can be derived via a variational point of view from a single mathematical object known as the Bregman Lagrangian.
In this paper, we contribute to the linearly and nonlinearly parameterized (under the monotonicity assumptions of Tyukin et al. (2007)) problems. We utilize the Bregman Lagrangian in tandem with the speed gradient formalism (Fradkov, 1979; A. L. Fradkov, 1999) to define a general methodology to generate higher-order in-time (Morse, 1992) speed gradient algorithms. This contribution generalizes and extends a recently developed algorithm (Gaudio et al., 2019), which can be seen as a special case of one of our higher-order speed gradient laws. Based on the first-order speed gradient methodology, these higher-order laws lead naturally to the development of composite higher-order adaptation algorithms for linearly parameterized systems (J.-J. Slotine & Li, 1991). By use of a proportional-integral (PI) form, these composite laws are driven directly by the function approximation error itself, and do not require any explicit filtering of the system dynamics. Much like the well-known reduced-order Luenberger observer, the PI form enables obtaining the function approximation error in the adaptation law despite the fact that this signal is not explicitly measured (Luenberger, 1979).
By analogy between the nonlinearly parameterized law presented by Tyukin et al. (2007) and recent results in isotonic regression (Kakade et al., 2011; Goel & Klivans, 2017; Goel et al., 2018), we extend these higher-order algorithms to the nonlinearly parameterized setting. In a similar vein, we draw an orthogonal connecting thread to machine learning, and demonstrate a stable modification to our algorithms inspired by the Elastic Averaging Stochastic Gradient Descent (EASGD) algorithm (Zhang et al., 2014). We then show how to combine all of our algorithms with time-dependent learning rates through the bounded gain forgetting formalism (J.-J. Slotine & Li, 1991).
The paper is organized as follows. In Sec. 2, we present some required mathematical background. This includes a basic review of direct adaptive control (Sec. 2.1), the speed gradient formalism (Sec. 2.2), the Bregman Lagrangian and a higher-order adaptive control algorithm (Sec. 2.3), and the reduced-order Luenberger observer (Sec. 2.4). Sec. 3 presents our main contributions, with our higher-order speed gradient laws in Sec. 3.1, our first-order non-filtered composite laws in Sec. 3.2, our higher-order composite laws and higher-order laws for nonlinearly parameterized systems in Sec. 3.3, the elastic modification in Sec. 3.4, and our extension to time-dependent learning rates in Sec. 3.5. We conclude with some closing remarks and future directions in Sec. 4.
2.1 Direct adaptive control
We begin with an introduction to the formalism of direct adaptive control, and describe the systems to which our results apply. For simplicity, we restrict ourselves to the class of $n$-th-order nonlinear systems
$$x^{(n)} = f(\mathbf{x}, \boldsymbol{\alpha}, t) + u(\mathbf{x}, t), \qquad (1)$$
where $x^{(n)}$ denotes the $n$-th derivative of $x$, $\mathbf{x} = \left(x, \dot{x}, \ldots, x^{(n-1)}\right)^{\mathsf{T}}$ is the overall system state, $\boldsymbol{\alpha}$ is a vector of unknown parameters, $f(\mathbf{x}, \boldsymbol{\alpha}, t)$ is of known functional form but is unknown through its dependence on $\boldsymbol{\alpha}$, and $u(\mathbf{x}, t)$ is the control input. We seek to design a feedback control law $u(\mathbf{x}, \hat{\boldsymbol{\alpha}}, t)$ that depends on a set of estimated parameters $\hat{\boldsymbol{\alpha}}$ and ensures that $x(t) \to x_d(t)$, where $x_d(t)$ is a known desired trajectory. We require that, along the way, the system remains stable and all system signals remain bounded. The estimated parameters are updated according to a learning rule or adaptation law
$$\dot{\hat{\boldsymbol{\alpha}}} = g(\mathbf{x}, \hat{\boldsymbol{\alpha}}, t), \qquad (2)$$
where $g$ must be implementable solely in terms of known system signals despite its potential dependence on the unknown parameters $\boldsymbol{\alpha}$. Hence, adaptive control is fundamentally an online learning problem where the data-generating process is a nonlinear dynamical system coupled in feedback to the learning process. For $n$-th-order systems as considered in (1), a common approach is to define the sliding variable (J.-J. Slotine & Li, 1991)
$$s = \left(\frac{d}{dt} + \lambda\right)^{n-1}\tilde{x}, \qquad (3)$$
where we have defined the tracking error $\tilde{x} = x - x_d$ with $\lambda > 0$ a design constant, and the reference signal $x_r$ through the remainder $x_r^{(n-1)} = x^{(n-1)} - s$. According to the definition (3), $s$ obeys the differential equation
$$\dot{s} = f(\mathbf{x}, \boldsymbol{\alpha}, t) + u - x_r^{(n)}. \qquad (4)$$
Hence, from (4), we may choose $u = x_r^{(n)} - f(\mathbf{x}, \hat{\boldsymbol{\alpha}}, t) - \eta s$ with $\eta > 0$ to obtain the stable first-order linear filter
$$\dot{s} = -\eta s - \tilde{f}. \qquad (5)$$
For future convenience, we have defined $\tilde{f}(\mathbf{x}, \hat{\boldsymbol{\alpha}}, \boldsymbol{\alpha}, t) = f(\mathbf{x}, \hat{\boldsymbol{\alpha}}, t) - f(\mathbf{x}, \boldsymbol{\alpha}, t)$, and we will omit its arguments when clear from the context. From the definition of $s$ in (3), $s$ defines the dynamics
$$\left(\frac{d}{dt} + \lambda\right)^{n-1}\tilde{x} = s. \qquad (6)$$
Equation (6) is a stable $(n-1)$-th-order filter which ensures that $s \to 0$ implies $\tilde{x} \to 0$ exponentially. For systems of the form (1), it is thus sufficient to consider the two first-order dynamics (2) and (5), and the adaptive control problem has been reduced to finding a learning algorithm (2) that ensures $s \to 0$.
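The reduction above is straightforward to make concrete. The following sketch assumes the standard binomial expansion of the sliding variable $s = \left(\frac{d}{dt} + \lambda\right)^{n-1}\tilde{x}$; the function names and interface are illustrative rather than taken from the text.

```python
from math import comb

def sliding_coeffs(n: int, lam: float) -> list:
    """Binomial coefficients of (d/dt + lam)^{n-1}: entry k multiplies
    the (n-1-k)-th derivative of the tracking error tilde_x."""
    return [comb(n - 1, k) * lam ** k for k in range(n)]

def sliding_variable(err_derivs: list, lam: float) -> float:
    """Evaluate s from [tilde_x^{(n-1)}, ..., tilde_x_dot, tilde_x],
    ordered from highest derivative to lowest."""
    coeffs = sliding_coeffs(len(err_derivs), lam)
    return sum(c * d for c, d in zip(coeffs, err_derivs))
```

For $n = 2$ this reduces to the familiar $s = \dot{\tilde{x}} + \lambda\tilde{x}$.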
Systems in the matched uncertainty form
$$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{b}\left(f(\mathbf{x}, \boldsymbol{\alpha}) + u\right),$$
where the constant pair $(\mathbf{A}, \mathbf{b})$ is controllable and the constant parameter vector $\boldsymbol{\alpha}$ in the nonlinear function $f(\mathbf{x}, \boldsymbol{\alpha})$ is unknown, can always be put in the form (1) by using a state transformation to the second controllability canonical form; see Luenberger (1979), Chapter 8.8. After such a transformation, the new state variables $z_i$ satisfy $\dot{z}_i = z_{i+1}$ for $i = 1, \ldots, n-1$ and $\dot{z}_n = f(\mathbf{z}, \boldsymbol{\alpha}, t) + u$. Defining $s$ as in (3) and correctly computing $x_r^{(n)}$ leads to (5). Hence, all results in this paper extend immediately to such systems.
The fundamental utility of defining the variable $s$ is its conversion of the adaptive control problem for the $n$-th-order system (1) to an adaptive control problem for the first-order system (5). Our results may be simply extended to other error models (Narendra & Annaswamy, 2005; Ai-Poh Loh et al., 1999) of the form (5), or error models with similar input-output guarantees, as summarized by Lemma A.1.
We will use (5), $\dot{s} = -\eta s - \tilde{f}$, to denote the equivalent first-order system to (1), where $s$ is the sliding variable (3) and $\tilde{f}$ is the function approximation error.
A classic setting for adaptive control is when the unknown nonlinear dynamics depends linearly on the set of unknown parameters, that is,
$$f(\mathbf{x}, \boldsymbol{\alpha}, t) = \mathbf{Y}(\mathbf{x}, t)\boldsymbol{\alpha},$$
with $\mathbf{Y}(\mathbf{x}, t)$ a known function. In this setting, a well-known algorithm is the adaptive controller of J.-J. Slotine & Coetsee (1986), given by
$$u = x_r^{(n)} - \mathbf{Y}(\mathbf{x}, t)\hat{\boldsymbol{\alpha}} - \eta s, \qquad \dot{\hat{\boldsymbol{\alpha}}} = \mathbf{P}\mathbf{Y}(\mathbf{x}, t)^{\mathsf{T}}s, \qquad (7)$$
and its extension to multi-input adaptive robot control (J.-J. E. Slotine & Li, 1987), where $\mathbf{P}$ is a constant positive definite matrix of learning rates. Consideration of the Lyapunov-like function $V = \frac{1}{2}s^{2} + \frac{1}{2}\tilde{\boldsymbol{\alpha}}^{\mathsf{T}}\mathbf{P}^{-1}\tilde{\boldsymbol{\alpha}}$, with $\tilde{\boldsymbol{\alpha}} = \hat{\boldsymbol{\alpha}} - \boldsymbol{\alpha}$, shows stability of the feedback interconnection of (5) and (7) via an application of Barbalat’s Lemma (J.-J. Slotine & Li, 1991). We will refer to (7) as the Slotine and Li controller.
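To illustrate the feedback interconnection of (5) and (7), the following sketch simulates a Slotine-and-Li-style adaptive controller on a scalar first-order ($n = 1$) system, for which $s = x - x_d$. The regressor, gains, and desired trajectory are illustrative choices, and the sign conventions follow the reconstruction used in this section rather than any specific source.

```python
import numpy as np

# Adaptive tracking for dx/dt = Y(x, t) @ alpha + u with alpha unknown.
def Y(x, t):
    return np.array([np.sin(x), x])              # known regressor

def simulate(T=20.0, dt=1e-3, eta=2.0):
    alpha = np.array([1.0, -0.5])                # true parameters (unknown to controller)
    alpha_hat = np.zeros(2)                      # parameter estimates
    P = 2.0 * np.eye(2)                          # positive definite learning rates
    x = 1.0
    for k in range(int(T / dt)):
        t = k * dt
        xd, xd_dot = np.sin(t), np.cos(t)        # desired trajectory and its derivative
        s = x - xd                               # sliding variable for n = 1
        u = xd_dot - Y(x, t) @ alpha_hat - eta * s   # certainty-equivalent control
        alpha_hat = alpha_hat + dt * (P @ Y(x, t) * s)  # adaptation law (7)
        x = x + dt * (Y(x, t) @ alpha + u)       # true (unknown) dynamics
    return abs(x - np.sin(T))                    # final tracking error

final_err = simulate()
```

The Lyapunov argument above guarantees $s \to 0$; in this sketch the final tracking error is small even though the parameter estimates need not converge to the true values.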
In this work, we make a mild additional assumption that simplifies some of the proofs. It requires the following definition.
A function $f(\mathbf{x}, \boldsymbol{\alpha}, t)$ is said to be locally bounded in $\mathbf{x}$ and $\boldsymbol{\alpha}$ uniformly in $t$ if it is bounded whenever $\|\mathbf{x}\|$ and $\|\boldsymbol{\alpha}\|$ are finite, with a bound independent of $t$.
Following Definition 2.1, we make the following assumption.
The dynamics $f(\mathbf{x}, \boldsymbol{\alpha}, t)$ in (1) is locally bounded in $\mathbf{x}$ and $\boldsymbol{\alpha}$ uniformly in $t$.
2.2 Speed gradient algorithms
We now provide a brief introduction to a class of adaptive control methods known as speed gradient algorithms (A. L. Fradkov, 1999; Fradkov, 1979; Ortega, 1996). Speed gradient algorithms are applicable to nonlinearly parameterized systems that satisfy a convexity requirement described in Assumption 2.4. In their most basic form, these algorithms are specified by a “local” goal functional $Q(\mathbf{x}, t)$, which is required to satisfy three main assumptions.
$Q$ is non-negative and radially unbounded: $Q(\mathbf{x}, t) \geq 0$ for all $\mathbf{x}$ and $t$, and $Q(\mathbf{x}, t) \to \infty$ when $\|\mathbf{x}\| \to \infty$. $Q$ is also uniformly continuous in $t$ whenever $\mathbf{x}$ is bounded.
There exists an ideal set of control parameters $\boldsymbol{\alpha}$ such that the origin of the system (1) is globally asymptotically stable when the control is evaluated at $\hat{\boldsymbol{\alpha}} = \boldsymbol{\alpha}$. Furthermore, $Q$ is a Lyapunov function for the system when the control is evaluated at $\hat{\boldsymbol{\alpha}} = \boldsymbol{\alpha}$. That is, there exists a strictly increasing function $\rho$ with $\rho(0) = 0$ such that $\dot{Q}(\mathbf{x}, \boldsymbol{\alpha}, t) \leq -\rho(Q)$.
The time-derivative $\dot{Q}$ of $Q$ is convex in the control parameters $\hat{\boldsymbol{\alpha}}$. The first-order condition for convexity,
$$\dot{Q}(\mathbf{x}, \mathbf{y}, t) - \dot{Q}(\mathbf{x}, \hat{\boldsymbol{\alpha}}, t) \geq \left(\mathbf{y} - \hat{\boldsymbol{\alpha}}\right)^{\mathsf{T}}\nabla_{\hat{\boldsymbol{\alpha}}}\dot{Q}(\mathbf{x}, \hat{\boldsymbol{\alpha}}, t), \qquad (8)$$
must be satisfied for all $\mathbf{y}$ and $\hat{\boldsymbol{\alpha}}$.
The speed gradient algorithm is then given by
$$\dot{\hat{\boldsymbol{\alpha}}} = -\mathbf{P}\nabla_{\hat{\boldsymbol{\alpha}}}\dot{Q}(\mathbf{x}, \hat{\boldsymbol{\alpha}}, t), \qquad (9)$$
where $\mathbf{P}$ is a positive definite matrix of learning rates. Its properties are summarized in the following proposition (A. L. Fradkov, 1999).
The proof follows by consideration of the Lyapunov function $V = Q + \frac{1}{2}\tilde{\boldsymbol{\alpha}}^{\mathsf{T}}\mathbf{P}^{-1}\tilde{\boldsymbol{\alpha}}$. Intuitively, while the goal functional $Q$ may only depend on the control parameters indirectly through $\mathbf{x}$, its time derivative will depend explicitly on $\hat{\boldsymbol{\alpha}}$ through $\dot{\mathbf{x}}$. The adaptation law (9) ensures that $\hat{\boldsymbol{\alpha}}$ moves in a direction to decrease $\dot{Q}$. Under the conditions specified by Assumptions 2.2–2.4, this causes $\dot{Q}$ to be negative for long enough to accomplish the desired goal (A. L. Fradkov, 1999).
If $Q$ is chosen so that $\dot{Q}$ depends on $\hat{\boldsymbol{\alpha}}$ only through $\tilde{f}$ and $f$ is linearly parameterized, then Assumption 2.4 will immediately be satisfied by convexity of affine functions. Indeed, consider defining the goal functional $Q = \frac{1}{2}s^{2}$ for system (1). It is clear that this proposed goal functional satisfies Assumptions 2.2 and 2.3 (strictly speaking, $Q$ does not have to be uniformly continuous in $t$ for bounded $\mathbf{x}$; this is a technical condition useful for the proof, but in many cases stability may still be shown via other means). Then $\dot{Q} = s\dot{s} = -\eta s^{2} - s\tilde{f}$, and (9) exactly recovers the Slotine and Li controller (7), originally derived based on Lyapunov considerations (note that $\dot{Q}$ is still convex in $\hat{\boldsymbol{\alpha}}$ despite the fact that $s$ may change sign, because $\tilde{f}$ is linear in $\hat{\boldsymbol{\alpha}}$ by assumption). In this sense, speed gradient algorithms represent a flexible class of methods that contain as particular cases some pre-existing approaches.
Rather than a local functional, one may instead specify an integral goal functional of the form $Q(t) = \int_{0}^{t} R(\mathbf{x}(\tau), \hat{\boldsymbol{\alpha}}(\tau), \tau)\,d\tau$. In this case, (9) takes the form
$$\dot{\hat{\boldsymbol{\alpha}}} = -\mathbf{P}\nabla_{\hat{\boldsymbol{\alpha}}}R(\mathbf{x}, \hat{\boldsymbol{\alpha}}, t), \qquad (10)$$
where $R$ must satisfy the following two assumptions.
$R$ is a non-negative function: $R(\mathbf{x}, \hat{\boldsymbol{\alpha}}, t) \geq 0$ for all $\mathbf{x}$, $\hat{\boldsymbol{\alpha}}$, and $t$.
There exists an ideal set of controller parameters $\boldsymbol{\alpha}$ and a scalar function $\rho(t)$ such that $\rho(t) \geq 0$, $\int_{0}^{\infty}\rho(\tau)\,d\tau < \infty$, and $R(\mathbf{x}, \boldsymbol{\alpha}, t) \leq \rho(t)$ for all $t$.
Integral functionals allow the specification of a control goal that depends on all past data and that may have an explicit dependence on $\hat{\boldsymbol{\alpha}}$. $R$ is chosen so that the resulting law does not depend on the structure of the dynamics. Local functionals, on the other hand, result in adaptation laws that do have an explicit dependence on the dynamics through the appearance of $\dot{\mathbf{x}}$ in $\dot{Q}$.
Integral functionals can be particularly useful if $R \to 0$ implies the desired control goal, and in this work, we will generally focus on the setting where $R$ measures the function approximation error. Goal functionals can also be written as a sum of local and integral functionals with similar guarantees, and these approaches will lead to composite algorithms in the subsequent sections; the interested reader is referred to A. L. Fradkov (1999), Chapter 3 for more details.
2.3 The Bregman Lagrangian and accelerated optimization algorithms
Beginning with the seminal paper of Su et al. (2016), there has been a recent revival of interest in the analysis of optimization algorithms as discretizations of continuous-time ordinary differential equations. From this angle, the analysis of an optimization algorithm can be broken into two steps: first, an understanding of the quantitative convergence rates of the continuous-time differential equation, and second, a search over numerical discretization techniques that preserve these convergence rates. In Wibisono et al. (2016), the Bregman Lagrangian was shown to generate a suite of accelerated optimization algorithms by appealing to the Euler-Lagrange equations through the principle of least action. In its original form, the Bregman Lagrangian is given by
$$\mathcal{L}(\mathbf{x}, \dot{\mathbf{x}}, t) = e^{\alpha_t + \gamma_t}\left(D_{\psi}\left(\mathbf{x} + e^{-\alpha_t}\dot{\mathbf{x}}, \mathbf{x}\right) - e^{\beta_t} f(\mathbf{x})\right), \qquad (11)$$
where $\psi$ is a distance-generating function that may be taken to be the squared Euclidean norm $\psi(\mathbf{x}) = \frac{1}{2}\|\mathbf{x}\|^{2}$ in the standard Euclidean setting, $f$ is the loss function, and $D_{\psi}$ is the Bregman divergence
$$D_{\psi}(\mathbf{x}, \mathbf{y}) = \psi(\mathbf{x}) - \psi(\mathbf{y}) - \nabla\psi(\mathbf{y})^{\mathsf{T}}(\mathbf{x} - \mathbf{y}).$$
The quantities $\alpha_t$, $\beta_t$, and $\gamma_t$ are arbitrary time-dependent functions that ultimately set the damping and learning rates for the resulting algorithms. The Bregman divergence may be understood as the discrepancy between $\psi$ evaluated at two points and the value predicted by a first-order Taylor expansion. It is guaranteed to be non-negative for convex $\psi$ by the gradient inequality (cf. (8)).
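As a minimal numerical illustration (the helper functions below are our own, not from any referenced work), the divergence can be computed directly from its definition; for the squared Euclidean norm it reduces to $\frac{1}{2}\|\mathbf{x} - \mathbf{y}\|^{2}$:

```python
import numpy as np

def bregman_divergence(psi, grad_psi, x, y):
    """D_psi(x, y) = psi(x) - psi(y) - <grad psi(y), x - y>."""
    return psi(x) - psi(y) - grad_psi(y) @ (x - y)

# Squared Euclidean norm as the distance-generating function.
psi = lambda x: 0.5 * np.dot(x, x)
grad_psi = lambda x: x

x = np.array([1.0, 2.0])
y = np.array([0.0, -1.0])
d = bregman_divergence(psi, grad_psi, x, y)
# For this psi, d equals 0.5 * ||x - y||^2 = 0.5 * (1 + 9) = 5.0
```

Non-negativity of `d` for any convex `psi` is exactly the gradient inequality discussed above.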
To generate accelerated optimization algorithms, Wibisono, Wilson, and Jordan required two ideal scaling conditions: $\dot{\beta}_t \leq e^{\alpha_t}$ and $\dot{\gamma}_t = e^{\alpha_t}$. These conditions come out of the Euler-Lagrange equations, where they are used to eliminate an unwanted term, and a Lyapunov argument, where they are used to ensure negativity of a chosen Lyapunov function.
Gaudio et al. (2019) recently utilized the Bregman Lagrangian to derive a momentum-like adaptive control algorithm. To do so, they fixed the time-dependent functions $\alpha_t$, $\beta_t$, and $\gamma_t$ in terms of two non-negative scalar hyperparameters $\beta$ and $\gamma$ and a context-dependent normalizing signal $\mathcal{N}(t)$ (these choices validate the second ideal scaling condition but not the first; as mentioned above, the second ideal scaling condition is required only by the choice of Lyapunov function in the original work, which was used to derive convergence rates for optimization algorithms (Wibisono et al., 2016), and in this sense the scaling conditions are not required for adaptive control). With these definitions, and in the setting presented in Sec. 2.1 (the authors in Gaudio et al. (2019) consider a linear dynamical system, where the form of (12) differs slightly but identical considerations apply), (11) reduces to (12).
Comparing (11) and (12), it is clear that the loss function $f$ in (11) has been replaced in (12) by the speed gradient goal functional associated with $Q = \frac{1}{2}s^{2}$. Following Remark 2.4, this is precisely the functional that gives rise to the Slotine and Li controller. For (12), the Euler-Lagrange equations lead to the adaptation law
Equation (13) may be understood as a higher-order version of the Slotine and Li adaptive controller. It may also be re-written as two first-order systems
The proof follows by consideration of a suitable Lyapunov-like function (Gaudio et al., 2019).
While this transformation to a system of two first-order systems may seem somewhat ad-hoc, it is readily apparent by use of the Bregman Hamiltonian
which, via Hamilton’s equations, leads to
Defining the momentum variable appropriately leads immediately to (14) & (15). As is typical in classical mechanics, the Bregman Hamiltonian may be obtained from a Legendre transform of the Bregman Lagrangian. The Hamiltonian equations may be useful for discrete-time algorithm development through application of symplectic discretization techniques (Betancourt et al., 2018; França et al., 2019; Shi et al., 2019).
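As a hedged illustration of the symplectic idea (this is a generic damped Hamiltonian system for a quadratic loss, not the specific Bregman Hamiltonian above), semi-implicit Euler updates the momentum first and then updates the position using the new momentum:

```python
import numpy as np

# Heavy-ball-type damped Hamiltonian flow for f(x) = 0.5 * x^T A x,
# discretized with semi-implicit (symplectic) Euler.
A = np.array([[3.0, 1.0], [1.0, 2.0]])       # positive definite quadratic loss
grad_f = lambda x: A @ x

x = np.array([1.0, -1.0])                    # position (parameters)
p = np.zeros(2)                              # momentum
h, damping = 0.05, 2.0                       # step size and friction

for _ in range(2000):
    p = p + h * (-grad_f(x) - damping * p)   # momentum update (uses old x)
    x = x + h * p                            # position update (uses NEW p)

loss = 0.5 * x @ A @ x
```

The momentum-then-position ordering is what distinguishes this scheme from explicit Euler and is the simplest member of the symplectic family referenced above.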
2.4 Reduced-order observers and proportional-integral adaptation laws
The reduced-order Luenberger observer (Luenberger, 1979) is a key tool in linear systems theory. Consider an $n$-dimensional completely observable dynamical system
$$\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}, \qquad \mathbf{y} = \mathbf{C}\mathbf{x}, \qquad (16)$$
where $\mathbf{x} \in \mathbb{R}^{n}$ is the state, $\mathbf{u}$ is the input, $\mathbf{y} \in \mathbb{R}^{m}$ is the system output, and the $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ matrices are known. After a change of coordinates, we may take $\mathbf{y} = \mathbf{x}_{1}$, where $\mathbf{x} = \left(\mathbf{x}_{1}^{\mathsf{T}}, \mathbf{x}_{2}^{\mathsf{T}}\right)^{\mathsf{T}}$ and $\mathbf{x}_{2}$ collects the $n - m$ unmeasured states. Define an auxiliary variable
$$\mathbf{z} = \mathbf{x}_{2} + \mathbf{L}\mathbf{y}, \qquad (17)$$
where $\mathbf{L}$ in (17) is an arbitrary design matrix. Then $\mathbf{z}$ obeys the dynamics
$$\dot{\mathbf{z}} = \left(\mathbf{A}_{22} + \mathbf{L}\mathbf{A}_{12}\right)\left(\mathbf{z} - \mathbf{L}\mathbf{y}\right) + \left(\mathbf{A}_{21} + \mathbf{L}\mathbf{A}_{11}\right)\mathbf{y} + \left(\mathbf{B}_{2} + \mathbf{L}\mathbf{B}_{1}\right)\mathbf{u}, \qquad (18)$$
where the $\mathbf{A}_{ij}$ and $\mathbf{B}_{i}$ denote the corresponding blocks of $\mathbf{A}$ and $\mathbf{B}$, and where it is clear that all quantities in (18) are known except for $\mathbf{z}$ itself. Constructing an observer of identical form,
$$\dot{\hat{\mathbf{z}}} = \left(\mathbf{A}_{22} + \mathbf{L}\mathbf{A}_{12}\right)\left(\hat{\mathbf{z}} - \mathbf{L}\mathbf{y}\right) + \left(\mathbf{A}_{21} + \mathbf{L}\mathbf{A}_{11}\right)\mathbf{y} + \left(\mathbf{B}_{2} + \mathbf{L}\mathbf{B}_{1}\right)\mathbf{u},$$
ensures that the error $\tilde{\mathbf{z}} = \hat{\mathbf{z}} - \mathbf{z}$ obeys the dynamics
$$\dot{\tilde{\mathbf{z}}} = \left(\mathbf{A}_{22} + \mathbf{L}\mathbf{A}_{12}\right)\tilde{\mathbf{z}}.$$
Because the original system is completely observable, $\mathbf{A}_{22} + \mathbf{L}\mathbf{A}_{12}$ can be selected by choice of $\mathbf{L}$ to have arbitrary eigenvalues by the pole placement theorem. The original state variables can then be reconstructed as $\hat{\mathbf{x}}_{1} = \mathbf{y}$ and $\hat{\mathbf{x}}_{2} = \hat{\mathbf{z}} - \mathbf{L}\mathbf{y}$.
While $\dot{\mathbf{y}}$ is unknown through its dependence on the unmeasured state $\mathbf{x}_{2}$, $\mathbf{y}$ itself is known. Intuitively, unknown quantities contained in $\dot{\mathbf{y}}$ can thus be obtained in the observer dynamics through a proportional term containing $\mathbf{y}$. Similar concepts can be extended to nonlinear observers; see Lohmiller & Slotine (1998), Sec. 4.1. This idea of gaining a “free” derivative has also been used in adaptive control, with particular success when applied to nonlinear parameterizations. Proportional-integral adaptive laws of this type have been known as algorithms in finite form (A. L. Fradkov, 1999; I. Y. Tyukin, 2003) and appear in the well-known I&I framework (Astolfi & Ortega, 2003; Liu et al., 2010). This approach will be the basis for our algorithms for nonlinearly parameterized systems.
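A minimal sketch of the reduced-order construction, for a two-state system with $y = x_{1}$ measured and with illustrative matrices and gain (not taken from the text): the auxiliary variable $z = x_{2} + Ly$ is propagated using only known signals, and $x_{2}$ is reconstructed as $\hat{z} - Ly$ without ever differentiating $y$.

```python
import numpy as np

# Known system matrices (illustrative): dx/dt = A x + B u, y = x1.
a11, a12, a21, a22 = 0.0, 1.0, -2.0, -1.0
b1, b2 = 0.0, 1.0
L = -4.0                                     # design gain: error pole a22 + L*a12 = -5

dt, T = 1e-3, 5.0
x1, x2 = 1.0, -1.0                           # true state (x2 is unmeasured)
z_hat = 0.0                                  # reduced-order observer state

for k in range(int(T / dt)):
    t = k * dt
    y, u = x1, np.sin(t)                     # measured output and known input
    x2_hat = z_hat - L * y                   # current reconstruction of x2
    # observer copies the known dynamics of z = x2 + L*y
    z_hat_dot = (a22 + L * a12) * x2_hat + (a21 + L * a11) * y + (b2 + L * b1) * u
    # true plant (unknown to the observer only through x2)
    x1_dot = a11 * x1 + a12 * x2 + b1 * u
    x2_dot = a21 * x1 + a22 * x2 + b2 * u
    x1, x2 = x1 + dt * x1_dot, x2 + dt * x2_dot
    z_hat = z_hat + dt * z_hat_dot

est_err = abs((z_hat - L * x1) - x2)         # reconstruction error in x2
```

The estimation error decays at the rate set by the designed pole, here $a_{22} + La_{12} = -5$, despite the large initial mismatch.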
3 Main contributions
In this section, we present the main contributions of this work. We begin by noting that the Bregman Lagrangian generates higher-order speed gradient algorithms, of which the adaptation law (13) is a special case. We prove some general conditions under which these higher-order algorithms will achieve tracking. By analogy with integral speed gradient functionals, we derive a proportional-integral scheme to implement a first-order composite adaptation law (J.-J. Slotine & Li, 1991) driven directly by the function approximation error rather than its filtered version. We subsequently fuse the generating functional for the composite law with the Bregman Lagrangian to construct a higher-order composite algorithm.
Combining a connection between the techniques of isotonic regression (Kakade et al., 2011; Goel & Klivans, 2017; Goel et al., 2018) and algorithms for monotone nonlinear parameterizations (Tyukin et al., 2007; I. Tyukin, 2011), we demonstrate how to modify our higher-order speed gradient framework to derive higher-order algorithms for nonlinearly parameterized systems. We follow this development by discussing a new form of higher-order algorithm inspired by the Elastic Averaging Stochastic Gradient Descent (EASGD) algorithm (Zhang et al., 2014; Boffi & Slotine, 2020) and extensions to distributed adaptation (Wensing & Slotine, 2017). We conclude by demonstrating how to use time-varying learning rates based on the bounded gain forgetting technique with our presented algorithms (J.-J. Slotine & Li, 1991).
3.1 Accelerated speed gradient algorithms
As noted in Sec. 2.3, the Bregman Lagrangian (12) that generates the higher-order algorithm (13) contains the local speed gradient functional that gives rise to the Slotine and Li controller. Based on this observation, we define local and integral higher-order speed gradient algorithms via the Bregman Lagrangian. We begin with the local functional
which generates the higher-order law
Algorithm (19) can be re-written as two first-order systems
There exists a time-varying normalizing signal $\mathcal{N}(t)$ and non-negative scalar values $\beta$ and $\gamma$ such that the time-derivative of the goal functional evaluated at the true parameters, $\dot{Q}(\mathbf{x}, \boldsymbol{\alpha}, t)$, satisfies the following inequality,
where $\rho$ is positive definite, uniformly continuous in $t$, and satisfies $\rho(\mathbf{0}, t) = 0$ for all $t$.
With Assumption 3.1 in hand, we can state the following proposition.
Consider the Lyapunov-like function
Equation (23) implies that, along system trajectories,
By radial unboundedness of $V$ in the parameter estimation error, (23) & (24) show that the parameter estimates remain bounded. Similarly, radial unboundedness of $V$ in the remaining error variables shows that the auxiliary variables and $\mathbf{x}$ remain bounded. Integrating (24) shows that the decrement on its right-hand side is integrable, so that $\rho \in \mathcal{L}_{1}$; an identical argument applies to the remaining non-negative terms. Now, because $\mathbf{x}$ and $\hat{\boldsymbol{\alpha}}$ are bounded, and because $f$ is locally bounded in $\mathbf{x}$ and $\boldsymbol{\alpha}$ uniformly in $t$ by assumption, writing $\dot{s} = -\eta s - \tilde{f}$ shows that $s$ is uniformly continuous in $t$. Because $Q$ is uniformly continuous when $\mathbf{x}$ is bounded, and because $\rho$ is uniformly continuous in $t$, the integrand is uniformly continuous in $t$ and tends to zero by Barbalat’s lemma. This shows that $s \to 0$. ∎
The uniform continuity assumptions on $Q$ and $\rho$ used in the general setting handled by the proof of Prop. 3.1 are not strictly necessary. Without them, we can conclude existence of the integral, but not that its integrand tends to zero. In many cases, signal chasing arguments based on the finiteness of this integral are sufficient, as will be shown in the coming sections.
By a suitable choice of the hyperparameters in Prop. 3.1, we immediately recover Prop. 2.3 (following Remark 3.1, we conclude integrability of the decrement, and local boundedness of the dynamics in $\mathbf{x}$ and $\boldsymbol{\alpha}$ uniformly in $t$ then gives $s \to 0$ by Barbalat’s lemma). In this sense, Prop. 3.1 elucidates the underlying structure exploited by the Bregman Lagrangian to generate higher-order algorithms.
We now consider the integral functional
which generates the higher-order law
We again re-write (25) as two first-order systems
We now require a modified version of Assumption 3.1.
$R$ is a non-negative function: $R(\mathbf{x}, \hat{\boldsymbol{\alpha}}, t) \geq 0$ for all $\mathbf{x}$, $\hat{\boldsymbol{\alpha}}$, and $t$. Furthermore, there exists a time-dependent normalizing signal $\mathcal{N}(t)$ and non-negative scalar values $\beta$ and $\gamma$ such that
for some constant $c > 0$.
With Assumption 3.2, we can state the following proposition.
Classically, Lyapunov functions used in adaptive control consist of a sum of tracking and parameter estimation error terms, with the adaptation law chosen to cancel a term of unknown sign. Several Lyapunov functions in this work consist only of parameter estimation error terms, such as (28). From a mathematical point of view, all that matters is that $\dot{V}$ is negative semi-definite and contains signals related to the tracking error. Integrating $\dot{V}$ then allows the application of tools from functional analysis to ensure that the control goal is accomplished.
3.2 Composite adaptation laws
Here we consider the linearly parameterized setting $f(\mathbf{x}, \boldsymbol{\alpha}, t) = \mathbf{Y}(\mathbf{x}, t)\boldsymbol{\alpha}$, and derive new first- and second-order composite adaptation laws. Composite adaptation laws are driven by two sources of error: the tracking error itself, as summarized by $s$ in the Slotine and Li controller, and a prediction error generally obtained from an algebraic relation which is itself constructed by filtering the dynamics (J.-J. Slotine & Li, 1991). A starting point for our first proposed algorithm is to consider a hybrid local and integral speed gradient functional
where $\gamma > 0$ and $\beta > 0$ are design parameters weighting the contributions of each term. As discussed in Sec. 2.2, the first term leads to the Slotine and Li controller. The second can be clearly seen to satisfy Assumptions 2.5 and 2.6 with $R = \frac{1}{2}\tilde{f}^{2}$. It also satisfies Assumption 2.4, as $R$ is a quadratic function of $\hat{\boldsymbol{\alpha}}$ for linear $f$. Following the speed gradient formalism, the resulting adaptation law is given by
$$\dot{\hat{\boldsymbol{\alpha}}} = \mathbf{P}\mathbf{Y}^{\mathsf{T}}\left(\gamma s - \beta\tilde{f}\right), \qquad (31)$$
which is a composite adaptation law simultaneously driven by $s$ and the instantaneous function approximation error $\tilde{f}$. Equation (31) depends on $\tilde{f}$, which is not measured and hence cannot be used directly in an adaptation law. Nevertheless, it can be obtained through a PI law as discussed in Sec. 2.4. To do so, we define
More error signals may be used for additional terms in the adaptation law. For example, a prediction error obtained by filtering the dynamics may also be employed, leading to a three-term composite algorithm.
As mentioned in Sec. 2.1, for clarity of presentation we have restricted our discussion to the $n$-th-order system (1). In general, the PI form (34) leads to unwanted unknown terms contained in $\dot{\mathbf{x}}$ in addition to the desired unknown term. In this case, the desired unknown term is the function approximation error $\tilde{f}$, while the undesired unknown terms arise from the unmeasured components of $\dot{\mathbf{x}}$. Indeed, the purpose of introducing the additional proportional term in (32) is to cancel the undesired unknown terms. In general, cancellation of the undesired terms can be obtained by choosing the proportional term to solve a PDE, and solutions to this PDE will only exist if the undesired term is the gradient of an auxiliary function; the proportional term is then set to this auxiliary function. In some cases, the PDE can be avoided, such as through dynamic scaling techniques (Karagiannis et al., 2009) or the similar embedding technique of Tyukin (I. Tyukin, 2011).
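To make the PI mechanism concrete, the sketch below implements a composite law of the form $\dot{\hat{\boldsymbol{\alpha}}} = \mathbf{P}\mathbf{Y}^{\mathsf{T}}(\gamma s - \beta\tilde{f})$ for a scalar first-order system, in the special case where the regressor depends only on $t$ so that $\dot{\mathbf{Y}}$ is known and no PDE needs to be solved. Since (5) gives $\tilde{f} = -(\dot{s} + \eta s)$, the unmeasurable $\dot{s}$ is supplied by a proportional term $\beta\mathbf{P}\mathbf{Y}^{\mathsf{T}}s$, whose time derivative contributes $\beta\mathbf{P}\mathbf{Y}^{\mathsf{T}}\dot{s}$, while an integral state absorbs the remaining measured terms. All gains, signals, and the regressor are illustrative, and the notation follows the reconstruction used in this section.

```python
import numpy as np

# PI realization of alpha_hat_dot = P Y^T (gamma*s - beta*f_tilde) for
# dx/dt = Y(t) @ alpha + u, with Y a function of t only (so Y_dot is known).
Y = lambda t: np.array([np.sin(t), np.cos(2 * t)])          # known regressor
Y_dot = lambda t: np.array([np.cos(t), -2 * np.sin(2 * t)])  # its known derivative

alpha = np.array([1.0, -0.5])        # true parameters (unknown to the controller)
eta, gamma, beta = 2.0, 2.0, 1.0
P = np.eye(2)

dt, T = 1e-3, 20.0
x, bar_alpha = 1.0, np.zeros(2)      # state and integral part of the estimate
for k in range(int(T / dt)):
    t = k * dt
    xd, xd_dot = np.sin(t), np.cos(t)
    s = x - xd
    alpha_hat = bar_alpha + beta * P @ Y(t) * s             # integral + proportional parts
    u = xd_dot - Y(t) @ alpha_hat - eta * s
    # integral state: driven only by measured signals; its role is to make the
    # total derivative of alpha_hat equal P Y^T (gamma*s - beta*f_tilde)
    bar_alpha_dot = gamma * P @ Y(t) * s + beta * P @ (eta * Y(t) - Y_dot(t)) * s
    bar_alpha = bar_alpha + dt * bar_alpha_dot
    x = x + dt * (Y(t) @ alpha + u)  # true (unknown) dynamics

final_s = abs(x - np.sin(T))
```

Differentiating $\hat{\boldsymbol{\alpha}} = \bar{\boldsymbol{\alpha}} + \beta\mathbf{P}\mathbf{Y}^{\mathsf{T}}s$ along trajectories recovers exactly $\gamma\mathbf{P}\mathbf{Y}^{\mathsf{T}}s - \beta\mathbf{P}\mathbf{Y}^{\mathsf{T}}\tilde{f}$, so the law is driven by $\tilde{f}$ without ever measuring it.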
The properties of the adaptive law (31) may be summarized with the following proposition.
Consider the adaptation algorithm (31) with a linearly parameterized unknown, $f = \mathbf{Y}(\mathbf{x}, t)\boldsymbol{\alpha}$. Then all trajectories remain bounded, $s \in \mathcal{L}_{2} \cap \mathcal{L}_{\infty}$, $\tilde{f} \in \mathcal{L}_{2}$, $s \to 0$, and $\tilde{x} \to 0$.
Following the accelerated speed gradient approach of Sec. 3.1, we now obtain a higher-order composite algorithm, and give a PI implementation. We again consider a hybrid local and integral speed gradient functional, so that (11) takes the form
where $\gamma$ and $\beta$ are positive constants weighting the two error terms. The Euler-Lagrange equations then lead to the higher-order composite system
where now (38) is obtained through the PI form with the proportional term, the integral term, and the auxiliary function given by (32), (33), and (3.2), respectively. The properties of the higher-order composite adaptation law (37) are stated in the following proposition.