1 Introduction
Ordinary differential equations relating the applied input forces to the system state variables are commonly employed in modeling structural dynamic systems. Direct measurements of the system state variables are not always possible; instead, noisy observations that are functions of the system state variables are obtained at discrete time instants. State estimation addresses the problem of obtaining optimal estimates of the unobserved system states from noisy measurements, given knowledge of the system model and the dynamic input forces. This problem is at the core of many engineering applications, ranging from condition monitoring and prediction of stresses to feedback control and performance evaluation. The Kalman filter and its variants have been widely used for state estimation of a broad class of both linear and nonlinear dynamical systems [1, 2, 3].
A significant challenge in dynamical state estimation arises due to inadequate knowledge of the inputs acting on the system. In many civil engineering systems such as buildings or bridges, measuring the operational external forces (wind loads, vehicular loads, etc.) is not practical. A commonly adopted approach in such cases of unmeasured input forces is to assume that the input force is a zero-mean white noise process and then apply filtering techniques for state estimation; however, in many cases the actual inputs may violate this white noise assumption and could yield poor estimation results. Thus, there is a need to inversely determine the unmeasured input forces. Input estimation, also known as force identification, involves reconstructing the latent or unmeasured dynamic input forces given a limited number of sensor measurements and knowledge of the system model, obtained either from a finite element model or inferred a priori using system identification methods. Cases where input estimation and state estimation are performed together are referred to as joint input-state estimation.
Originally, force identification problems were treated separately from state estimation problems, and the earliest algorithms used frequency response functions to inversely compute the unknown forces [4, 5]. Subsequently, a wide variety of force identification methods were proposed, both in the time domain and the frequency domain (see [6] for a recent survey on force reconstruction techniques). Among time-domain approaches, two distinct modelling frameworks exist: deterministic [7, 8] and stochastic [9, 10, 11]. The stochastic time-domain approaches differ from their deterministic counterparts in the manner in which uncertainty, in the form of noise, is incorporated in the system and measurement models, and the state and/or input estimates obtained under the stochastic framework are characterized by probability distributions.
Recently, there has been considerable interest in developing and applying stochastic joint input-state estimation algorithms. Gillijns and De Moor [12] proposed a minimum-variance unbiased filter for joint input and state estimation of linear systems without a direct transmission term, where the input estimation and state estimation are performed in two independent, sequential steps. This algorithm was extended to include the direct transmission term [13] and was later applied to structural dynamics by Lourens et al. [14] for use with reduced-order models (i.e., models obtained using a relatively small number of modes). Lourens et al. [15] also proposed an augmented Kalman filter (AKF) that appends the unknown forces to the state vector to jointly estimate the dynamic forces and system states using a classical Kalman filter. This method results in spurious low-frequency components in the force and displacement estimates when only acceleration measurements are used. An analytical investigation in [16] showed that the augmented state-space formulation of the AKF is susceptible to numerical instability and unobservability issues when only acceleration measurements are used. In an effort to overcome these shortcomings, Naets et al. [16] proposed the use of artificial displacement measurements to stabilize systems with only acceleration measurements. This algorithm, known as the AKF with dummy measurements (AKF-dm), is similar to the one proposed by Chatzi and Fuggini [17] for alleviating the low-frequency drifts in displacement estimates needed for structural monitoring purposes. Azam et al. [18] proposed a dual Kalman filter (DKF) for state and input estimation using only acceleration measurements. Unlike the AKF formulation, where the input and states are jointly predicted and updated, the DKF follows a successive structure of implementation: the input prediction and update are followed by the state prediction and update. Due to the separation of the estimation process into two successive steps, the DKF typically overcomes the issue of unobservability. Maes et al. [19] proposed a modified AKF which directly accounts for the correlation between the process noise and the measurement noise. An attractive feature of the above Kalman-filtering-based joint input-state estimation methods lies in their amenability to online implementation. That said, their accuracy in force reconstruction critically depends on prior tuning of the covariance matrices associated with the unknown inputs. For example, in the AKF and DKF algorithms, the covariance matrix of the unknown input must be tuned, while in the AKF-dm, a covariance matrix for the dummy displacements must be tuned in addition to the input covariance. In the absence of expert knowledge of the covariance matrices, a trial-and-error approach or a heuristic rule of thumb is often adopted to tune the input covariance matrices prior to implementation.
This may not be possible when there is very little prior knowledge about the input and/or the states of the system. The tacit approach followed in most cases is to have an offline tuning phase, where some response data collected over a certain time duration is used to obtain optimal values of the covariance matrices, prior to the online estimation phase with Kalman filtering. In 2009, Alvarez et al. [20] proposed a Bayesian learning technique termed the latent force model (LFM), which allows flexible modelling of unknown inputs (or latent forces) in a system. The unknown or latent forces of a system are modelled using a nonparametric Gaussian process (GP) model [21] with some chosen covariance functions. The choice of covariance function provides flexibility to incorporate generic prior knowledge about the behavior of the inputs (e.g., random, periodic, smoothly varying). A Gaussian process based LFM, GPLFM in short, is then constructed by incorporating the physics-based mathematical model of the system (e.g., the differential equations of the system) within the GP covariance functions to infer the inputs to the system. The behavior of the latent forces depends on the parameters (also called hyperparameters) of the chosen GP covariance functions, which constitute the tunable parameters in this model. These hyperparameters can be set based on expert knowledge, or they can be tuned by optimization using observed data. Hartikainen and Särkkä [22] and, more recently, Särkkä et al. [23] advocated a state-space approach to GPLFM inference wherein all information about the system and the GP force model is fully captured in an augmented state-space representation, thus making the inference amenable to Bayesian filtering and smoothing methods. Once the hyperparameters of the GPLFM are computed, the GPLFM can be adapted for online implementation with a Kalman filter.
This paper introduces the GPLFM for joint input and state estimation of linear time-invariant structural systems. To the authors' knowledge, this is the first application of the GPLFM in the domain of structural dynamics. Additionally, the following are the key contributions of this paper:

Unlike the AKF, the GPLFM formulation is analytically proven to be observable and is shown to be robust against drift in force estimation even when used with only acceleration measurements.

A maximum likelihood optimization based on the observed data is used to compute the optimal hyperparameters of the GPLFM, which eliminates the need for tuning the parameters heuristically or through trial-and-error.
An in-depth numerical study is also presented comparing the performance of the proposed GPLFM algorithm with the existing AKF, AKF-dm, and DKF algorithms.
The paper is organized as follows. Section 2 presents the mathematical model of the structure and outlines the objective of this study. Next, a concise background on the GPLFM and its use in regression is provided in Section 3. In Section 4, the GPLFM is formulated as a linear state-space model, and a procedure for joint posterior inference on the inputs and states is discussed. Section 4.5 presents the GPLFM as a generalization of the augmented state-space models used in joint input-state estimation and provides proofs of observability and stability of the GPLFM. Section 6 encompasses numerical studies using a 10-dof shear building, illustrating the performance of the GPLFM under different excitation and measurement scenarios. The paper concludes with an application of the proposed method to a 76-storey ASCE benchmark office tower for joint input and state estimation.
2 Problem Formulation
2.1 Mathematical model of structural system
The equation of motion of an $n$-degrees-of-freedom (dof) linear structural system under forced excitation, including ground motions, can be represented by the following second-order ordinary differential equation (obtained after discretization in space):

(1)  $M\ddot{\mathbf{u}}(t) + C\dot{\mathbf{u}}(t) + K\mathbf{u}(t) = S_p\,\mathbf{p}(t) - M S_g\,\ddot{u}_g(t)$

where $\mathbf{u}(t) \in \mathbb{R}^{n}$ is the vector of displacements at the $n$ degrees of freedom, and $M$, $C$ and $K \in \mathbb{R}^{n \times n}$ represent the mass, damping, and stiffness matrices of the structural system, respectively. The external forces acting on the structure are represented by a combination of the external loads acting on the structure and the forces generated due to earthquake ground motion. The first term $S_p\,\mathbf{p}(t)$ denotes the product of a load influence matrix $S_p$ and the vector $\mathbf{p}(t)$ representing the external load time histories, while the second term $-M S_g\,\ddot{u}_g(t)$ represents the forces generated in the structure due to the earthquake ground motion $\ddot{u}_g(t)$, where $S_g$ is the matrix of influence coefficients which specifies the dofs affected by the ground motion. In cases where the size of the structural model is very large, it is common to use modal reduced-order models.
Defining a state vector $\mathbf{x}(t)$ and a force vector $\mathbf{f}(t)$ as

(2)  $\mathbf{x}(t) = \begin{bmatrix} \mathbf{u}(t) \\ \dot{\mathbf{u}}(t) \end{bmatrix}, \qquad \mathbf{f}(t) = \begin{bmatrix} \mathbf{p}(t) \\ \ddot{u}_g(t) \end{bmatrix}$

where $\mathbf{x}(t) \in \mathbb{R}^{2n}$ and $\mathbf{f}(t) \in \mathbb{R}^{m}$, the set of second-order differential equations in Equation 1 can be converted to a set of first-order differential equations called the continuous-time state-space equation
(3)  $\dot{\mathbf{x}}(t) = A_c\,\mathbf{x}(t) + B_c\,\mathbf{f}(t)$

where the continuous-time system matrices $A_c$ and $B_c$ are defined as

(4)  $A_c = \begin{bmatrix} 0 & I \\ -M^{-1}K & -M^{-1}C \end{bmatrix}, \qquad B_c = \begin{bmatrix} 0 & 0 \\ M^{-1}S_p & -S_g \end{bmatrix}$

where $I$ is the identity matrix of appropriate dimension. Consider next the measurement equation, where it is assumed that a combination of displacements, velocities and accelerations can be measured. Hence the output vector $\mathbf{y}(t)$, containing the measured quantities, assumes the following form

(5)  $\mathbf{y}(t) = S_d\,\mathbf{u}(t) + S_v\,\dot{\mathbf{u}}(t) + S_a\,\ddot{\mathbf{u}}(t)$

where $S_d$, $S_v$ and $S_a$ are the selection matrices for displacements, velocities and accelerations, respectively. Using Equation 1 and the defined state and force vectors, Equation 5 can be written in the state-space form as

(6)  $\mathbf{y}(t) = C_c\,\mathbf{x}(t) + D_c\,\mathbf{f}(t)$

where the output influence matrix $C_c$ and direct transmission matrix $D_c$ are

(7)  $C_c = \begin{bmatrix} S_d - S_a M^{-1}K & S_v - S_a M^{-1}C \end{bmatrix}, \qquad D_c = \begin{bmatrix} S_a M^{-1}S_p & 0 \end{bmatrix}$
Note that the second column of the direct transmission matrix corresponds to the earthquake ground motion input and takes zero values, as the direct contribution of the earthquake ground motion vanishes when measuring absolute accelerations (see derivation in Appendix D).
Equations 3 and 6 form the continuous-time state-space model (SSM) for the system described by Equation 1. In practical applications, continuous-time outputs are of course not observed; instead, noisy measurements are obtained via sampling of the system responses at discrete time instants. Assuming a sampling interval of $\Delta t$, discrete time instants are defined as $t_k = k\,\Delta t$ for $k = 1, 2, \dots, N$, and a discrete measurement model can be expressed as

(8)  $\mathbf{y}_k = C_c\,\mathbf{x}(t_k) + D_c\,\mathbf{f}(t_k) + \mathbf{v}_k$

where $\mathbf{v}_k$ is the measurement noise vector. Combined together, Equations 3 and 8 represent an SSM with continuous-time dynamics and discrete-time measurements, known as the continuous-discrete state-space model (see [24], pg. 170 or [25], pg. 93)

(9)  $\dot{\mathbf{x}}(t) = A_c\,\mathbf{x}(t) + B_c\,\mathbf{f}(t), \qquad \mathbf{y}_k = C_c\,\mathbf{x}(t_k) + D_c\,\mathbf{f}(t_k) + \mathbf{v}_k$
For numerical implementation, the continuous-time state-space form in Equations 3 and 6 is converted to the discrete-time state-space form following the zero-order-hold assumption

(10)  $\mathbf{x}_{k+1} = A\,\mathbf{x}_k + B\,\mathbf{f}_k + \mathbf{w}_k, \qquad \mathbf{y}_k = C\,\mathbf{x}_k + D\,\mathbf{f}_k + \mathbf{v}_k$

where $A = e^{A_c \Delta t}$, $B = (A - I)A_c^{-1}B_c$, $C = C_c$ and $D = D_c$. The terms $\mathbf{w}_k$ and $\mathbf{v}_k$ are the process and measurement noise vectors, typically added to account for modelling errors and measurement errors, respectively. They are assumed to be zero-mean Gaussian white noise with covariances

(11)  $\mathbb{E}\left[\mathbf{w}_k \mathbf{w}_l^{T}\right] = Q\,\delta_{kl}, \qquad \mathbb{E}\left[\mathbf{v}_k \mathbf{v}_l^{T}\right] = R\,\delta_{kl}, \qquad \mathbb{E}\left[\mathbf{w}_k \mathbf{v}_l^{T}\right] = 0$

with $Q \in \mathbb{R}^{2n \times 2n}$ and $R \in \mathbb{R}^{n_y \times n_y}$.
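As a concrete illustration, the zero-order-hold conversion above can be sketched numerically. The 2-dof shear-building values, function names, and single-input load pattern below are hypothetical, chosen only to make the example self-contained:

```python
import numpy as np
from scipy.linalg import expm

def continuous_ssm(M, C, K, Sp):
    """Assemble A_c and B_c of Equations 3-4 (external loads only, no ground motion)."""
    n = M.shape[0]
    Minv = np.linalg.inv(M)
    Ac = np.block([[np.zeros((n, n)), np.eye(n)],
                   [-Minv @ K, -Minv @ C]])
    Bc = np.vstack([np.zeros((n, Sp.shape[1])), Minv @ Sp])
    return Ac, Bc

def discretize_zoh(Ac, Bc, dt):
    """Zero-order-hold discretization: A = e^{A_c dt}, B = (A - I) A_c^{-1} B_c."""
    A = expm(Ac * dt)
    # A commutes with Ac, so Ac^{-1}(A - I)Bc equals (A - I)Ac^{-1}Bc
    B = np.linalg.solve(Ac, (A - np.eye(Ac.shape[0])) @ Bc)
    return A, B

# hypothetical 2-dof shear building: unit masses, light stiffness-proportional damping
M = np.eye(2)
K = np.array([[200.0, -100.0], [-100.0, 100.0]])
C = 0.01 * K
Sp = np.array([[0.0], [1.0]])          # single load applied at the top floor
Ac, Bc = continuous_ssm(M, C, K, Sp)
A, B = discretize_zoh(Ac, Bc, dt=0.01)
```

Since the system is damped, all eigenvalues of `A` lie strictly inside the unit circle, which serves as a quick sanity check of the discretization.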
2.2 Objective
Having elaborated on the mathematical model of the structure of interest, attention is now focused on the problem of stochastic joint input and state estimation. Here, both the inputs $\mathbf{f}_k$ and the system states $\mathbf{x}_k$ are taken to be unknown sequences of Gaussian random variables, and the goal lies in inferring the joint posterior distribution of the inputs and states from a set of noisy measurements $\mathbf{y}_{1:N}$, given a priori knowledge of the structural model parameters. In pursuit of this goal, the objective of this paper is to implement a Gaussian process latent force model which improves upon existing methods by reducing the dependency on manual tuning of the covariance matrices associated with the unknown inputs, and which also provides numerical stability with respect to observability of the augmented state-space formulation when used with only acceleration measurements. The location of the inputs is assumed to be known; however, studies on input localization can be found in [26, 27, 28].
3 Background on Gaussian process latent force models
3.1 Brief overview of Gaussian processes
Gaussian processes (GPs) [29, 21, 30] are a class of stochastic processes that provide a paradigm for specifying probability distributions over functions. Gaussian processes have been widely studied and used in many different fields such as signal processing [31], geostatistics [32, 33] and inverse problems [30]. For instance, in the geostatistics literature Gaussian process regression is known as Kriging. A Gaussian process is a generalization of the Gaussian probability distribution. While a probability distribution is defined over random variables which are scalars or vectors (for multivariate distributions), a stochastic process governs the properties of functions. Consider an independent variable $\mathbf{x}$ and a function $f$ such that $f: \mathbb{R}^{D} \to \mathbb{R}$. Then a GP defined over $f(\mathbf{x})$ with mean function $m(\mathbf{x})$ and covariance function $\kappa(\mathbf{x}, \mathbf{x}'; \boldsymbol{\theta})$ is written as follows

(12)  $f(\mathbf{x}) \sim \mathcal{GP}\left(m(\mathbf{x}), \kappa(\mathbf{x}, \mathbf{x}'; \boldsymbol{\theta})\right)$

(13)  $m(\mathbf{x}) = \mathbb{E}\left[f(\mathbf{x})\right], \qquad \kappa(\mathbf{x}, \mathbf{x}') = \mathbb{E}\left[(f(\mathbf{x}) - m(\mathbf{x}))(f(\mathbf{x}') - m(\mathbf{x}'))\right]$

where $\boldsymbol{\theta}$ denotes the hyperparameters of the covariance function $\kappa$. The choice of the covariance function allows encoding any prior knowledge about $f(\mathbf{x})$ (e.g., periodicity, linearity, smoothness), and can accommodate approximation of arbitrarily complex functions [21]. The notation in Equation 12 implies that any finite collection of function values $\mathbf{f} = [f(\mathbf{x}_1), \dots, f(\mathbf{x}_N)]^{T}$ has a joint multivariate Gaussian distribution, that is, $\mathbf{f} \sim \mathcal{N}(\mathbf{m}, K)$, where $\mathbf{m}$ is the mean vector and $K$ is the covariance matrix with $K_{ij} = \kappa(\mathbf{x}_i, \mathbf{x}_j)$ for $i, j = 1, \dots, N$. If no prior information is available about the mean function, it is generally set to zero, i.e. $m(\mathbf{x}) = 0$. For the covariance function, any function that generates a positive semi-definite covariance matrix is a valid covariance function. For example, the squared exponential covariance function (written here for a scalar input) has the form

(14)  $\kappa(x, x') = \sigma^{2} \exp\left(-\frac{(x - x')^{2}}{2\ell^{2}}\right)$

where $\boldsymbol{\theta} = \{\sigma, \ell\}$ are the hyperparameters of the covariance function. The mean and covariance functions dictate how the random functions behave on average and how function values at different points in the input space vary with respect to each other. The covariance function thus encodes a correlation structure which introduces interdependencies between function values at different inputs.
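The squared exponential covariance of Equation 14 is straightforward to implement; a minimal sketch (function name and hyperparameter values are illustrative only):

```python
import numpy as np

def k_se(t1, t2, sigma=1.0, ell=0.5):
    """Squared exponential covariance, Equation 14: sigma^2 exp(-(t-t')^2/(2 ell^2))."""
    d = np.subtract.outer(np.asarray(t1, float), np.asarray(t2, float))
    return sigma**2 * np.exp(-d**2 / (2.0 * ell**2))

# covariance matrix over a grid of time points
t = np.linspace(0.0, 1.0, 50)
Kmat = k_se(t, t)
```

Any valid covariance function must yield a symmetric positive semi-definite matrix here; checking the smallest eigenvalue of `Kmat` is a common diagnostic.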
3.2 Gaussian process regression
GPs are often used to solve regression problems [34, 35, 36]. In general, regression problems are concerned with the estimation of values of a dependent variable $y$ observed at certain values of an independent variable $\mathbf{x}$, given a set of noisy measurements $\mathcal{D} = \{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$. The relation can be written as

(15)  $y = f(\mathbf{x}) + \varepsilon$

where $\varepsilon$ is noise. In other words, regression involves estimating the latent (unobserved) function $f$ that will enable prediction of $y$ at new values of $\mathbf{x}$.
One perspective on using GPs to represent the input-output relation is called the function-space view, given in Rasmussen and Williams [21]. Unlike traditional modelling approaches, which rely on fitting a parameterized mathematical form to approximate the input-output relation, a GP does not assume any explicit form of $f$; rather, a prior belief (in the form of the mean function and the covariance function) is placed on the space of all possible functions. GPs belong to the class of 'nonparametric' models because the number of parameters in the model is not fixed, but rather determined by the number of data points. Upon data collection, the posterior distribution over $f$ is updated according to Bayes' rule. All the candidate functions represented by GPs can be used to express linear or nonlinear relationships between the input $\mathbf{x}$ and the output $y$. It is to be remarked at this point that the input dimension $D$ (the dimension of the independent variable $\mathbf{x}$) is usually greater than 1; however, in this work the primary focus is on inference using time-series data, where time is the only independent input variable. Thus $D = 1$, and $\mathbf{x}$ is hereafter replaced by $t$, referred to as the time domain. Examples where the input dimension is greater than one ($D > 1$) are situations where the input variables may represent time and one-dimensional space ($D = 2$), space in two dimensions ($D = 2$), or time with two-dimensional space ($D = 3$).
Now consider the prediction of the value of $f$ at a test point $t_*$, based on a previously measured set of outputs $\mathbf{y} = [y_1, \dots, y_N]^{T}$ which are corrupted with white Gaussian noise:

(16)  $y_i = f(t_i) + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma_n^{2})$

Assume that $f(t)$ is modelled as a Gaussian process with zero mean and a covariance function $\kappa$, that is

(17)  $f(t) \sim \mathcal{GP}\left(0, \kappa(t, t'; \boldsymbol{\theta})\right)$

where $\kappa$ is the covariance function with hyperparameters $\boldsymbol{\theta}$. Under the Gaussian process assumption, the joint distribution between $\mathbf{y}$ and $f(t_*)$ is Gaussian, and can be expressed as

(18)  $\begin{bmatrix} \mathbf{y} \\ f(t_*) \end{bmatrix} \sim \mathcal{N}\left( \mathbf{0}, \begin{bmatrix} K + \sigma_n^{2} I & \mathbf{k}_* \\ \mathbf{k}_*^{T} & \kappa(t_*, t_*) \end{bmatrix} \right)$

where $\mathbf{t} = [t_1, \dots, t_N]^{T}$ is the vector consisting of $N$ time points, $\mathbf{y}$ is the vector of measured outputs, $K$ is the $N \times N$ covariance matrix comprising elements $K_{ij} = \kappa(t_i, t_j)$, and $\mathbf{k}_*$ is a vector with elements $\kappa(t_i, t_*)$.
Using the properties of multivariate Gaussian distributions, it can be shown that the posterior distribution of $f(t_*)$ given the dataset $\mathcal{D}$ and hyperparameters $\boldsymbol{\theta}$ is also Gaussian

(19)  $p\left(f(t_*) \mid \mathcal{D}, \boldsymbol{\theta}\right) = \mathcal{N}\left(\mu_*, \sigma_*^{2}\right)$

with mean and variance

(20)  $\mu_* = \mathbf{k}_*^{T}\left(K + \sigma_n^{2} I\right)^{-1} \mathbf{y}, \qquad \sigma_*^{2} = \kappa(t_*, t_*) - \mathbf{k}_*^{T}\left(K + \sigma_n^{2} I\right)^{-1} \mathbf{k}_*$
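Equations 19-20 translate directly into code. The sketch below (with hypothetical toy data and helper names) computes the posterior mean and variance at a grid of test points:

```python
import numpy as np

def k_se(t1, t2, sigma=1.0, ell=0.3):
    """Squared exponential covariance (Equation 14)."""
    d = np.subtract.outer(np.asarray(t1, float), np.asarray(t2, float))
    return sigma**2 * np.exp(-d**2 / (2.0 * ell**2))

def gp_posterior(t_train, y, t_star, noise_var=1e-4):
    """Posterior mean and variance of f(t_star), Equation 20."""
    K = k_se(t_train, t_train) + noise_var * np.eye(len(t_train))
    Ks = k_se(t_train, t_star)            # cross-covariances k_*
    Kinv_Ks = np.linalg.solve(K, Ks)
    mean = Kinv_Ks.T @ y
    var = np.diag(k_se(t_star, t_star)) - np.sum(Ks * Kinv_Ks, axis=0)
    return mean, var

# toy regression problem: noiseless samples of a sine wave
t_train = np.linspace(0.0, 1.0, 20)
y = np.sin(2.0 * np.pi * t_train)
t_star = np.linspace(0.0, 1.0, 41)
mean, var = gp_posterior(t_train, y, t_star)
```

With near-noiseless data, the posterior mean interpolates the observations and the posterior variance collapses toward zero at the training points, growing between them.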
3.3 Latent force models using Gaussian processes
Latent force models [20, 37] are a hybrid modelling approach which combines mechanistic models (i.e., models based on physical laws) with nonparametric components such as GPs. Consider the following linear continuous-time state-space equations obtained from Equations 3 and 8

(21)  $\dot{\mathbf{x}}(t) = A_c\,\mathbf{x}(t) + B_c\,\mathbf{f}(t), \qquad \mathbf{y}_k = C_c\,\mathbf{x}(t_k) + D_c\,\mathbf{f}(t_k) + \mathbf{v}_k$

with all components of the unknown input vector $\mathbf{f}(t)$ being modelled as zero-mean independent Gaussian processes

(22)  $f_j(t) \sim \mathcal{GP}\left(0, \kappa_j(t, t')\right), \qquad j = 1, \dots, m$

Here $f_j(t)$ is the $j$th component of $\mathbf{f}(t)$. The assumption of zero mean leads to no loss of generality. Note that the hyperparameters of the covariance functions have been omitted for notational compactness. The model, given by Equations 21 and 22, leads to a (linear) latent force model, in which the basic idea is to combine the mechanistic state-space model with unknown (latent) forcing functions modelled by nonparametric Gaussian processes.
The solution of the state in the differential equation in Equation 21 can be obtained as

(23)  $\mathbf{x}(t) = e^{A_c (t - t_0)}\,\mathbf{x}(t_0) + \int_{t_0}^{t} e^{A_c (t - \tau)} B_c\,\mathbf{f}(\tau)\, d\tau$

where $e^{(\cdot)}$ denotes the matrix exponential and $\mathbf{x}(t_0)$ is the initial condition of the state at time $t_0$. The state covariance matrix function can then be calculated as

(24)  $K_x(t, t') = e^{A_c (t - t_0)} P_0\, e^{A_c^{T} (t' - t_0)} + \int_{t_0}^{t} \int_{t_0}^{t'} e^{A_c (t - \tau)} B_c\, K_f(\tau, \tau')\, B_c^{T}\, e^{A_c^{T} (t' - \tau')}\, d\tau'\, d\tau$

where $P_0$ is the prior covariance matrix of $\mathbf{x}(t_0)$ and $K_f(\tau, \tau')$ is the joint covariance matrix of all latent forcing functions between time instants $\tau$ and $\tau'$

(25)  $K_f(\tau, \tau') = \operatorname{diag}\left(\kappa_1(\tau, \tau'), \dots, \kappa_m(\tau, \tau')\right)$

$K_f$ is a diagonal matrix due to the assumption of independence across all force components. Observe that the states form a multidimensional GP

(26)  $\mathbf{x}(t) \sim \mathcal{GP}\left(m_x(t), K_x(t, t')\right)$
and thus both $\mathbf{f}(t)$ and $\mathbf{x}(t)$ are GPs. Next the measurement equation is rewritten as

(27)  $\mathbf{y}_k = \mathbf{g}(t_k) + \mathbf{v}_k$

with

(28)  $\mathbf{g}(t) = C_c\,\mathbf{x}(t) + D_c\,\mathbf{f}(t)$

Since a linear combination of Gaussian processes is a Gaussian process, $\mathbf{g}(t)$ is a Gaussian process, and can be expressed as

(29)  $\mathbf{g}(t) \sim \mathcal{GP}\left(m_g(t), K_g(t, t')\right)$

where

(30a)  $m_g(t) = C_c\, m_x(t)$

(30b)  $K_g(t, t') = C_c K_x(t, t') C_c^{T} + C_c K_{xf}(t, t') D_c^{T} + D_c K_{xf}^{T}(t', t) C_c^{T} + D_c K_f(t, t') D_c^{T}$

(30c)  $K_{xf}(t, t') = \int_{t_0}^{t} e^{A_c (t - \tau)} B_c\, K_f(\tau, t')\, d\tau$
and $K_f$ and $K_x$ are defined by Equations 25 and 24, respectively. A derivation for $K_g$ can be found in Appendix C. Thus, one can replace the problem defined by Equations 21 and 22 with an equivalent Gaussian process regression problem

(31)  $\mathbf{y}_k = \mathbf{g}(t_k) + \mathbf{v}_k, \qquad \mathbf{g}(t) \sim \mathcal{GP}\left(m_g(t), K_g(t, t')\right)$

This approach, introduced by Alvarez et al. [20, 37], reduces the latent force model in Equations 21 and 22 to a GP regression problem, as expressed by Equation 31. However, the approach requires analytical computation of the covariance matrix functions, which may not always be obtained in closed form and may need to be computed via numerical integration. In the next section, a formulation to convert covariance functions to state-space form is considered for a certain class of GP covariance functions. In that formulation, numerical integration is avoided altogether and replaced with matrix exponential computations and solutions to Lyapunov equations.
4 GPLFM as stochastic SSMs for sequential inference
This section discusses how temporal GPs with a certain class of covariance functions can be reformulated as SSMs and demonstrates how they can be combined with linear LFMs for sequential posterior inference of latent inputs and states. It must be emphasized that only stationary covariance functions, which have the form $\kappa(t, t') = \kappa(t - t') = \kappa(\tau)$ where $\tau = t - t'$, are considered in this work.
4.1 State space representation of temporal GPs
Consider a scalar-valued GP $f(t)$ with stationary covariance function $\kappa(\tau)$ as follows

(32)  $f(t) \sim \mathcal{GP}\left(0, \kappa(\tau)\right)$

It has been shown [38, 22, 39, 40] that $f(t)$ can be represented as the output of a scalar linear time-invariant system driven by white noise

(33)  $\frac{d^{s} f(t)}{dt^{s}} + a_{s-1} \frac{d^{s-1} f(t)}{dt^{s-1}} + \dots + a_1 \frac{d f(t)}{dt} + a_0\, f(t) = w(t)$

where $a_0, \dots, a_{s-1}$ are known constants and $w(t)$ is a white noise process with spectral density $q$. Equation 33 can be written in state-space form

(34)  $\dot{\mathbf{z}}(t) = F\,\mathbf{z}(t) + L\, w(t), \qquad f(t) = H\,\mathbf{z}(t)$

where the state $\mathbf{z}(t)$ and the matrices $F$, $L$ and $H$ are given by:

(35)  $\mathbf{z}(t) = \begin{bmatrix} f(t) \\ \frac{df(t)}{dt} \\ \vdots \\ \frac{d^{s-1} f(t)}{dt^{s-1}} \end{bmatrix}, \quad F = \begin{bmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & 0 & 1 \\ -a_0 & -a_1 & \cdots & -a_{s-1} \end{bmatrix}, \quad L = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}, \quad H = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}$

The discrete-time form of the LTI model in Equation 34 is given by

(36)  $\mathbf{z}_{k+1} = A_f\,\mathbf{z}_k + \mathbf{w}_k, \qquad f_k = H\,\mathbf{z}_k$

where $A_f = e^{F \Delta t}$ and $\mathbf{w}_k \sim \mathcal{N}(\mathbf{0}, Q_f)$. This formulation allows GP regression problems to be solved sequentially with Kalman filtering and smoothing techniques. The iteration can be started with the initial state $\mathbf{z}_0 \sim \mathcal{N}(\mathbf{0}, P_\infty)$. Note that the model (and the corresponding covariance function) is stationary, and therefore has a steady-state solution distributed as $\mathcal{N}(\mathbf{0}, P_\infty)$, where the steady-state covariance matrix $P_\infty$ can be obtained as the solution of the continuous-time Lyapunov equation

(37)  $F P_\infty + P_\infty F^{T} + q\, L L^{T} = 0$

Since the steady state is invariant with respect to time, the initial covariance matrix is set equal to the steady-state covariance, $P_0 = P_\infty$.
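Equation 37 can be solved with standard linear-algebra routines. As a minimal sketch, consider the exponential (Ornstein-Uhlenbeck) covariance $\kappa(\tau) = \sigma^2 e^{-\lambda|\tau|}$, whose state-space form is one-dimensional with $F = [-\lambda]$, $L = [1]$, $H = [1]$ and $q = 2\lambda\sigma^2$ (a standard result; the numeric values here are illustrative):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

lam, sigma2 = 2.0, 1.5
F = np.array([[-lam]])
L = np.array([[1.0]])
q = 2.0 * lam * sigma2                      # white-noise spectral density

# Equation 37: F P + P F^T + q L L^T = 0
P_inf = solve_continuous_lyapunov(F, -q * (L @ L.T))
```

For this scalar model the steady-state variance equals the GP's signal variance, `P_inf[0, 0] == sigma2`, which provides a quick correctness check.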
Using the SSM given by Equation 34, the power spectral density of $f(t)$ can be computed as

(38)  $S_f(\omega) = H\left(i\omega I - F\right)^{-1} L\, q\, L^{T} \left(-i\omega I - F\right)^{-T} H^{T}$

The stationary covariance function for $f(t)$ is related to its spectral density through the inverse Fourier transform

(39)  $\kappa(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_f(\omega)\, e^{i\omega\tau}\, d\omega$

In terms of the state-space matrices, $\kappa(\tau)$ can be calculated using [41]

(40)  $\kappa(\tau) = H\, P_\infty\, e^{F^{T}\tau}\, H^{T}, \quad \tau \ge 0, \qquad \kappa(-\tau) = \kappa(\tau)$

where $P_\infty$ is the steady-state covariance matrix obtained from Equation 37.
Hartikainen and Särkkä [38] showed that it is possible to form $F$, $L$ and $H$ such that $f(t)$ has the desired covariance function only if the spectral density $S_f(\omega)$ has a proper rational form

(41)  $S_f(\omega) = \frac{b_0}{\omega^{2s} + c_{s-1}\,\omega^{2(s-1)} + \dots + c_1\,\omega^{2} + c_0}$

By applying spectral factorization, the spectral density can be expressed similarly to Equation 38

(42)  $S_f(\omega) = G(i\omega)\, q\, G(-i\omega)$

where $G(i\omega)$ is a stable rational transfer function, i.e., $G$ has all its poles in the left half plane. The procedure to convert a covariance function $\kappa(\tau)$ into an SSM is outlined as follows [39]:

1. The spectral density $S_f(\omega)$ is computed by taking the Fourier transform of $\kappa(\tau)$.

2. Spectral factorization is applied to express $S_f(\omega)$ in terms of a stable transfer function $G(i\omega)$ and the white-noise spectral density $q$, as in Equation 42.

3. Finally, the transfer function model $G(i\omega)$ is converted into an equivalent SSM driven by a scalar-valued white noise process with spectral density $q$.

Many GPs having stationary covariance functions can be expressed in state-space form as in Equation 36 with model matrices defined by $F$, $L$, $H$ and $q$. It should be noted that non-stationary covariance functions can also be converted into state-space forms [42]; however, the Fourier transform approach is no longer applicable.
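As a worked example of this conversion, the Matérn-3/2 covariance $\kappa(\tau) = \sigma^2(1 + \lambda|\tau|)e^{-\lambda|\tau|}$ with $\lambda = \sqrt{3}/\ell$ admits the exact two-state representation below (a standard result from [38]; the numeric values are illustrative). Equation 40 can then be used to verify that the SSM reproduces the original covariance function:

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

sigma2, ell = 2.0, 0.7
lam = np.sqrt(3.0) / ell

# companion-form matrices of Equations 34-35 for the Matern-3/2 covariance
F = np.array([[0.0, 1.0], [-lam**2, -2.0 * lam]])
L = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])
q = 4.0 * lam**3 * sigma2                   # white-noise spectral density

# steady-state covariance from Equation 37
P_inf = solve_continuous_lyapunov(F, -q * (L @ L.T))

def kappa_ss(tau):
    """Covariance recovered from the SSM via Equation 40 (tau >= 0)."""
    return (H @ expm(F * tau) @ P_inf @ H.T).item()
```

Evaluating `kappa_ss` at any lag reproduces the analytic Matérn-3/2 expression, confirming that the state-space model and the kernel are two descriptions of the same GP.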
4.2 GPLFM in joint statespace form
Consider once again the latent force model given by Equations 21 and 22, rewritten for convenience,

(43)  $\dot{\mathbf{x}}(t) = A_c\,\mathbf{x}(t) + B_c\,\mathbf{f}(t), \qquad \mathbf{y}_k = C_c\,\mathbf{x}(t_k) + D_c\,\mathbf{f}(t_k) + \mathbf{v}_k$

where each component $f_j(t)$ of the latent force vector $\mathbf{f}(t)$ is modelled as a GP. For each component, given a stationary covariance function $\kappa_j(\tau)$ with hyperparameters $\boldsymbol{\theta}_j$, a state-space representation can be constructed using the above procedure. The state-space representation for the $j$th component of $\mathbf{f}(t)$ is given by

(44)  $\dot{\mathbf{z}}_j(t) = F_j\,\mathbf{z}_j(t) + L_j\, w_j(t), \qquad f_j(t) = H_j\,\mathbf{z}_j(t)$

where $\mathbf{z}_j(t) \in \mathbb{R}^{s_j}$ is the state vector for the $j$th latent force, $w_j(t)$ is a zero-mean Gaussian process with spectral density $q_j$, and $F_j$, $L_j$, $H_j$ are the state-space matrices, with $s_j$ being the dimension of the state vector $\mathbf{z}_j(t)$, $j = 1, \dots, m$. Formally, the linear time-invariant system represented in Equation 44 is a stochastic differential equation in which the white noise $w_j(t)$ is the formal derivative of a Brownian motion, and the mathematical description is given via stochastic integrals (see e.g. [24] for an account of this). The state-space form of the LFM can be obtained by combining the component GP SSMs with the system SSM to yield the following augmented model:

(45)  $\dot{\mathbf{x}}(t) = A_c\,\mathbf{x}(t) + \sum_{j=1}^{m} \mathbf{b}_j H_j\,\mathbf{z}_j(t), \qquad \dot{\mathbf{z}}_j(t) = F_j\,\mathbf{z}_j(t) + L_j\, w_j(t), \qquad \mathbf{y}_k = C_c\,\mathbf{x}(t_k) + \sum_{j=1}^{m} \mathbf{d}_j H_j\,\mathbf{z}_j(t_k) + \mathbf{v}_k$

where $\mathbf{b}_j$ are the columns of matrix $B_c$, $\mathbf{d}_j$ are the columns of matrix $D_c$, and $\mathbf{w}(t) = [w_1(t), \dots, w_m(t)]^{T}$ is a vector-valued Gaussian process having spectral density $Q_c$. Equation 45 can be written using block matrices as follows:

(46)  $\frac{d}{dt}\begin{bmatrix} \mathbf{x}(t) \\ \mathbf{z}_1(t) \\ \vdots \\ \mathbf{z}_m(t) \end{bmatrix} = \begin{bmatrix} A_c & \mathbf{b}_1 H_1 & \cdots & \mathbf{b}_m H_m \\ 0 & F_1 & & \\ \vdots & & \ddots & \\ 0 & & & F_m \end{bmatrix} \begin{bmatrix} \mathbf{x}(t) \\ \mathbf{z}_1(t) \\ \vdots \\ \mathbf{z}_m(t) \end{bmatrix} + \begin{bmatrix} 0 & \cdots & 0 \\ L_1 & & \\ & \ddots & \\ & & L_m \end{bmatrix} \mathbf{w}(t)$

where

(47)  $F_a = \begin{bmatrix} A_c & \mathbf{b}_1 H_1 & \cdots & \mathbf{b}_m H_m \\ 0 & F_1 & & \\ \vdots & & \ddots & \\ 0 & & & F_m \end{bmatrix}, \qquad L_a = \begin{bmatrix} 0 & \cdots & 0 \\ L_1 & & \\ & \ddots & \\ & & L_m \end{bmatrix}, \qquad H_a = \begin{bmatrix} C_c & \mathbf{d}_1 H_1 & \cdots & \mathbf{d}_m H_m \end{bmatrix}$

In shorthand notation, the augmented SSM shown in Equation 46 can be represented by

(48)  $\dot{\mathbf{x}}_a(t) = F_a\,\mathbf{x}_a(t) + L_a\,\mathbf{w}(t), \qquad \mathbf{y}_k = H_a\,\mathbf{x}_a(t_k) + \mathbf{v}_k$

Here $\mathbf{x}_a(t) = [\mathbf{x}^{T}(t), \mathbf{z}_1^{T}(t), \dots, \mathbf{z}_m^{T}(t)]^{T}$ is the augmented state vector, and $\mathbf{w}(t)$ is a concatenated vector-valued Gaussian process with spectral density $Q_c$, as shown

(49)  $Q_c = \operatorname{diag}\left(q_1, \dots, q_m\right)$

and the matrices $F_a \in \mathbb{R}^{n_a \times n_a}$, $L_a \in \mathbb{R}^{n_a \times m}$ and $H_a \in \mathbb{R}^{n_y \times n_a}$, where $n_a = 2n + \sum_{j=1}^{m} s_j$.
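The block assembly of Equations 46-47 is mechanical; a sketch (function name and toy system values are hypothetical, using a 2-dof system and a single Matérn-3/2 latent force as in the earlier examples):

```python
import numpy as np
from scipy.linalg import block_diag

def assemble_gplfm(Ac, Bc, Cc, Dc, gp_blocks):
    """Build F_a, L_a, H_a of Equation 47 from the system SSM and the
    per-force GP blocks [(F_j, L_j, H_j), ...]."""
    ns, m = Ac.shape[0], len(gp_blocks)
    Fa = block_diag(Ac, *[g[0] for g in gp_blocks])
    La = np.vstack([np.zeros((ns, m)), block_diag(*[g[1] for g in gp_blocks])])
    Ha_blocks = [Cc]
    col = ns
    for j, (Fj, Lj, Hj) in enumerate(gp_blocks):
        sj = Fj.shape[0]
        Fa[:ns, col:col+sj] = np.outer(Bc[:, j], Hj)   # b_j H_j coupling block
        Ha_blocks.append(np.outer(Dc[:, j], Hj))       # d_j H_j measurement block
        col += sj
    return Fa, La, np.hstack(Ha_blocks)

# toy 2-dof system (4 states), one latent force entering at dof 2
Ac = np.block([[np.zeros((2, 2)), np.eye(2)],
               [-np.array([[200.0, -100.0], [-100.0, 100.0]]), -0.5 * np.eye(2)]])
Bc = np.array([[0.0], [0.0], [0.0], [1.0]])
Cc = np.eye(4)                           # all states measured (toy choice)
Dc = np.zeros((4, 1))
lam = 2.0                                # Matern-3/2 GP block
gp = [(np.array([[0.0, 1.0], [-lam**2, -2.0 * lam]]),
       np.array([[0.0], [1.0]]),
       np.array([[1.0, 0.0]]))]
Fa, La, Ha = assemble_gplfm(Ac, Bc, Cc, Dc, gp)
```

The force states only drive the system (never the reverse), which is visible in the zero lower-left block of `Fa`.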
4.3 Joint posterior inference of inputs and states
The discrete-time form of the augmented SSM (Equation 48) can be obtained as

(50)  $\mathbf{x}_{a,k+1} = A_a\,\mathbf{x}_{a,k} + \mathbf{w}_{a,k}, \qquad \mathbf{y}_k = H_a\,\mathbf{x}_{a,k} + \mathbf{v}_k$

where the discrete state-transition matrix is $A_a = e^{F_a \Delta t}$. Here $\mathbf{w}_{a,k}$ is a zero-mean Gaussian white noise vector representing the discrete-time form of $L_a\,\mathbf{w}(t)$, whose covariance is given by

(51)  $Q_a = \int_{0}^{\Delta t} e^{F_a \tau}\, L_a\, Q_c\, L_a^{T}\, e^{F_a^{T} \tau}\, d\tau$

where $e^{F_a \tau}$ is the matrix exponential of the state-transition matrix. The integral in Equation 51 is solved using matrix fraction decomposition (see [43] for implementation details). Zero-mean Gaussian white noise with covariance $Q$ can be added to the process model in Equation 50 to account for unmodelled dynamics. The resulting modified form of Equation 50 can be written as

(52)  $\mathbf{x}_{a,k+1} = A_a\,\mathbf{x}_{a,k} + \tilde{\mathbf{w}}_{a,k}$

where $\tilde{\mathbf{w}}_{a,k}$ is the modified white noise vector with modified covariance $\tilde{Q}_a$ as defined below:

(53)  $\tilde{Q}_a = Q_a + \begin{bmatrix} Q & 0 \\ 0 & 0 \end{bmatrix}$

$\tilde{\mathbf{w}}_{a,k}$ and $\mathbf{v}_k$ are assumed to be uncorrelated with each other, and their joint covariance is expressed through Equation 11.
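The integral in Equation 51 can also be evaluated with Van Loan's matrix-exponential method, a common alternative to the matrix fraction decomposition cited above. A sketch, reusing the scalar Ornstein-Uhlenbeck model, for which the integral has the closed form $\frac{q}{2\lambda}\left(1 - e^{-2\lambda\,\Delta t}\right)$:

```python
import numpy as np
from scipy.linalg import expm

def discrete_process_cov(F, L, Qc, dt):
    """Evaluate Q_a of Equation 51 via Van Loan's method."""
    n = F.shape[0]
    M = np.block([[-F, L @ Qc @ L.T],
                  [np.zeros((n, n)), F.T]])
    Phi = expm(M * dt)
    # lower-right block of Phi is e^{F^T dt}; Q_a = e^{F dt} @ upper-right block
    return Phi[n:, n:].T @ Phi[:n, n:]

lam, q, dt = 2.0, 3.0, 0.1
Qa = discrete_process_cov(np.array([[-lam]]), np.array([[1.0]]),
                          np.array([[q]]), dt)
```

For the scalar case the result matches the closed-form expression exactly, which makes this a convenient unit test for any implementation of Equation 51.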
The posterior distribution of the states and the forces can be estimated with the classical Kalman filter (and smoother). Here, $\boldsymbol{\theta}$ is a vector comprising all covariance function hyperparameters, i.e. $\boldsymbol{\theta} = \{\boldsymbol{\theta}_1, \dots, \boldsymbol{\theta}_m\}$. The estimation is started from a Gaussian prior $\mathbf{x}_{a,0} \sim \mathcal{N}(\mathbf{0}, P_{a,0})$, where the covariance matrix has the following block-diagonal form

(54)  $P_{a,0} = \operatorname{diag}\left(P_0^{x}, P_\infty^{1}, \dots, P_\infty^{m}\right)$

$P_0^{x}$ is the initial covariance matrix for the non-augmented state vector, chosen according to some prior knowledge. The covariance matrix for the $j$th latent force is set equal to the steady-state covariance matrix $P_\infty^{j}$ obtained using Equation 37, $j = 1, \dots, m$.
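For reference, one predict/update cycle of the classical Kalman filter on the discrete augmented model can be sketched as follows (variable names are illustrative; the innovation `e` and its covariance `S` are the quantities reused for the likelihood evaluation of Section 4.4):

```python
import numpy as np

def kalman_step(x, P, y, A, H, Q, R):
    """One predict/update cycle on x_{a,k+1} = A x_{a,k} + w_k, y_k = H x_{a,k} + v_k."""
    # predict
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # update
    e = y - H @ x_pred                       # innovation
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ e
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new, e, S
```

Iterating `kalman_step` over the measurement sequence and storing each `(e, S)` pair yields both the state estimates and everything needed to evaluate the likelihood used for hyperparameter optimization.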
4.4 Hyperparameter optimization for GP covariance functions
The structural model parameter matrices and those of the covariance functions corresponding to the latent forces are transformed into parameters of the augmented GPLFM state-space model. As already stated in Section 2.2, the structural system parameters are assumed to be known a priori (either from an FE model or from system identification), and therefore the augmented state estimation results will depend only on the parameters of the chosen covariance functions for the latent force components. In general, a parametric family of covariance functions is chosen and the hyperparameters are optimized based on the measurement data. Typical hyperparameters include the length-scale and signal variance for a standard family of covariance functions. The hyperparameters can be estimated in different ways, including maximization of the marginal likelihood [21], maximum a posteriori estimation [44], and Markov chain Monte Carlo methods [45]. In this study, the optimized hyperparameters are obtained by maximizing the likelihood function based on the measurements. Maximum likelihood estimates of the hyperparameters (i.e. signal variance and length-scale) of the covariance function(s) can be obtained by minimizing the negative log-likelihood (or maximizing the log-likelihood) of the measurements as follows:

(55)  $\boldsymbol{\theta}^{*} = \arg\min_{\boldsymbol{\theta}}\, \left[-\log p\left(\mathbf{y}_{1:N} \mid \boldsymbol{\theta}\right)\right] = \arg\min_{\boldsymbol{\theta}}\, \left[-\sum_{k=1}^{N} \log p\left(\mathbf{y}_k \mid \mathbf{y}_{1:k-1}, \boldsymbol{\theta}\right)\right]$

Using Kalman filter recursions (see Appendix B), the predictive distributions can be obtained as $p(\mathbf{y}_1 \mid \boldsymbol{\theta}) = \mathcal{N}\left(\mathbf{y}_1;\, H_a \hat{\mathbf{x}}_{a,1|0},\, S_1\right)$ and $p(\mathbf{y}_k \mid \mathbf{y}_{1:k-1}, \boldsymbol{\theta}) = \mathcal{N}\left(\mathbf{y}_k;\, H_a \hat{\mathbf{x}}_{a,k|k-1},\, S_k\right)$, where $\hat{\mathbf{x}}_{a,1|0}$ is the estimate of the initial augmented state vector and $\hat{\mathbf{x}}_{a,k|k-1}$ is the $k$th predicted augmented state vector. The minimization of the negative log-likelihood of the measurements can then be expressed in terms of the innovations $\mathbf{e}_k = \mathbf{y}_k - H_a \hat{\mathbf{x}}_{a,k|k-1}$ and the innovation covariances $S_k$ (see [3] for more details) as:

(56)  $\boldsymbol{\theta}^{*} = \arg\min_{\boldsymbol{\theta}}\, \frac{1}{2} \sum_{k=1}^{N} \left[\log\left|2\pi S_k\right| + \mathbf{e}_k^{T} S_k^{-1} \mathbf{e}_k\right]$

The expressions for $\hat{\mathbf{x}}_{a,k|k-1}$ and $S_k$ are provided in Equations 77 and 78, respectively. The minimization can be done using optimization tools such as MATLAB's built-in functions fminunc or fmincon. It is noteworthy that maximum likelihood optimization may get stuck in a local minimum; to avoid this, one may need to start the optimization from different initial points. The steps involved in the proposed algorithm are shown in Algorithm 1.
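To make the procedure concrete, the sketch below fits the hyperparameters of a scalar Ornstein-Uhlenbeck latent force observed directly in noise, a deliberately stripped-down stand-in for the full augmented model (all names and numeric values are illustrative; SciPy's `minimize` plays the role of MATLAB's fminunc):

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_lik(log_theta, y, dt, r):
    """Equation 56 evaluated by a scalar Kalman filter for an OU latent force."""
    lam, sigma2 = np.exp(log_theta)            # positivity via log-parameterization
    A = np.exp(-lam * dt)
    Qd = sigma2 * (1.0 - A * A)                # exact discrete OU process covariance
    x, P, nll = 0.0, sigma2, 0.0
    for yk in y:
        x, P = A * x, A * P * A + Qd           # predict
        e, S = yk - x, P + r                   # innovation and its covariance
        nll += 0.5 * (np.log(2.0 * np.pi * S) + e * e / S)
        K = P / S                              # update
        x, P = x + K * e, (1.0 - K) * P
    return nll

# simulate a noisy OU trajectory, then recover the hyperparameters by ML
rng = np.random.default_rng(0)
dt, r, lam_true, sig2_true = 0.1, 0.05, 1.0, 2.0
A = np.exp(-lam_true * dt)
x, y = 0.0, []
for _ in range(500):
    x = A * x + rng.normal(0.0, np.sqrt(sig2_true * (1.0 - A * A)))
    y.append(x + rng.normal(0.0, np.sqrt(r)))
y = np.array(y)
theta0 = np.log([0.3, 0.5])                    # deliberately poor starting point
res = minimize(neg_log_lik, theta0, args=(y, dt, r))
```

Restarting from several values of `theta0` guards against the local minima mentioned above.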