Structural health monitoring (SHM) is a complex field of study that makes use of statistics and robust optimisation to enable damage detection, localization and estimation through the use of sensing systems. The potential for life-safety and economic benefits has motivated the need for SHM research, facilitating a shift from time-based to condition-based maintenance.
Traditionally, vibration-based techniques have been used extensively in SHM for damage identification. The time-based response of a structure can be measured by sensors such as accelerometers or strain gauges. Traditional SHM approaches adopt a numerical model and a physical model of the structure, and attempt to interpret any differences between the measured data and the data generated by the model as damage. However, a numerical model is not always available in practice and does not always correctly capture the exact behaviour of the real structure. By using statistical and data-driven approaches it is possible to learn a model with confidence bounds from measured data, leading to a more flexible approach to SHM damage identification [1, 3].
The key underlying problem in SHM is identifying damage, which Rytter classifies into four different levels of complexity: damage detection, damage localization, damage severity assessment and failure prediction (remaining life estimation). Typically, level 4 requires domain knowledge of the characteristics of the system and its damage progression. Machine learning methods are typically sufficient to address levels 1 to 3, with level 1 traditionally solved using an unsupervised learning scheme, while levels 2 and 3 usually necessitate a supervised learning approach [3, 5, 6]. Recently, however, there have been attempts to approach level 1 in a supervised manner through the generation of artificial negative data, in order to give more robust bounds for previous unsupervised learning approaches.
SHM therefore benefits many industries, and a recent trend in the aviation industry has seen a significant push for the implementation of machine learning techniques in manufacturing, operations and customer satisfaction. These techniques present the potential for predictive maintenance, prognostic component monitoring and aircraft health monitoring, all of which could dramatically reduce the cost of delays and unexpected events to the aviation industry. Accompanying this financial motive, these techniques present the opportunity for an industry-wide increase in aircraft operational safety.
Many major stakeholders in commercial aircraft manufacturing now offer a data-based health monitoring system as part of the support package for many components. Airframe manufacturers have extended the scope of their round-the-clock Aircraft on Ground (AOG) support desks to include the function of a health monitoring system. On their new generation aircraft (A350 and A380), Airbus offer Aircraft Real Time Health Monitoring as part of their support service. This involves anomaly detection algorithms, classification algorithms and prognostic trend prediction algorithms to monitor the health of an aircraft before departure, in flight and post arrival. This service is being developed into a new product marketed as Skywise, in which the algorithms are expanded to operate more autonomously and over a larger scope of the aircraft’s operation. A similar all-encompassing service is offered by Boeing on their new generation aircraft. Supplemental services specific to their components are also now offered by the larger suppliers. Engine manufacturers focus particularly on prognostic algorithms, which aim to predict the remaining useful life of components that operate in the extreme environment of the power plant. These services are being developed across all of the major engine manufacturers, including Rolls Royce, Pratt and Whitney and General Electric [11, 12, 13]. It is important to note that the majority of the functions of these systems are still maturing, and they will only become robust once a critical amount of operational data has been provided to them. The demand for training data is being met with attempts by the manufacturers to enter non-disclosure agreements with the airlines, allowing companies like Airbus and Boeing access to airline operational data. However, some resistance to this has been experienced from the airlines, who have concerns regarding privacy and loss of competitive edge.
Although the methods required for addressing SHM problems are well understood, the practical considerations behind SHM are often neglected. In particular, it is often required to pool information from multiple sensors in a robust manner, in order to maximise the ability to detect damage. One such method for sensor fusion is tensor analysis, which has been successfully applied for feature extraction and data fusion in many application domains including, but not limited to, chemistry, neuroscience, social network analysis and computer vision [15, 16]. Prada et al. have previously used a three-way analysis of SHM data for damage detection and feature selection. However, that work addressed only damage detection, not damage localization and estimation. Khoa et al. proposed the use of a tensor analysis approach for damage identification in SHM based on the CANDECOMP/PARAFAC (CP) decomposition, named after CANonical DECOMPosition (CANDECOMP) and PARAllel FACtors (PARAFAC) analysis. The CP decomposition has been shown to achieve fast convergence through the alternating least squares algorithm, and is the tensor decomposition approach that will be used in this paper.
2 Background Theory
2.1 Support Vector Machines and Anomaly Detection
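Consider a set of training data enumerated as follows:
$$(x_1, y_1), \ldots, (x_n, y_n) \in \mathcal{X} \times \{-1, +1\},$$
where each $x_i$ is from a non-empty set $\mathcal{X}$ known as the domain, and the $y_i$ are known as the target variables, which can take values of either $-1$ or $+1$. In binary classification the aim is to learn a decision function, $f : \mathcal{X} \to \{-1, +1\}$. However, in the case of anomaly detection, which is the typical setting in SHM, we usually only have access to the $x_i$ alone, and so the problem is often repurposed to finding an $f$ which takes the value $+1$ in a particular region of space, and $-1$ elsewhere. Mathematically we solve what is known as the one-class support vector machine (SVM) problem, described by Schölkopf et al. as follows (with $\rho$ denoting the offset of the separating hyperplane):
$$\min_{w \in F,\; \xi \in \mathbb{R}^n,\; \rho \in \mathbb{R}} \; \frac{1}{2}\|w\|^2 + \frac{1}{\nu n}\sum_{i=1}^{n} \xi_i - \rho \quad \text{subject to} \quad \langle w, \Phi(x_i) \rangle \geq \rho - \xi_i, \quad \xi_i \geq 0,$$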
where $\xi_i$ refers to the $i$-th slack variable, $\Phi : \mathcal{X} \to F$ represents a mapping of the variable $x_i$ to a different feature space, $w$ refers to the weights, $F$ is the dot product space into which $\Phi$ maps, and the hyper-parameter $\nu \in (0, 1]$ assists in characterising the solution in two ways: it is an upper bound on the fraction of allowed outliers, and a lower bound on the fraction of support vectors [20, 21]. Through this optimisation problem, the SVM approach can be considered to be a maximum-margin classifier, in that the distance from the training pairs of each class to the SVM boundary is maximised. Due to this property, the SVM boundary is known to be robust to new examples.
It is also typical for SVMs to employ a kernel, since the kernel trick allows the dot product of two training examples in the projected space to be computed efficiently through the evaluation of a kernel function $k$ which corresponds to the particular projection function $\Phi$. That is, $k(x, x') = \langle \Phi(x), \Phi(x') \rangle$. In this paper, the kernel used in all SVM calculations is the radial basis function (RBF) kernel:
$$k(x, x') = \exp\left(-\frac{\|x - x'\|^2}{2\sigma^2}\right),$$
where $\sigma$ refers to the length-scale of the RBF kernel. Due to its formulation and corresponding length-scale, the RBF kernel has a good interpretation as a similarity metric (since the RBF kernel value decreases exponentially the further apart the two points $x$ and $x'$ are, and the variable $\sigma$ controls this rate of decrease). With the one-class SVM used in this paper we aim to tune two hyper-parameters: $\nu$ and $\sigma$.
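The similarity interpretation of the RBF kernel can be checked numerically. The following is a minimal NumPy sketch, assuming the common parameterisation $\exp(-\|x - x'\|^2 / (2\sigma^2))$; the function name `rbf_kernel` and the argument name `length_scale` are illustrative choices, not taken from the paper.

```python
import numpy as np

def rbf_kernel(X, Y, length_scale):
    """Gram matrix K[i, j] = exp(-||X[i] - Y[j]||^2 / (2 * length_scale^2)).

    X has shape (m, d) and Y has shape (n, d). The parameterisation by a
    length-scale is an assumption made for this sketch.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Squared Euclidean distance between every row of X and every row of Y.
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * length_scale**2))

# Three points on a line at distances 0, 1 and 3 from the first point.
points = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
K = rbf_kernel(points, points, length_scale=1.0)
```

Each point has similarity 1 with itself, and the entries of `K` shrink as the points move apart; increasing `length_scale` slows that decay.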
2.2 Tensor Analysis for SHM Data
In SHM applications, there are usually many sensors at different locations used to measure the vibration signals over time. In this case, the incoming data can be represented as a three-way tensor with dimensions of feature $\times$ location $\times$ time, as described in Figure 1. The label Feature refers to the information extracted from the raw time-domain signals (for example, frequency-domain features). Location represents the relative sensor positioning, and Time refers to data snapshots at different time stamps.
Two typical approaches for tensor decomposition are the CANDECOMP/PARAFAC (CP) decomposition and the Tucker decomposition. However, because the ‘core tensor form’ of the Tucker decomposition is difficult to use and interpret, the CP form is used in this work.
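The CP decomposition factorizes a tensor into a sum of a finite number of rank-one tensors. In the case of a three-way tensor $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$, it may be expressed as
$$\mathcal{X} = \sum_{r=1}^{R} \lambda_r \, A_{:r} \circ B_{:r} \circ C_{:r} + \mathcal{E},$$
where $R$ is the number of latent factors, $A_{:r}$, $B_{:r}$ and $C_{:r}$ are the $r$-th columns of the component matrices $A \in \mathbb{R}^{I \times R}$, $B \in \mathbb{R}^{J \times R}$ and $C \in \mathbb{R}^{K \times R}$, and $\lambda$ is the weight vector chosen so that the columns of $A$, $B$ and $C$ are normalized to length one. The symbol ‘$\circ$’ represents a vector outer product, and $\mathcal{E}$ is a three-way tensor containing the residuals. Element-wise, this reads
$$x_{ijk} = \sum_{r=1}^{R} \lambda_r \, a_{ir} \, b_{jr} \, c_{kr} + e_{ijk}.$$
It can also be written in terms of the $k$-th frontal slice of $\mathcal{X}$:
$$X_k = A D_k B^{\top} + E_k,$$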
where $D_k$ is a diagonal matrix given by $\mathrm{diag}(\lambda \ast C_{k:})$, in which $\ast$ is the element-wise product and $C_{k:}$ means ‘the $k$-th row of matrix $C$’. In order to obtain the matrices $A$, $B$ and $C$, an algorithm known as alternating least squares (ALS) is often used. In this algorithm, the factor matrices are randomly initialised, and then each is updated in turn while the others are held fixed, as shown in Equation 2.2.
where the symbol $\odot$ refers to the Khatri-Rao product, which can be expressed as a column-wise Kronecker product.
where for clarity we define: and , resulting, , and where refers to the unfolding of tensor in mode .
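The alternating least squares procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the paper: it uses the standard unfolding convention in which the mode-1 unfolding satisfies $X_{(1)} \approx A (C \odot B)^{\top}$, initialises all three factor matrices randomly, runs a fixed number of sweeps with no convergence test, and the helper names (`unfold`, `khatri_rao`, `cp_als`, `reconstruct`) are hypothetical.

```python
import numpy as np

def unfold(X, mode):
    """Mode-n unfolding with the lower remaining indices varying fastest."""
    return np.reshape(np.moveaxis(X, mode, 0), (X.shape[mode], -1), order="F")

def khatri_rao(U, V):
    """Column-wise Kronecker product: row (u * V.shape[0] + v) is U[u] * V[v]."""
    rank = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, rank)

def cp_als(X, rank, n_iter=500, seed=0):
    """Fit a rank-`rank` CP model to a three-way tensor by ALS."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    for _ in range(n_iter):
        # Each update is a linear least-squares solve with the other two fixed.
        A = unfold(X, 0) @ khatri_rao(C, B) @ np.linalg.pinv((C.T @ C) * (B.T @ B))
        B = unfold(X, 1) @ khatri_rao(C, A) @ np.linalg.pinv((C.T @ C) * (A.T @ A))
        C = unfold(X, 2) @ khatri_rao(B, A) @ np.linalg.pinv((B.T @ B) * (A.T @ A))
    # Normalise columns to unit length and absorb the scales into the weights.
    norm_a = np.linalg.norm(A, axis=0)
    norm_b = np.linalg.norm(B, axis=0)
    norm_c = np.linalg.norm(C, axis=0)
    lam = norm_a * norm_b * norm_c
    return lam, A / norm_a, B / norm_b, C / norm_c

def reconstruct(lam, A, B, C):
    """Rebuild the tensor from its weighted rank-one components."""
    return np.einsum("r,ir,jr,kr->ijk", lam, A, B, C)

# Synthetic feature x location x time tensor of exact rank 2 (illustrative sizes).
rng = np.random.default_rng(1)
A0 = rng.standard_normal((6, 2))
B0 = rng.standard_normal((5, 2))
C0 = rng.standard_normal((20, 2))
X = np.einsum("ir,jr,kr->ijk", A0, B0, C0)

lam, A, B, C = cp_als(X, rank=2)
rel_error = np.linalg.norm(X - reconstruct(lam, A, B, C)) / np.linalg.norm(X)
```

With `rank=2` each row of `C` is a point in a 2D space, one point per time snapshot; choosing `rank=3` would give a 3D space instead.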
The second benefit of using a CP decomposition approach lies in its flexibility to be used for online learning. In particular, Zhou et al.  have proposed a technique known as onlineCP, which allows the arrival of new data to be placed in -space. In this way once the SVM has been trained on data, there is no need to re-update the underlying model in time, all that needs to be done is place each new data point into -space. This procedure can be performed by considering that,
Here Zhou et al. make the observation that clearly is a minimiser for the first row of Equation 2.2, and thus in order to minimise the second row, we require . Thus the value of may be estimated by taking the pseudo-inverse of the matrix . This results in the incremental tensor update equation shown in Equation 10, which is used in this paper to simulate the arrival of new data inputs.
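The pseudo-inverse step can be illustrated as follows. This sketch assumes that the factor matrices for the feature and location modes (called `A` and `B` here) and the weights `lam` are held fixed after training, and that a new time snapshot arrives as a feature-by-location matrix; it covers only the projection of the new snapshot, not the full onlineCP algorithm of Zhou et al., and the function names are hypothetical.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: row (u * V.shape[0] + v) is U[u] * V[v]."""
    rank = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, rank)

def project_new_slice(X_new, A, B, lam):
    """Least-squares coordinates c of a new slice, so X_new ~ A diag(lam * c) B^T."""
    # vec(A diag(lam * c) B^T) = (B kr A) (lam * c), with column-major vectorisation.
    design = khatri_rao(B, A) * lam
    x = X_new.reshape(-1, order="F")
    return np.linalg.pinv(design) @ x

# Illustrative fixed factors and a noise-free new snapshot with known coordinates.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 2))
B = rng.standard_normal((3, 2))
lam = np.array([2.0, 0.5])
c_true = np.array([0.3, -1.2])
X_new = A @ np.diag(lam * c_true) @ B.T

c_hat = project_new_slice(X_new, A, B, lam)
```

The recovered vector `c_hat` is the kind of low-dimensional point that could be passed to an already trained one-class SVM without refitting the decomposition.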
3.1 The -DoF Lagrangian Aeroservoelastic Model
The use of aeroservoelastic (ASE) models has been well documented in aeronautical literature since the 1990s. Both Noll and Baker et al. have noted that ASE models have increased in relevance at a similar rate as aircraft have increased in size, due to the inevitable interactions between aerodynamic forces and structural dynamics [24, 25]. Noll further suggests that the increasing complexity of modern systems, coupled with highly flexible and light structures, naturally necessitates ASE modelling. Typical ASE modelling splits the equations of motion into inertial, damping and stiffness matrix terms, where the stiffness and damping matrices each consist of both structural and aerodynamic components [26, 27, 28]. The wing itself is usually modelled as a flat plate with freedom in pitch and plunge (a pitch-plunge model), where the structural stiffness is modelled by restraining springs in each degree of freedom. Initially the equations of motion are expressed as a set of second-order ordinary differential equations, but they are then usually discretised into a state-space time-domain model when running simulations [26, 29]. The typical ASE equation can be expressed as in Equation 11.
where are the state variables, are the inertial, structural and aerodynamic damping, and structural and aerodynamic stiffness terms, is the chord length of the airfoil, is damping, is the velocity, and the system is driven by an external force . Usually the servo dynamics are included in these matrix equations so that the equation is not just aeroelastic, but aeroservoelastic. Alternatively, Pitt et al. show how a basic -DoF model can be expanded to include the influence of the servo. The servo (actuator) commonly used is a hydraulic piston with a connected control rod [27, 30], and as such will be used in this paper. It is assumed that the actuator is inertia-less, given that its mass is not significant relative to the order of magnitude of frequencies concerned with the wing model. The model used will be based on that developed previously by Wright et al. It is widely commented that this solution is a good approximation of low damped modes [24, 26, 27].
The ASE model developed in this paper is derived from an energy-based Lagrangian perspective. Structurally, the wing considered is 2D and rectangular, and carries a set of control surfaces, each attached to its own hydraulic piston, resulting in independent hydraulic pistons in total. The aerodynamics are implemented using Theodorsen’s strip theory with a quasi-steady aerodynamic assumption, where the quasi-steady assumption is defined as the limiting effect of oscillatory aerodynamics as the system frequency moves towards zero, consistent with the definition of Hancock et al. In this way, it is possible to capture the complexity of the state space in a much simpler form than one would otherwise obtain via pure kinetics or through FE-model coupling. We begin by considering Equations 12 and 13:
where denotes the Lagrangian, refers to kinetic energy, refers to potential energy, is the -th general co-ordinate, where , refers to the dissipative force of the system , represents the generalised force acting on the system, and the dot notation refers to the time derivative. The states which we consider for this system are the wing bending, , the wing twist , and the control surface deflections , where ,, where . That is . In this way there are degrees of freedom for this model, of which are , corresponding to a total of control surfaces. When the actuator model is introduced the vector will become of size since there is only one actuator per control surface. The actuator state variable will be represented by , and will refer to the pressure differential over the entire actuator. The -positions on the wing where the control surface begins and ends are given by a sequence of ordered numbers, , where . This is clarified in Figure 4.
As a result of this formulation, the coordinate of any point on top of the wing will be given by
so that refers to the closed interval . Here, the and variables refer to the co-ordinates of the wing as clarified in Figure 5, , refer to the flexural and control surface hinge axes, is the indicator function, refers to the numbering scheme used to segment the wing into regions as Figure 4 clarifies, and is the standard Lebesgue measure. Note that Equation 15 implies the following equation:
Now, differentiating Equation 14 with respect to time gives:
Equation 17 allows us to find the Lagrangian of the aeroservoelastic system, since , where , and is the material density of the wing, and where refers to the entire wing surface (note that the wing is approximated as a 2D surface rather than a 3D volume). In order to do so we make use of the generalised formulation for an arbitrary sequence of elements, , where , which is shown in Equation 18.
With the expression at hand we find that the kinetic energy of this -DoF system may be expressed by Equation 19.
where is the Lebesgue measure. Since the integral is univariate, and the integrand is measurable (in particular Lebesgue measurable), the measure terms simply evaluate to the length of the interval. For example, Note that the lack of absolute value stems from the ordering of , since . In addition, notice that a large part of the kinetic energy expression cancels out to zero, due to the presence of multiple indicator functions which cannot activate in unison. The system has been designed in this way due to the structural constraints developed previously (made clear in Equation 15). Also, from some of the expressions in the inertia integrals it is clear that the density and thickness can be removed from the mass differential , which reflects our assumptions of constant thickness and constant density along the wing. However, we opt to leave in where possible from here onwards to highlight the general nature of this derivation.
Although the generalised kinetic energy term has been derived, we still must derive the potential energy of the structure. However this cannot be derived from first principles, and so basic assumptions will need to be made. In particular the potential energy we consider is shown in Equation 26,
which assumes independence between all the state variables in terms of structural stiffness. We also assume that each stiffness term, , can be obtained through the classic stiffness-frequency relationship: . That is we must specify the frequency terms for each state variable (which is system dependent), in order to back-calculate the corresponding stiffness terms, given that we know the expressions as a result of calculating the kinetic energy terms.
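As a small worked illustration of this back-calculation, the sketch below uses the single-degree-of-freedom relation $k = m\,\omega^2$ with $\omega = 2\pi f$. It assumes the frequency is specified in hertz and that the inertia term for each state variable has already been read off the kinetic energy expression; the function name and the numerical values are placeholders, not parameters from the paper.

```python
import math

def stiffness_from_frequency(generalised_inertia, frequency_hz):
    """Back-calculate a stiffness from k = m * (2 * pi * f)^2.

    `generalised_inertia` is the inertia term associated with one state
    variable; `frequency_hz` is the chosen natural frequency of that state
    in hertz. Both names are assumptions made for this sketch.
    """
    omega = 2.0 * math.pi * frequency_hz  # angular frequency in rad/s
    return generalised_inertia * omega**2

# Placeholder values: an inertia of 2.0 and a natural frequency of 5 Hz.
k_example = stiffness_from_frequency(2.0, 5.0)
```

Because the stiffness scales with the square of the frequency, doubling the specified frequency quadruples the back-calculated stiffness.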
Up until this point, we have briefly outlined the derivation of the kinetic energy and structural potential energy terms for use in the Lagrangian equation. However, the model is currently only an -DoF elastic model, and not aeroelastic, so additional aerodynamic terms are required. These terms are included via a generalised -DoF 2D strip theory approach with quasi-steady aerodynamic assumptions. The strip theory equations for the incremental lift, wing moment and hinge moment are outlined in Equations 29, 28 and 27. These equations are available in the literature, but have been generalised here to consider the presence of multiple control surfaces.
Moreover the incremental work on the entire aeroelastic system can be defined as in Equation 30.
This is needed in order to obtain the final generalised aerodynamic forces for the system through: which can then be substituted into Equation 13.
In addition, Remark 1 is developed to aid in simplifying Equations 29, 28 and 27. In particular, it describes the way in which the terms can be broken down into simpler expressions. This is done to maintain the generality of the developed equations and to keep the expressions succinct. However, assuming the aforementioned ordering of points, and assuming that all the control surfaces are rectangular and lie along the hinge axis, the reader can mentally replace every instance of with .
Remark 1.
The integral of the sum of discrete control surface angles over the wing, , can be expressed succinctly as a sum of Lebesgue measures. That is,
Lastly, the servo model used in this paper is based on one previously developed by Wright et al., which is linearised and massless. The linearisation simplifies the overall dynamics, and the massless nature is an assumption stemming from the idea that the natural frequency of the piston will be extremely minor when compared to that of the entire wing. In the ASE model of this paper, there is one hydraulic actuator per control surface. Exactly how the actuator assembles onto the pitch-plunge airfoil model is shown in Figure 8.
In particular, based on Wright et al., we have the following equation for each control surface:
where is a valve flow constant, and are the return and supply pressures, is the commanded deflection angle for the control surface, is a state variable referring to the pressure differential over the entire actuator, is the cross sectional area of the piston, is the area of the feedback spring, is the rate of change of the piston oil volume, is the lever arm ratio, is the bulk modulus of the oil, and is the orthogonal distance offset from the control surface to the piston, which enables kinematic relationships between the piston and control surface to be derived. Moreover, in order to implement this model, consistent with Wright et al. we assume that . We also assume (as mentioned earlier) that the inertia of the piston is negligible in comparison with the overall wing dynamics. Details on how all the variables work together in the piston are shown in Figure 9. Thus, by combining all the previous information into a large set of linear equations we arrive at Equation 34, which is placed in Appendix A due to its size. All additional constants and parameters required for Equation 34 are available in Wright et al.
3.2 Signal Processing for the -DoF Aeroservoelastic Model
Now that the -DoF ASE system has been formalised, it is possible to use it to extract artificial sensor data for the purpose of damage detection. In particular, the dynamical system described in Equation 34 can be rearranged into a state space form, which can then be used to simulate acceleration data from a sensor. The locations and numbering of the sensors used in this study have already been outlined in Figure 4. Example acceleration readings are shown in Figure 10. The acceleration readings can be obtained anywhere on top of the wing model by differentiating Equation 17, and Gaussian white noise has been added to the readings in order to simulate the presence of atmospheric turbulence. Moreover, the magnitude of the response from sensor three is far larger than that of any other sensor, which can be explained by noting that sensor three is located on the control surface, and so has the most direct response to commanded input angles. Note that although the signals gathered are in the time domain, they are cleaned of noise and passed through a fast Fourier transform in order to transform them into frequency-domain data. This is because the feature space for this paper is in the frequency domain.
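The step from a noisy time-domain acceleration record to frequency-domain features can be sketched as below. The signal here is a synthetic sine wave with added Gaussian white noise, not output from the ASE model, and the sampling rate, tone frequency and noise level are arbitrary placeholders. The only cleaning performed in this sketch is mean removal, which is an assumption, since the specific noise-cleaning procedure is not described in the text.

```python
import numpy as np

def fft_magnitude_features(signal, sample_rate_hz):
    """Return (frequencies in Hz, one-sided magnitude spectrum) of a real signal."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()  # remove the DC offset before transforming
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate_hz)
    magnitudes = np.abs(spectrum) / signal.size
    return freqs, magnitudes

# Synthetic accelerometer record: a 12 Hz tone plus Gaussian white noise.
rng = np.random.default_rng(0)
sample_rate_hz = 200.0
t = np.arange(1000) / sample_rate_hz
accel = np.sin(2.0 * np.pi * 12.0 * t) + 0.1 * rng.standard_normal(t.size)

freqs, mags = fft_magnitude_features(accel, sample_rate_hz)
dominant_hz = freqs[np.argmax(mags)]
```

One magnitude vector of this kind per sensor and per event would populate the feature mode of the three-way tensor.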
Each of the events shown in Figure 10 corresponds to a permutation of commanded input deflections of the control surfaces. There are many possible ways to generate this input permutation in order to generate the data, and some of those used in this paper are shown in Figure 13. More specifically, Figure 13(a) shows a case where input angles were generated equi-spaced over a regular grid across all angle sizes ranging from 8 to -8. However, Figure 13(b) shows another case study where the control surfaces were made to deflect by large amounts, and the angles were selected by a Latin Hypercube Sampling (LHS) methodology. More information on the use of LHS for the selection of points for aerospace structural systems is available in the literature. Note also that emphasis is placed on the word permutation rather than combination because the ordering of input angles is asymmetric. The control surface(s) closer to the wing root will invariably give smaller magnitude responses than those closer to the wing tip, due to the effect of cantilever bending.
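For readers unfamiliar with LHS, a minimal NumPy sketch is given below. It assumes two control surfaces and the angle range of -8 to 8 quoted above, purely for illustration; the function name `latin_hypercube` and its arguments are hypothetical and do not come from the paper.

```python
import numpy as np

def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points so that each dimension has one point per stratum.

    `bounds` is a sequence of (low, high) pairs, one per dimension.
    """
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    n_dims = bounds.shape[0]
    # One uniform draw inside each of n_samples equal-width strata of [0, 1).
    unit = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_dims))) / n_samples
    # Shuffle the strata independently per dimension to decouple the coordinates.
    for dim in range(n_dims):
        unit[:, dim] = rng.permutation(unit[:, dim])
    low, high = bounds[:, 0], bounds[:, 1]
    return low + unit * (high - low)

# Ten commanded-angle pairs for two hypothetical control surfaces.
angles = latin_hypercube(10, [(-8.0, 8.0), (-8.0, 8.0)])
```

Unlike a regular grid, each row of `angles` is an irregular pair of inputs, yet every one of the ten equal-width bands of each angle range is visited exactly once.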
Lastly, although the -DoF aeroservoelastic system makes it possible to simulate healthy data, the purpose of this paper is to perform damage detection, and so it is also necessary to simulate the presence of damage. There are many places where damage may occur on the wing, but the most obvious for this system is damage in the actuator. The primary reason why a hydraulic actuation system may perform outside of its range of expected behaviour is pressure leakage. There are numerous reasons why pressure leakage may occur in a hydraulic actuation system. These include, but are not limited to, gasket head damage, shaft sealant failure, leakage from damaged hoses and faulty pumps [35, 36]. Therefore, in this paper the presence of damage has been simulated by reducing the values of the supply pressure and of two sources of internal spring stiffness in the actuator by 30% each.
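The damage scenario amounts to scaling a few actuator parameters before re-running the simulation. The sketch below shows one way this could be organised; the parameter names and numerical values are placeholders invented for illustration, and only the 30% reduction comes from the text.

```python
def apply_damage(params, damaged_keys, reduction=0.30):
    """Return a copy of `params` with the selected entries reduced by `reduction`."""
    damaged = dict(params)  # copy so the healthy parameter set stays untouched
    for key in damaged_keys:
        damaged[key] = params[key] * (1.0 - reduction)
    return damaged

# Placeholder actuator parameters (names and values are hypothetical).
healthy_params = {
    "supply_pressure": 2.0e7,
    "return_pressure": 1.0e5,
    "spring_stiffness_1": 1.0e5,
    "spring_stiffness_2": 5.0e4,
}

damaged_params = apply_damage(
    healthy_params,
    damaged_keys=["supply_pressure", "spring_stiffness_1", "spring_stiffness_2"],
)
```

Keeping the healthy and damaged parameter sets as separate dictionaries makes it straightforward to generate matched healthy and unhealthy data from the same input permutations.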
4 Results and Discussion
Here we present results demonstrating the use of the CP decomposition tensor analysis approach, and how it can be used in conjunction with ASE models for damage detection.
4.1 Comparing Dimensionality
Healthy and unhealthy data were simulated for the -DoF ASE model, and the one class SVM was trained on the healthy data. The data used for clustering was based on a tensor decomposition of the -space, and all new incoming testing data (both healthy and unhealthy) were pre-processed by using the approximation for (Equation 10).
From Figure 16(a) it would appear that there are two prominent clusters in -space. However, upon zooming in we observe that there are in fact several clusters present. Each of these clusters is strongly influenced by one of the sensors on the wing model. In particular, the large cluster towards the top of Figure 16(a) corresponds more strongly to sensor three. This is because sensor three is attached to the control surface and so will be influenced the most by commanded input angles, as opposed to the other sensors, which will only react to the commanded input angle through nonlinear interaction terms. It is important to note that the commanded input angles used to generate this data are those from Figure 13(a), and so the input has a regular grid spacing, which is not entirely representative of the random inputs that a conventional system may see. This issue will be explored in later sections.
Also, as seen from Figure 16, a lot of the unhealthy data passes over the healthy cluster, which will naturally affect the final model scores in a negative way. There will be a much higher occurrence of false positives and false negatives. However, it is possible to increase the accuracy of the score by working in a higher dimensional -space. In particular, we can increase the number of components in the decomposition from two to three, resulting in a 3D projection space. The result of this is shown in Figure 17.
We may see from Figure 17 that although the data in 2D space overlapped, and was inseparable for the one class SVM, in 3D space the data and clusters do become visibly separable, which is a result of how higher dimensional spaces tend to space-out points. The amount of overall improvement to the underlying accuracy (among other metrics) and statistics for the one class SVM model is made clear in Table 1.
As can be seen, the scores increased from 0.52 up to 0.90, hinting towards a better model to use in practice. Moreover, the confusion matrices have also changed, with the model having an order of magnitude fewer false positives and half the number of false negatives. These results imply that working in a higher dimensional space is desirable, and in general, for data which is not separable, it has been shown in a large body of kernel literature that projecting into a higher dimensional feature space does tend to make data more amenable to separation. However, care must be taken here, since we are projecting into a higher dimensional space using the tensor decomposition, and so it is possible to project up to arbitrarily large spaces, which will separate all points from one another. Moreover, it is possible to introduce artificial sources of noise in these higher dimensions. Therefore, where possible it is preferable to look for methods which keep the dimensionality of the -space as low as possible in order to avoid such problems, and also to increase interpretability, since the 2D space is much simpler for future SHM engineers to interpret and evaluate.
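For reference, the relationship between a confusion matrix and the usual binary classification scores can be sketched as follows. The counts in the example are made up for illustration and are not the values from Table 1; the choice to report accuracy, precision, recall and the $F_1$ score is an assumption about which metrics are of interest.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2.0 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
scores = binary_metrics(tp=90, fp=10, fn=10, tn=90)
```

Since the $F_1$ score is the harmonic mean of precision and recall, it drops when either false positives or false negatives increase, which is why overlap between healthy and unhealthy points in the projected space depresses it.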
4.2 Comparing Input Angle Magnitude
In the previous section, focus was placed on analysing how the dimensionality of the -space may affect the results of a one class SVM classifier. Moreover, the commanded angles were input into the system in a regular repeating grid structure. Here we explore how differences in the input angles can affect the SVM model and the -space.
The first difference here is that, instead of inputting the angles as a regular grid (as shown in Figure 13(a)), the inputs now come from a Latin Hypercube sample (LHS). This reflects a more realistic setting for the possibilities and combinations of potential angle inputs, since there is increased randomness in the inputs, which will decrease the ability of the CP decomposition to cleanly decompose the input tensor. The effect of applying an LHS to the input angles is made clear in Figure 18.
It can be seen that all the clusters which were previously observable when the inputs were ordered have now been mixed, and are now difficult to differentiate. Moreover, we notice that the unhealthy data is now extremely difficult to separate from the healthy training set, as there is a lot of overlap between the data points. This naturally increases the quantity of false positives and false negatives, and also lowers the overall $F_1$-score. This is made clear in the first column of Table 2. In addition, it can be seen that the one class SVM boundary also tries to be overly inclusive and encompass all the healthy classes simultaneously, owing to their close proximity to one another. This can make the training of SVMs difficult for use in aerospace structures, since it appears that the -space for aerospace structures tends to naturally form clusters, and it is natural for SVMs to try to encompass everything. It is in fact very difficult to train a model that focuses individually on each structure, which is a negative aspect of using one class SVMs for aerospace structural modelling. This appears to be a different phenomenon from that seen in previous literature focused on civil structures, since civil structures tend to vibrate as one collective mass, and so different accelerometers will experience similar readings [7, 18]. In the aerospace field, however, there are many active surfaces which can move independently from one another, and wings in particular experience a lot of bending and rotation during flight, especially closer to the wing tips, which results in a -space that tends to naturally cluster.
In order to make these clusters amenable to one class SVM analysis, artificial negative data was generated in accordance with Cheema et al. This is done so that the underlying one class problem can be treated equivalently as a two class problem with tighter boundaries. Cheema et al. have demonstrated this working well on civil structures by using kernel density estimates (KDE), Gaussian mixture models (GMM) and local outlier factor (LOF) analysis. However, it was reported by Cheema et al. that the KDE tends to work better due to its non-parametric nature, and so it was opted for here. It is non-parametric in that it estimates the probability density of data points via a summation of kernel functions. Equation 32 represents the generic form of an equi-weighted kernel density estimator:
$$\hat{f}(x) = \frac{1}{n h^{d}} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right).$$
In Equation 32, $\hat{f}(x)$ is the estimated density at the point $x$, $h$ is the bandwidth which acts as a spacing factor, $n$ is the total number of points in the distribution, with its division ultimately acting as a normaliser, $d$ is the dimensionality of the feature space, and $K$ is the kernel function. In this paper, a Gaussian kernel is used for negative sample generation, and its mathematical form is shown in Equation 33:
$$K(u) = \frac{1}{(2\pi)^{d/2}} \exp\left(-\frac{1}{2} u^{\top} u\right),$$
so that $h$ can now be interpreted as representing the standard deviation of the Gaussian components.
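A Gaussian kernel density estimate of this form, and one simple way of using it to pick artificial negatives, can be sketched as follows. The data are synthetic, and the selection rule used here (keep uniformly drawn candidates whose estimated density falls below the 5% quantile of the density at the healthy points) is an assumption made for illustration, not necessarily the procedure of Cheema et al.; all function and variable names are hypothetical.

```python
import numpy as np

def gaussian_kde_density(query, data, bandwidth):
    """Equi-weighted Gaussian KDE evaluated at `query`.

    `query` has shape (m, d) and `data` has shape (n, d); the result has
    shape (m,) and equals (1 / (n h^d)) * sum_i K((x - x_i) / h).
    """
    query = np.asarray(query, dtype=float)
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    scaled = (query[:, None, :] - data[None, :, :]) / bandwidth
    kernel_vals = np.exp(-0.5 * np.sum(scaled**2, axis=2)) / (2.0 * np.pi) ** (d / 2.0)
    return kernel_vals.sum(axis=1) / (n * bandwidth**d)

# Synthetic 2D "healthy" cluster and uniformly scattered candidate points.
rng = np.random.default_rng(0)
healthy = rng.standard_normal((200, 2))
candidates = rng.uniform(-6.0, 6.0, size=(500, 2))

bandwidth = 0.5
threshold = np.quantile(gaussian_kde_density(healthy, healthy, bandwidth), 0.05)
candidate_density = gaussian_kde_density(candidates, healthy, bandwidth)
artificial_negatives = candidates[candidate_density < threshold]
```

A smaller `bandwidth` sharpens the density peaks and tightens the region treated as healthy, while a larger one smooths the estimate and lets nearby clusters merge.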
The results of using KDE to estimate a probability density over the training data in order to generate artificial outliers are shown in Figure 21.
From Figure 21(a) it can be seen that KDE has successfully generated artificial negative training data in -space, and it has done so in between the boundaries of the two clusters (where the more separate, larger cluster correlates strongly with sensor three, and every other sensor correlates strongly with the smaller, denser cluster). However, there are still many shortcomings in applying this fix to this problem, which can all be seen by observing Figure 21(b). Firstly, there is still an overlap of density between the two clusters. Even though they visually appear separate, choosing the optimal length-scale for the Gaussian kernel in the KDE is difficult: smaller length scales will give highly overfit boundaries, as the density peaks will be sharper, meaning that the KDE estimate will lose a lot of smoothness, which will in turn affect the boundary around the sparse points towards the outside of the cluster associated with sensor three. For slightly larger length scales, however, the two clusters will begin to overlap, meaning that the boundary will still encompass the two clusters. Thus in this case, because of the closeness of the two clusters and the sparsity of the cluster associated with sensor three, it becomes a difficult balancing act to find a useful density fit. As an alternative to KDE, a GMM could be used; however, looking once again at the density map of Figure 21(b), the non-parametric densities exhibit behaviour which is not completely Gaussian. The smaller cluster appears to have a fatter tail, and the larger cluster follows a strange, non-Gaussian shape. Thus it would appear that adding randomness to the input data to simulate a more realistic scenario makes the underlying classification much more difficult.
However, although the intrinsic clusters between the individual sensors have been lost (apart from sensor three against the other four sensors), it appears that we are able to recover more structure if only the large angles are considered as input. That is, we instead restrict the LHS samples to large-magnitude angles. This corresponds to the inputs shown previously in Figure (b). We can immediately see from Figure 24 that implicitly generated clusters once again exist, which aids both interpretability and model building. It appears that CP decomposition tends to mix the behaviour of clusters together far more if only small-magnitude input angles are considered, whereas large physical inputs allow for better separation in -space, both between the individual clusters correlated with each sensor and between sensor three and the other sensors. However, although we have gained the qualitative advantage of regrouping our data into clusters, it is still difficult for a one-class SVM approach to work globally on this structure. This can be seen in Figure (b), where the SVM still cannot work globally on a per-cluster basis. Moreover, following the previous analysis of artificial negative data generation, the closeness of the clusters will make it difficult for density-based approaches to prevent the clusters from overlapping. This is reflected in Table 2, where only a marginal increase in the scores is achieved. The purpose of the next section is to suggest that, in order to improve predictive capability, it is better to construct an individual one-class SVM for each cluster rather than a global model.
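As an illustration of restricting LHS inputs to large magnitudes, the following sketch draws a basic Latin hypercube and maps it onto an angle range bounded away from zero. The bounds `min_deg` and `max_deg`, the number of input dimensions, and the random choice of sign are all assumptions made for this example; they are not the values used in this study.

```python
import numpy as np


def latin_hypercube(n_samples, n_dims, rng):
    """Basic Latin hypercube on [0, 1): one point per stratum in every dimension."""
    # Stratum indices 0..n_samples-1, permuted independently for each dimension.
    strata = np.column_stack(
        [rng.permutation(n_samples) for _ in range(n_dims)]
    )
    # Jitter each point uniformly inside its stratum, then rescale to [0, 1).
    return (strata + rng.random((n_samples, n_dims))) / n_samples


def large_angle_samples(n_samples, n_dims, min_deg, max_deg, seed=0):
    """LHS samples whose magnitudes lie in [min_deg, max_deg] (hypothetical bounds)."""
    rng = np.random.default_rng(seed)
    unit = latin_hypercube(n_samples, n_dims, rng)
    magnitude = min_deg + unit * (max_deg - min_deg)
    # Assumption: positive and negative deflections are equally likely.
    sign = rng.choice([-1.0, 1.0], size=magnitude.shape)
    return sign * magnitude


# Example: 50 samples of two hypothetical input angles with magnitudes of 5 to 10 degrees.
angles = large_angle_samples(50, 2, min_deg=5.0, max_deg=10.0)
print(angles[:5])
```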
4.3 Considering Implicitly Generated Clusters
As the previous section has shown, by restricting the input space to large-magnitude inputs we are able to implicitly generate the clusters again. The reason for this is that low angles seem to introduce a lot of mixing effects in the CP decompositions and, physically speaking, larger angle inputs result in larger accelerometer readings, allowing the signal pre-processing steps to be more effective in differentiating between events and sensors. However, although different clusters were recovered, the one-class SVM model was unable to construct an effective boundary around all the individual clusters, thereby resulting in low metrics, such as the -scores (and naturally all other binary classification metrics).
As a result, instead of trying to construct one large, global one-class SVM classifier, multiple smaller classifiers were constructed, one around each individual cluster. In this way, the classification of new data points could be handled in any number of ways, for example by majority voting, or by a weighted average between SVM boundaries, similar to the use of weighting coefficients for GMMs. Two example SVM boundaries are shown in Figure 27, where Figure (a) refers to the cluster boundary strongly influenced by sensor three, and Figure (b) refers to the cluster boundary strongly influenced by sensor five. In both cases the boundary appears to be qualitatively better than that shown previously in Figure 24 but, as Table 3 makes clear, because of the low data count the false negatives and false positives which occur dramatically reduce the -score of sensor three when compared against sensor five. Moreover, using more data will not help fix this discrepancy, due to the physics behind sensor three's location. It lies halfway along the chord and span, and so receives the smallest overall influence from wing twisting and bending. Therefore it can struggle to differentiate damage signals from healthy signals if the inputs are on the lower end of the large magnitudes, or if the deflections of the control surfaces result in an approximate modal phase cancellation at this centre point on the wing. Sensor five, on the other hand, experiences a large, direct influence from the commanded deflection angles, resulting in its more robust boundary and overall better score.
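The per-cluster strategy can be sketched as follows, using scikit-learn's `OneClassSVM`. This is a minimal illustration on synthetic data, not the configuration used in this study: the helper names, the `nu` and `gamma` settings, and the cluster labels are assumptions. The combination rule shown is also an assumption, namely that a point is accepted as healthy if any cluster's boundary accepts it; majority voting or a weighted average of the decision values could be substituted in `predict_healthy`.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def fit_cluster_models(X, labels, nu=0.05, gamma="scale"):
    """Fit one RBF one-class SVM per cluster label."""
    return {
        k: OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(X[labels == k])
        for k in np.unique(labels)
    }


def predict_healthy(models, X_new):
    """Return +1 if any per-cluster boundary accepts the point, otherwise -1."""
    votes = np.column_stack([m.predict(X_new) for m in models.values()])
    return np.where((votes == 1).any(axis=1), 1, -1)


# Synthetic stand-in for two implicitly generated clusters (an assumption).
rng = np.random.default_rng(0)
cluster_a = rng.normal([0.0, 0.0], 0.1, size=(100, 2))
cluster_b = rng.normal([2.0, 0.0], 0.3, size=(100, 2))
X = np.vstack([cluster_a, cluster_b])
labels = np.repeat([0, 1], 100)

models = fit_cluster_models(X, labels)
X_new = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]])
print(predict_healthy(models, X_new))
```

Fitting each model only on its own cluster lets the kernel width adapt to that cluster's spread, which a single global boundary cannot do when a dense and a sparse cluster lie close together.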
Although, as Table 3 makes clear, it is possible to project sensor three into a higher-dimensional space to increase the degree of separation between points, thereby giving better scores, it is preferable to avoid this where possible, as suggested in the previous section, Comparing Dimensionality. One reason is that projecting into too high a dimension without care can introduce artificial noise (the reader is urged to consult Bro & Kiers to see whether this may happen in their context, since notions of rank are not properly understood in tensor mathematics). The other is that, for future SHM engineers, models become much simpler to interpret if systems can be kept in a 2D space, which according to this study is possible for ASE models.
The field of SHM is extremely pertinent across all engineering disciplines. SHM has been used successfully in civil engineering, and is currently being slowly adopted by the aerospace field. Many complications exist for aerospace, owing to the incredibly stringent factors of safety for aircraft and to the large difference in operating conditions between aircraft and civil structures, meaning that prior assumptions and algorithms developed in the civil SHM field need to be re-worked or re-modeled to work in aerospace. This was demonstrated by the construction of a Lagrangian -DoF ASE model, which was derived from first principles in this paper and subsequently used to study the ability of tensor and one-class SVM approaches to be used for SHM. It was found that certain assumptions in the civil SHM field do not readily apply to aerospace, namely due to the presence of active control surfaces and the possibility of larger bending and torsion modes, which result in multiple clusters for healthy and unhealthy data. However, by either projecting the clusters into a high-dimensional space, or demanding large-angle inputs and modelling each of the resulting implicitly generated clusters individually, it was found that standards from civil engineering could still be adapted to the aerospace engineering field. This demonstrates an exciting future for this field, one that promises to be very fruitful for research, and subsequently for industry.
- Farrar and Worden  Farrar, C. R., and Worden, K., “An introduction to structural health monitoring,” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, Vol. 365, No. 1851, 2007, pp. 303–315.
- Alamdari  Alamdari, M. M., “Vibration-based Structural Health Monitoring,” Ph.D. thesis, University of Technology Sydney, 2015.
- Worden and Manson  Worden, K., and Manson, G., “The application of machine learning to structural health monitoring,” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, Vol. 365, No. 1851, 2007, pp. 515–537.
- Rytter  Rytter, A., “Vibration-based inspection of civil engineering structures,” Ph.D. thesis, University of Aalborg, Denmark, 1993.
- Worden et al.  Worden, K., Manson, G., and Fieller, N., “Damage detection using outlier analysis,” Journal of Sound and Vibration, Vol. 229, No. 3, 2000, pp. 647–667.
- Chan et al.  Chan, T. H., Ni, Y.-Q., and Ko, J. M., “Neural Network Novelty Filtering for Anomaly Detection,” 2nd International Workshop on Structural Health Monitoring, edited by F. Cheng, Technomic Pub. Co., Stanford, USA, 1999, pp. 133–137.
- Cheema et al. [2016a] Cheema, P., Khoa, N. L. D., Makki Alamdari, M., Liu, W., Wang, Y., Chen, F., and Runcie, P., “On Structural Health Monitoring Using Tensor Analysis and Support Vector Machine with Artificial Negative Data,” Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, ACM, New York, NY, USA, 2016a, pp. 1813–1822.
- Lindgren  Lindgren, E., “US Air Force Research Laboratory Perspective on Structural Health Monitoring in Support of Risk Management,” PHM Society European Conference, Vol. 4, 2018.
- Airbus  Airbus, “Services By Airbus: Airbus Real Time Health Monitoring,” 2018. URL https://services.airbus.com/maintenance/e-solutions/airbus-real-time-health-monitoring/airbus-real-time-health-monitoring.
- Boeing Information Services  Boeing Information Services, “Airplane Health Management,” Boeing EDGE, Seattle, WA, United States of America, 2012.
- Rolls Royce  Rolls Royce, “Aftermarket Services: Total Care,” 2018. URL https://www.rolls-royce.com/products-and-services/civil-aerospace/aftermarket-services.aspx#section-carestore.
- Pratt & Whitney  Pratt & Whitney, “Engine Health Monitoring,” 2017. URL http://www.pw.utc.com/Engine_Health_Monitoring.
- General Electric  General Electric, “Prognostic Health Management Plus,” 2017. URL https://www.geaviation.com/bga/services/prognostic-health-management-plus.
- IATA/SOIF  IATA/SOIF, “Future of the Airline Industry 2035,” International Air Transport Association, Montreal, Canada, 2017.
- Acar and Yener  Acar, E., and Yener, B., “Unsupervised Multiway Data Analysis: A Literature Survey,” IEEE Transactions on Knowledge and Data Engineering, Vol. 21, No. 1, 2009, pp. 6–20.
- Kolda and Sun  Kolda, T. G., and Sun, J., “Scalable Tensor Decompositions for Multi-aspect Data Mining,” ICDM 2008: Proceedings of the 8th IEEE International Conference on Data Mining, 2008, pp. 363–372. doi:10.1109/ICDM.2008.89.
- Prada et al.  Prada, M. A., Toivola, J., Kullaa, J., and Hollmén, J., “Three-way analysis of structural health monitoring data,” Neurocomputing, Vol. 80, No. 0, 2012, pp. 119–128. doi:10.1016/j.neucom.2011.07.030, Special Issue on Machine Learning for Signal Processing 2010.
- Khoa et al.  Khoa, N. L. D., Zhang, B., Wang, Y., Liu, W., Chen, F., Mustapha, S., and Runcie, P., PAKDD 2015, Vietnam, May 19-22, 2015, Proceedings, Part I, Springer International Publishing, 2015, Chap. On Damage Identification in Civil Structures Using Tensor Analysis, pp. 459–471.
- Rabanser et al.  Rabanser, S., Shchur, O., and Günnemann, S., “Introduction to Tensor Decompositions and their Applications in Machine Learning,” arXiv preprint arXiv:1711.10781, 2017.
- Schölkopf et al.  Schölkopf, B., Williamson, R. C., Smola, A. J., Shawe-Taylor, J., and Platt, J. C., “Support vector method for novelty detection,” Advances in neural information processing systems, 2000, pp. 582–588.
- Scholkopf and Smola  Scholkopf, B., and Smola, A., “Support Vector Machines and Kernel Algorithms,” 2002.
- Kolda and Bader  Kolda, T. G., and Bader, B. W., “Tensor Decompositions and Applications,” SIAM Review, Vol. 51, No. 3, 2009, pp. 455–500. doi:10.1137/07070111X.
- Zhou et al.  Zhou, S., Vinh, N. X., Bailey, J., Jia, Y., and Davidson, I., “Accelerating online cp decompositions for higher order tensors,” Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2016, pp. 1375–1384.
- Baker et al.  Baker, M. L., Goggin, P., and Winther, B. A., “Aeroservoelastic Modeling, Analysis, and Design Techniques for Transport Aircraft,” Defense Technical Information Center Compilation Part Notice, BOEING, Vol. ADPOI0476, 1999, pp. 3–1:3–11.
- Noll  Noll, T., “Aeroservoelasticity,” NASA technical Memorandum 102520, Vol. NASA Langley Research Centre, Hampton, Virginia, 1990, pp. 317–318.
- Heimbaugh  Heimbaugh, R., “Flight Controls Structural Dynamics IRAD,” McDonnell Douglas Report, Vol. MDC-J2303, 1983.
- Pitt and Goodman  Pitt, D., and Goodman, C., “FAMUSS: A New Aeroservoelastic Modelling Tool,” AIAA/ASME/ASCE/AHS/ ASC 33rd Conference on Structures, Vol. AIAA-92-2395, 1992.
- Roger  Roger, K., “Airplane Math Modeling for Active Control Design,” Vol. AGARD-CP-228, 1977.
- Edwards  Edwards, J., “Unsteady Aerodynamic Modeling and Active Control,” SUDAAR 504, Vol. Stanford University, 1977.
- Wright et al.  Wright, J. R., Wong, J., Cooper, J. E., and Dimitriadis, G., “On the use of control surface excitation in flutter testing,” Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering, Vol. 217, No. 6, 2003, pp. 317–332.
- Hancock et al.  Hancock, G., Wright, J., and Simpson, A., “On the teaching of the principles of wing flexure-torsion flutter,” The Aeronautical Journal, Vol. 89, No. 888, 1985, pp. 285–305.
- Minguzzi  Minguzzi, E., “Rayleigh’s dissipation function at work,” European Journal of Physics, Vol. 36, No. 3, 2015, p. 035014.
- Wright and Cooper  Wright, J. R., and Cooper, J. E., Introduction to aircraft aeroelasticity and loads, Vol. 20, John Wiley & Sons, 2008.
- Cheema et al. [2016b] Cheema, P., Munk, D. J., Giannelis, N. F., and Vio, G. A., “Experimental validation of polynomial chaos theory on an aircraft t-tail,” 18th AIAA Non-Deterministic Approaches Conference, 2016b, p. 0953.
- Chun  Chun, D. M. K., “Investigation into the cause of pneumatic actuator failure on the HypoSurface,” Ph.D. thesis, Massachusetts Institute of Technology, 2007.
- Silva and Hammond  Silva, J., and Hammond, J. L., “Reliability, Maintainability, and Performance Issues in Hydraulic System Design,” Tech. rep., BOEING VERTOL CO PHILADELPHIA PA, 1977.
- Hofmann et al.  Hofmann, T., Schölkopf, B., and Smola, A. J., “A review of kernel methods in machine learning,” Max-Planck-Institute Technical Report, Vol. 156, 2006.
- Bro and Kiers  Bro, R., and Kiers, H. A., “A new efficient method for determining the number of components in PARAFAC models,” Journal of Chemometrics: A Journal of the Chemometrics Society, Vol. 17, No. 5, 2003, pp. 274–286.
- Bishop  Bishop, C. M., Pattern Recognition and Machine Learning, Springer, Singapore, 2013.