Multi-Objective Variational Autoencoder: an Application for Smart Infrastructure Maintenance

03/11/2020 ∙ by Ali Anaissi, et al. ∙ The University of Sydney

Multi-way data analysis has become an essential tool for capturing underlying structures in higher-order data sets, where standard two-way analysis techniques often fail to discover the hidden correlations between variables. We propose a multi-objective variational autoencoder (MVA) method for smart infrastructure damage detection and diagnosis in multi-way sensing data, based on the reconstruction probability of an autoencoder deep neural network (ADNN). Our method fuses data from multiple sensors in one ADNN, in which informative features are extracted and used for damage identification. It generates probabilistic anomaly scores to detect damage, assess its severity and further localize it via a new localization layer introduced in the ADNN. We evaluated our method on multi-way datasets in the area of structural health monitoring for damage diagnosis purposes. The data was collected from our deployed data acquisition system on a cable-stayed bridge in Western Sydney and from a laboratory-based building structure obtained from Los Alamos National Laboratory (LANL). Experimental results show that the proposed method can accurately detect structural damage. It was also able to estimate the different levels of damage severity and capture damage locations in an unsupervised manner. Compared to the state-of-the-art approaches, our proposed method shows better performance in terms of damage detection and localization.

1. Related Work

Anomaly detection methods have been employed in many application domains such as damage detection in civil structures (Anaissi et al., 2018b, 2017a, c; Khoa et al., 2018), intrusion detection in networks (Leung and Leckie, 2005; Mukkamala et al., 2002) and numerous other fields. They are mainly proposed to handle cases where only normal/positive data are available. For instance, Yin et al. (Yin et al., 2014) designed a robust one-class support vector machine (OCSVM) to eliminate the influence of outliers on the learned boundary and used it to detect damage in a simulated structure. Mahadevan and Shah (Yin et al., 2014; Mahadevan and Shah, 2009) proposed an approach for fault detection and diagnosis using OCSVM and SVM-recursive feature elimination. Further, the authors of (Yin et al., 2014) and (Mahadevan and Shah, 2009) used OCSVM to detect damage in rotating machinery, and the results showed that the performance of the proposed method is superior to the state-of-the-art methods. However, the work above focused on damage detection using two-way matrix data generated by an individual sensor, which might help in detecting damage but not in assessing its severity or localizing it.

In recent years, various data fusion methods have been used in SHM applications to deal with multi-way data (Sophian et al., 2003; Lu and Michaels, 2009; Anaissi et al., 2018d). Some of these methods performed data fusion in an unsophisticated manner by simply concatenating features obtained from different sensors (Sophian et al., 2003). However, more advanced methods including principal component analysis (PCA), neural networks and Bayesian methods have been adopted at this level (Jiang et al., 2006). In this context, Khoa et al. (Khoa et al., 2017) used advanced tensor analysis to fuse data from multiple sensors, followed by constructing an OCSVM model for damage detection. The authors were able to successfully detect and assess the severity of the damage but failed to localize it.

With the advent of deep learning methods, ADNN has attracted many researchers working in the area of anomaly detection due to its promising achievements in many domains (Hong et al., 2015; AP et al., 2014; Germain et al., 2015). An and Cho (An and Cho, 2015) proposed a variational autoencoder for anomaly detection tasks. They used a probability measure to generate the anomaly score instead of the reconstruction error. The work in (Chong and Tay, 2017) also uses autoencoders for anomaly detection in videos. The authors evaluated their method on real-world datasets and reported better performance than other state-of-the-art methods. The authors in (Yan and Yu, 2015) use deep learning methods to hierarchically learn features from the sensor measurements of exhaust gas temperatures. They then use the learned features as input to an ADNN for performing combustor anomaly detection.

In fact, there are still few works in which researchers apply ADNN methods to other data analytic tasks such as data fusion in multi-way datasets. In this study, we propose an MVA deep neural network as a data fusion method to extract damage-sensitive features from three-way measured responses and to perform damage detection based on the reconstruction probability. Further, the average distance between the anomaly scores of the corresponding sensor nodes is used as another measure to localize and assess the severity of structural damage.

2. Background

2.1. Autoencoder Deep Neural Network

An autoencoder deep neural network is an unsupervised learning model that is able to learn from one-class data. It is an extension of the deep neural network, which is basically designed for supervised learning where the class labels are given with the training examples. The core idea of an autoencoder is to force the network to learn a lower dimensional space $z$ for the input features $x$, and then try to reconstruct the original feature space as $\hat{x}$. In other words, it sets the target values to be approximately equal to its original inputs. In this sense, the main objective of autoencoders is to learn to reproduce input vectors $x$ as outputs $\hat{x}$. Figure 2 illustrates the architecture of an ADNN composed of $n_l$ layers ($n_l = 3$ for simplification). Layer $L_1$ is the input layer, which is encoded into the middle layer $L_2$ and then decoded into the output layer $L_{n_l}$. Each layer consists of a set of nodes, denoted by circles in Figure 2. The nodes in the input layer represent the input features, so their number is usually equal to the number of features in a given dataset. However, the number of nodes in the hidden layer(s) is selected by the user. In contrast to the traditional neural network, the number of nodes in the output layer is equal to the number of nodes in the input layer.

The learning process of ADNN successively computes the output of each node in the network. For a node $i$ in layer $l$ we calculate an output value $z_i^{(l)}$ obtained by computing the total weighted sum of the input values plus the bias term using the following equation:

$$z_i^{(l)} = \sum_{j} W_{ij}^{(l-1)} a_j^{(l-1)} + b_i^{(l-1)} \tag{1}$$

The parameter $W$ is the coefficient weight, written as $W_{ij}^{(l-1)}$ when associated with the connection between node $j$ in layer $l-1$ and node $i$ in layer $l$. The parameter $b_i^{(l-1)}$ is the bias term associated with node $i$ in layer $l$, and $a_j^{(l-1)}$ is the output value of node $j$ in layer $l-1$. The resultant output is then processed through an activation function denoted by $f$, giving the activation $a_i^{(l)}$ defined as follows:

$$a_i^{(l)} = f\left(z_i^{(l)}\right) \tag{2}$$

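Intuitively, in the input layer $a^{(1)} = x$, and in the output layer $a^{(n_l)} = \hat{x}$, where $n_l$ is the number of layers. The most common activation functions in the hidden layers are the sigmoid and hyperbolic tangent defined in Equations 3 and 4, respectively. However, in autoencoder settings a linear function is used in the output layer since we do not scale the output of the network to a specific interval such as $[0, 1]$ or $[-1, 1]$.

$$f(z) = \frac{1}{1 + e^{-z}} \tag{3}$$
$$f(z) = \tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} \tag{4}$$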
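The feedforward computation of Equations 1 and 2 can be illustrated with a minimal NumPy sketch. The layer sizes, the random initial weights and the function names (`sigmoid`, `forward`) are assumptions made for this example and are not taken from the paper; the sketch uses a sigmoid in the hidden layer and a linear output layer, as described above.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation (Equation 3)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Return the activations of every layer for one input vector x."""
    activations = [x]                      # a^(1) = x
    last = len(weights) - 1
    for l, (W, b) in enumerate(zip(weights, biases)):
        z = W @ activations[-1] + b        # Equation 1: weighted sum plus bias
        a = z if l == last else sigmoid(z) # Equation 2, with a linear output layer
        activations.append(a)
    return activations

# Hypothetical 6-3-6 autoencoder with small random weights.
rng = np.random.default_rng(0)
sizes = [6, 3, 6]
weights = [rng.normal(scale=0.1, size=(n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

acts = forward(rng.normal(size=6), weights, biases)
latent, reconstruction = acts[1], acts[2]
```

Here `latent` plays the role of the middle-layer output and `reconstruction` the role of the output layer; with untrained weights the reconstruction is of course not yet close to the input.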

Let us say that an autoencoder is composed of two systems known as the encoder $f_\phi$ and the decoder $g_\theta$. The encoder maps an input vector $x$ to a latent vector $z = f_\phi(x)$. Then the decoder maps $z$ back to the original input feature space, $\hat{x} = g_\theta(z)$. The autoencoder uses the backpropagation algorithm to learn the parameters $W, b$. In each iteration of the training process, we perform a feedforward pass which successively computes the output values for all layers' nodes. Once completed, we calculate the cost error using Equation 5 and then propagate it backward through the network layers.

$$E(W, b) = \frac{1}{2} \sum_{i=1}^{m} \left\lVert x^{(i)} - \hat{x}^{(i)} \right\rVert^{2} \tag{5}$$

In this setting, we perform a stochastic gradient descent step to update the learning parameters $W, b$. This is done by computing the partial derivative of the cost function (defined in Equation 5) with respect to $W$ and $b$ as follows:

$$W_{ij}^{(l)} := W_{ij}^{(l)} - \eta \, \frac{\partial E(W, b)}{\partial W_{ij}^{(l)}} \tag{6}$$

where $\eta$ is the step size. We update $b_i^{(l)}$ in the same way. The complete steps are summarized in Algorithm 1.

Input: A set of positive samples $x^{(1)}, \dots, x^{(m)}$
  1. Initialize $W, b$ to small random values

repeat
  For $i = 1$ to $m$
  1. Perform a feedforward pass to compute all nodes' activations using Equations 1 and 2.

  2. Compute the sum of reconstruction cost error using Equation 5

  3. Update $W, b$ using Equation 6

until convergence of parameters $W, b$
Output: encoder $f_\phi$, decoder $g_\theta$.
ALGORITHM 1 Autoencoder training algorithm
Figure 2. Autoencoder neural network architecture.
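As a rough illustration of Algorithm 1, the sketch below trains a single-hidden-layer autoencoder with plain stochastic gradient descent in NumPy. The synthetic data, the tanh hidden layer, the layer sizes, the step size and the function name `train_autoencoder` are all assumptions chosen for the example; the paper does not specify an implementation.

```python
import numpy as np

def train_autoencoder(X, n_hidden=3, lr=0.05, epochs=50, seed=0):
    """SGD training of a one-hidden-layer autoencoder (tanh hidden, linear output)."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W1 = rng.normal(scale=0.1, size=(n_hidden, n_features))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_features, n_hidden))
    b2 = np.zeros(n_features)
    history = []                                   # summed cost per epoch
    for _ in range(epochs):
        total = 0.0
        for x in X[rng.permutation(len(X))]:
            h = np.tanh(W1 @ x + b1)               # encoder output
            x_hat = W2 @ h + b2                    # decoder output
            err = x_hat - x
            total += 0.5 * err @ err               # per-sample squared-error cost
            grad_h = (W2.T @ err) * (1.0 - h ** 2) # backpropagated error
            W2 -= lr * np.outer(err, h)            # gradient step on each parameter
            b2 -= lr * err
            W1 -= lr * np.outer(grad_h, x)
            b1 -= lr * grad_h
        history.append(total)
    encoder = lambda x: np.tanh(W1 @ x + b1)
    decoder = lambda z: W2 @ z + b2
    return encoder, decoder, history

# Synthetic "healthy" data lying near a 2-dimensional subspace of a 6-dimensional space.
data_rng = np.random.default_rng(1)
X = 0.5 * data_rng.normal(size=(100, 2)) @ data_rng.normal(size=(2, 6))
encoder, decoder, history = train_autoencoder(X)
```

The returned `encoder` and `decoder` callables correspond to the two outputs of Algorithm 1, and `history` lets the summed reconstruction cost be monitored for convergence.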

Once the autoencoder is trained, the network will be able to reconstruct new incoming positive data, while it fails with anomalous data. This is judged based on the reconstruction error (RE), which is measured by applying the Euclidean norm to the difference between the input $x$ and the output $\hat{x}$ as shown in Equation 7.

$$RE(x) = \left\lVert x - \hat{x} \right\rVert_{2} \tag{7}$$

The measured value of RE is used as the anomaly score for a given new sample. Intuitively, examples from a distribution similar to the training data should have a low reconstruction error, whereas anomalies should have a high anomaly score. Algorithm 2 shows the process of anomaly detection based on the reconstruction error of autoencoders.

Input: A set of new arrived samples $x^{(1)}, \dots, x^{(N)}$, and threshold $\alpha$
  1. $f_\phi, g_\theta \leftarrow$ Algorithm 1

For $i = 1$ to $N$
  1. Perform a feedforward pass to compute all nodes' activations using Equations 1 and 2.

  2. Compute $RE(i)$ using Equation 7

  3. if $RE(i) > \alpha$

    1. $x^{(i)}$ is an anomaly

  4. else

    1. $x^{(i)}$ is normal

ALGORITHM 2 Autoencoder anomaly detection algorithm
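The thresholding step of Algorithm 2 can be sketched as follows. To keep the example self-contained, a fixed linear projection stands in for a trained encoder and decoder; this stand-in, the threshold value and the function names are assumptions for illustration only.

```python
import numpy as np

def reconstruction_error(x, encoder, decoder):
    """Euclidean norm of the difference between input and reconstruction (Equation 7)."""
    return float(np.linalg.norm(x - decoder(encoder(x))))

def detect(samples, encoder, decoder, threshold):
    """Label each sample by comparing its reconstruction error with the threshold."""
    return ["anomaly" if reconstruction_error(x, encoder, decoder) > threshold
            else "normal" for x in samples]

# Stand-in for a trained autoencoder: keep the first two coordinates only.
basis = np.eye(4)[:2]
encoder = lambda x: basis @ x
decoder = lambda z: basis.T @ z

samples = np.array([[1.0, 2.0, 0.0, 0.0],   # lies in the learned subspace
                    [1.0, 2.0, 3.0, 4.0]])  # has energy outside of it
labels = detect(samples, encoder, decoder, threshold=1.0)
```

The first sample is reconstructed exactly (error 0), while the second leaves a residual of norm 5, so only the second exceeds the threshold.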

3. Multi-Objective Variational Autoencoder

Figure 3. Autoencoder deep neural network architecture of MVA.

We propose a multi-objective variational autoencoder (MVA) neural network for damage detection and diagnosis based on the reconstruction probability of ADNN. Our MVA method performs multi-way data fusion by taking a frontal slice from the training data (as shown in Figure 3). Each input slice represents all feature signals across all locations at a particular time. The stochastic gradient descent algorithm is used here to learn reconstructions that are close to the original input slice. Once the network is trained, we create a sensor identity matrix in which each row captures meaningful information for one sensor location, for damage localization purposes. The values in this matrix are obtained by calculating the average total reconstruction error for each set of output nodes related to one single sensor.

Our method employs the concept of the variational autoencoder (VAE) to compute the anomaly score for each new incoming data slice based on its reconstruction probability. This measure provides a more principled and objective decision value than the reconstruction error since it considers the variability of the distribution variables, and does not require presetting a fixed threshold parameter for identifying damage. Setting a threshold for the reconstruction error is problematic, especially in the case of multi-way heterogeneous data. Moreover, the normal and anomalous data might share the same mean value. However, anomalous data will not share the same variance as the normal data, which leads to a significantly lower reconstruction probability, and is thus classified as damage. The following sections discuss the details of the proposed method.

3.1. Multi-Way Data Fusion

As we observed in this study, a large number of sensors are usually used to collect data in SHM applications, which often aim to monitor large civil structures such as a bridge or a high-rise building. The sensing data generated from networked sensors mounted on structures are considered as three-way data in the form of (location × feature × time), as previously described in Figure 1. In this setting, two-way matrix analysis is not able to capture the correlation between sensors (Acar and Yener, 2009). At the same time, unfolding the three-way data and concatenating the frequency features from multiple sensors at a certain time to form a single data instance at that time may result in information loss, since it breaks the modular structure inherent in three-way data (Acar and Yener, 2009). Accordingly, data fusion plays a critical role in analyzing structural behaviour and assessing the severity of any damage.

ADNN is mainly used for the purpose of dimensionality reduction or as an anomaly detection model. However, ADNN can also be utilized as a data fusion structure which constructs an internal representation for input data collected from multiple sources, i.e. sensors. Therefore, our MVA method utilizes the ADNN as a multi-way data fusion model which automatically learns features via its deep-layered structure.

As shown in Figure 3, the ADNN model receives data from multiple sensors at the same time by taking a frontal slice from the training three-way data. Each input slice represents all feature signals across all locations at a particular time. This data from multiple sensors is fed into the input layer to extract damage-sensitive features via the encoder layers. The resultant new features in the middle layer are then used by the decoder layers to determine the damage detection results.
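The frontal-slice input described above can be made concrete with a small NumPy sketch. The tensor sizes, the axis order (sensor, feature, time) and the helper names are assumptions for illustration; the paper does not prescribe a storage layout.

```python
import numpy as np

# Hypothetical small three-way array: sensors x features x events.
n_sensors, n_features, n_events = 4, 5, 3
tensor = np.arange(n_sensors * n_features * n_events, dtype=float).reshape(
    n_sensors, n_features, n_events)

def frontal_slice(tensor, k):
    """All features of all sensors at event (time) index k."""
    return tensor[:, :, k]

def slice_to_input(tensor, k):
    """Flatten one frontal slice into a single input vector for the network.

    Row-major flattening keeps the features of each sensor contiguous, so the
    matching output nodes can later be grouped per sensor.
    """
    return frontal_slice(tensor, k).reshape(-1)

def input_to_slice(vector, n_sensors, n_features):
    """Inverse of slice_to_input: recover the sensors x features layout."""
    return vector.reshape(n_sensors, n_features)

x = slice_to_input(tensor, 1)
```

Keeping each sensor's features in one contiguous block is a design choice of this sketch; it makes the per-sensor averaging used for localization a simple reshape.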

3.2. Probabilistic Anomaly Detection

The core idea of anomaly detection in ADNN is to see how well a new data point follows the normal examples. We mentioned before that ADNN aims to learn (encode) a lower dimensional space $z$ for the input features $x$, and then to reconstruct (decode) the original feature space $\hat{x}$. Let us denote the encoder and decoder by $q_\phi(z \mid x)$ and $p_\theta(x \mid z)$, respectively. This representation is known as the conditional probability. For example, $q_\phi(z \mid x)$ is the conditional probability of $z$ given that $x$ has happened. Intuitively, the decoding process leads to information loss because the data goes from a low dimensional space $z$ to a larger dimensional space $\hat{x}$. This loss is known as the reconstruction error, which can be measured by calculating the log-likelihood $\log p_\theta(x \mid z)$, and it will eventually be used as an anomaly score. This measure allows us to see how effectively the decoder has learned to reconstruct the input features $x$ given their latent representation $z$.

Our probabilistic anomaly detection method follows the concept of VAE to find a distribution of some latent variable $z$ which we can sample from to generate new samples $\hat{x}$ from $p_\theta(x \mid z)$. Each latent variable represents a probability distribution for a given input feature. In the decoding process, we randomly sample from this latent state distribution to generate a vector to be used as an input for the decoder model.

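Given that $x$ is a set of observed variables and $z$ is the set of latent variables, the objective function of VAE is considered as an inference problem which aims to compute the conditional distribution of the latent variables given the observations, i.e. $p_\theta(z \mid x)$. Using Bayes' theorem, we can write it as follows:

$$p_\theta(z \mid x) = \frac{p_\theta(x \mid z) \, p_\theta(z)}{p_\theta(x)} \tag{8}$$

However, calculating the evidence $p_\theta(x)$ is not practical since it requires computing a multidimensional integral over the unknown variables $z$ (Kingma and Welling, 2013). Thus, the variational inference (VI) tool is used here to perform approximate Bayesian inference of the posterior distribution $p_\theta(z \mid x)$ with a parametric family of distributions $q_\phi(z \mid x)$ in such a way that it has a tractable solution. The main idea of VI is to pose the inference problem as an optimization problem by modeling $p_\theta(z \mid x)$ using $q_\phi(z \mid x)$, where $q_\phi$ has a simple distribution such as a Gaussian.

The Kullback-Leibler (KL) divergence defined in Equation 9 is used here to measure the information loss between the two probability distributions $q_\phi(z \mid x)$ and $p_\theta(z \mid x)$. In this sense, the optimization problem is to minimize the divergence, i.e. $\min KL\left(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\right)$.

$$KL\left(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\right) = \mathbb{E}_{z}\left[\log q_\phi(z \mid x) - \log p_\theta(z \mid x)\right] \tag{9}$$

By substituting Equation 8 into Equation 9, the resultant equation is as follows:

$$KL\left(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\right) = \mathbb{E}_{z}\left[\log q_\phi(z \mid x) - \log p_\theta(x \mid z) - \log p_\theta(z) + \log p_\theta(x)\right] \tag{10}$$

where $z \sim q_\phi(z \mid x)$. Since the expectation ($\mathbb{E}$) is taken over $z$ and $\log p_\theta(x)$ does not involve $z$, we can take $\log p_\theta(x)$ out of the expectation in Equation 10 and write it as follows:

$$KL\left(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\right) - \log p_\theta(x) = \mathbb{E}_{z}\left[\log q_\phi(z \mid x) - \log p_\theta(x \mid z) - \log p_\theta(z)\right] \tag{11}$$

The final objective function of the variational autoencoder is as follows:

$$\log p_\theta(x) - KL\left(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\right) = \mathbb{E}_{z}\left[\log p_\theta(x \mid z)\right] - KL\left(q_\phi(z \mid x) \,\|\, p_\theta(z)\right) \tag{12}$$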

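The first term, i.e. $\mathbb{E}_{z}\left[\log p_\theta(x \mid z)\right]$, represents the reconstruction likelihood, and the second term, i.e. $KL\left(q_\phi(z \mid x) \,\|\, p_\theta(z)\right)$, is the regularization term which forces the approximate posterior distribution $q_\phi(z \mid x)$ to be similar to the prior distribution $p_\theta(z)$. The loss function $\mathcal{L}(\theta, \phi)$ of our autoencoder is the negative value of the objective function and is defined as:

$$\mathcal{L}(\theta, \phi) = -\mathbb{E}_{z}\left[\log p_\theta(x \mid z)\right] + KL\left(q_\phi(z \mid x) \,\|\, p_\theta(z)\right) \tag{13}$$

In variational Bayesian methods, the negative of this loss function is known as the variational lower bound or evidence lower bound (ELBO). The "lower bound" part comes from the fact that the KL divergence is always non-negative. Thus $-\mathcal{L}(\theta, \phi)$ is a lower bound of $\log p_\theta(x)$: since $KL\left(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\right) \geq 0$, we have $-\mathcal{L}(\theta, \phi) = \log p_\theta(x) - KL\left(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\right) \leq \log p_\theta(x)$. Therefore, by minimizing the loss, we are maximizing the lower bound of the probability of generating real data samples.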
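To make the two terms of the loss concrete, the following sketch evaluates them for one sample under common simplifying assumptions that the paper does not state explicitly: a diagonal Gaussian posterior, a standard normal prior (for which the KL term has a closed form) and a diagonal Gaussian decoder. The function names are hypothetical.

```python
import numpy as np

def kl_to_standard_normal(mu, sigma):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return 0.5 * np.sum(sigma ** 2 + mu ** 2 - 1.0 - 2.0 * np.log(sigma))

def gaussian_log_likelihood(x, mean, std):
    """log p(x | mean, std) for a diagonal Gaussian decoder."""
    return np.sum(-0.5 * np.log(2.0 * np.pi * std ** 2)
                  - 0.5 * ((x - mean) / std) ** 2)

def vae_loss(x, x_mean, x_std, z_mu, z_sigma):
    """Single-sample estimate of the loss: negative log-likelihood plus KL term."""
    return (-gaussian_log_likelihood(x, x_mean, x_std)
            + kl_to_standard_normal(z_mu, z_sigma))

# The KL term vanishes when the posterior equals the prior and grows with the mean shift.
kl_same = kl_to_standard_normal(np.zeros(2), np.ones(2))
kl_shifted = kl_to_standard_normal(np.array([1.0, 0.0]), np.ones(2))
```

With a unit shift of one latent mean, the KL term equals 0.5, which shows how the regularizer penalizes posteriors that drift away from the prior.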

Now we need to train the variational autoencoder to learn $q_\phi(z \mid x)$, using the gradient descent algorithm to optimize the loss with respect to the parameters $\phi$ and $\theta$. This is where the VAE relates to the autoencoder: the encoder model learns $q_\phi(z \mid x)$ by mapping $x$ to $z$, and the decoder model learns $p_\theta(x \mid z)$ by mapping $z$ back to $x$. For stochastic gradient descent with step size $\eta$, the encoder parameters are updated using Equation 6. Once $q_\phi(z \mid x)$ is learned, we sample the latent vector $z$ from $q_\phi(z \mid x)$ and then feed it into the decoder network to generate the new data $\hat{x}$. The training steps of MVA are illustrated in Algorithm 3.

Input: A set of positive samples $x^{(1)}, \dots, x^{(m)}$
  1. Initialize $\phi, \theta$ to small random values

repeat
  1. for $i = 1$ to $m$

      1. Generate samples $z$ from $q_\phi(z \mid x^{(i)})$

      2. Perform a feedforward pass to compute all nodes' activations.

  2. end for

  3. Compute the error $E$ using Equation 13

  4. Update $\phi, \theta$ using gradients of $E$

until convergence of parameters $\phi, \theta$

ALGORITHM 3 MVA training algorithm

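To get the reconstruction $\hat{x}$, we generate $L$ random samples from $z \sim \mathcal{N}(\mu_z, \sigma_z)$, where $\mu_z$ and $\sigma_z$ are the mean and standard deviation of the middle layer in ADNN, respectively. For each random sample $z^{(l)}$, we calculate $\mu_{\hat{x}}^{(l)}$ and $\sigma_{\hat{x}}^{(l)}$ for the output layer in ADNN. The final reconstruction probability (RP) can be estimated as follows:

$$RP(x) = \frac{1}{L} \sum_{l=1}^{L} p_\theta\left(x \mid \mu_{\hat{x}}^{(l)}, \sigma_{\hat{x}}^{(l)}\right) \tag{14}$$

The damage detection steps of MVA are illustrated in Algorithm 4.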

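Input: A set of new arrived samples $x^{(1)}, \dots, x^{(N)}$
  1. $\phi, \theta \leftarrow$ Algorithm 3

For $i = 1$ to $N$
  1. Generate $L$ samples from $z \sim \mathcal{N}(\mu_z, \sigma_z)$

  2. For $l = 1$ to $L$

      1. Compute $\mu_{\hat{x}}^{(l)}$ and $\sigma_{\hat{x}}^{(l)}$ for the output layer

  3. end for

  4. Compute $RP(i)$ using Equation 14

  5. if reconstruction probability(i) is above the anomaly threshold then

      1. $x^{(i)}$ is healthy

  6. else

      1. $x^{(i)}$ is damage

end for

ALGORITHM 4 MVA damage detection algorithm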
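The Monte Carlo estimate of the reconstruction probability used in Algorithm 4 can be sketched as below. The toy `encode` and `decode` functions stand in for a trained network and are purely hypothetical, as are the sample values; a diagonal Gaussian output distribution is assumed.

```python
import numpy as np

def reconstruction_probability(x, encode, decode, n_samples=100, rng=None):
    """Average likelihood of x under decoder outputs for sampled latent vectors."""
    rng = np.random.default_rng() if rng is None else rng
    mu_z, sigma_z = encode(x)
    total = 0.0
    for _ in range(n_samples):
        z = mu_z + sigma_z * rng.standard_normal(mu_z.shape)  # z ~ N(mu_z, sigma_z)
        mu_x, sigma_x = decode(z)
        # Diagonal Gaussian density of x; for high-dimensional inputs it is
        # numerically safer to work with log densities instead of a product.
        density = np.prod(np.exp(-0.5 * ((x - mu_x) / sigma_x) ** 2)
                          / (sigma_x * np.sqrt(2.0 * np.pi)))
        total += density
    return total / n_samples

# Toy stand-ins for a trained encoder and decoder (hypothetical).
encode = lambda x: (np.array([x.mean()]), np.array([0.1]))
decode = lambda z: (np.full(3, z[0]), np.full(3, 0.5))

rng = np.random.default_rng(0)
rp_healthy = reconstruction_probability(np.array([1.0, 1.0, 1.0]), encode, decode, rng=rng)
rp_damaged = reconstruction_probability(np.array([1.0, 1.0, 4.0]), encode, decode, rng=rng)
```

In this toy setting the sample that the decoder can reproduce receives a much higher reconstruction probability than the sample with one deviating component.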

3.3. Damage Localization

Once a new data slice is identified as an anomaly by the ADNN, the values from the output nodes are further propagated into another layer called the localization layer, as illustrated in Figure 3. It consists of a set of nodes, each representing one sensor data source. The purpose of this layer is to solve the problem of fault localization. The output values of this layer are obtained by calculating the average of the total reconstruction error over the output nodes related to one sensor. The resultant outputs are stored in a sensor identity matrix with one row per sensor and one column per feature of that sensor. Using this matrix, it is possible to perform a $k$-nearest neighbour search between the new output scores and each row of the matrix to locate the anomalous rows. The average distance between a new score vector and its $k$ nearest rows is used as another anomaly score for damage localization.
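One possible reading of the localization step is sketched below: each sensor's new score vector is compared with the rows of the sensor identity matrix, and the mean distance to its $k$ nearest rows becomes the location score. The matrix values, the choice of Euclidean distance, the value of $k$ and the function name are assumptions for illustration, not details confirmed by the paper.

```python
import numpy as np

def knn_location_scores(identity, new_scores, k=3):
    """Mean Euclidean distance from each new row to its k nearest identity rows."""
    diffs = new_scores[:, None, :] - identity[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)      # shape: (new rows, identity rows)
    nearest = np.sort(dists, axis=1)[:, :k]    # k smallest distances per new row
    return nearest.mean(axis=1)

# Hypothetical identity matrix for 6 sensors with 4 features each.
rng = np.random.default_rng(0)
identity = 0.1 * rng.random((6, 4))

# New scores: identical to the healthy reference except for sensor index 2.
new_scores = identity.copy()
new_scores[2] += 5.0

scores = knn_location_scores(identity, new_scores, k=3)
suspect_sensor = int(np.argmax(scores))
```

Sensors whose new scores still resemble the healthy reference rows receive a small score, whereas the perturbed sensor stands out with a large one.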

4. Experimental Results

4.1. Data Collection

We conducted experiments on two case studies representing typical types of civil structures. The first case study uses real data collected from a cable-stayed bridge in Western Sydney, Australia. The second one is a laboratory-based building structure obtained from Los Alamos National Laboratory (LANL) (Larson and Von Dreele, 1987).

4.1.1. The Cable-Stayed Bridge

The bridge was instrumented with 24 uniaxial accelerometers and 28 strain gauges. In this paper we use only features based on the acceleration data collected from the 24 accelerometers. Figure 4 shows the locations of these 24 sensors on the bridge deck.

For the sake of the experiments, we emulated two different kinds of damage on this bridge by placing a large static load (vehicle) at different locations on the structure. Thus, three scenarios have been considered: no vehicle is placed on the bridge (healthy state), a light vehicle with an approximate mass of 3 t is placed on the bridge close to location A10 ("Car-Damage"), and a bus with an approximate mass of 12.5 t is located on the bridge at location A14 ("Bus-Damage"). This emulates slight and severe damage cases, which were used in our evaluation in Section 4.2.1.

Figure 4. The locations on the bridge's deck of the 24 accelerometers used in this study. The cross girders of the bridge are also marked.

4.1.2. Building Data

Our second case study was based on data collected by (Larson and Von Dreele, 1987) from a three-story building structure. It is made up of Unistrut columns and aluminum floor plates connected by bolts and brackets, as presented in Figure 5. Eight accelerometers were instrumented on each floor (two on each joint). A shaker was placed at corner D to generate excitation data. It generates 240 samples (a.k.a. events) separated into two main groups, Healthy (150 samples) and Damaged (90 samples). Each event consists of acceleration data for a period of 5.12 seconds sampled at 1600 Hz, resulting in a vector of 8192 frequency values. The Damaged samples were further partitioned into two different damage cases based on their location: damage in location 3C (60 samples), and damage in both locations 1A and 3C (30 samples). The damage was introduced by detaching or loosening the bolts at the joints, allowing the aluminum floor plate to move freely relative to the Unistrut column.

Figure 5. Three-story building and floor layout (Larson and Von Dreele, 1987).

4.2. Results and Discussions

This section demonstrates how our MVA method can successfully detect and assess the severity of structural damage, and further localize it, using the sensor-based data from the two case studies described in Section 4.1.

For all experiments, six hidden layers were used in MVA and the accuracy values were obtained using the F-Score (FS) measure defined as

$$FS = 2 \cdot \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

where $\text{Precision} = \frac{TP}{TP + FP}$ and $\text{Recall} = \frac{TP}{TP + FN}$ (the numbers of true positives, false positives and false negatives are abbreviated by TP, FP and FN, respectively). The OCSVM model was used in these experiments as a state-of-the-art method for comparison purposes. The rate of anomalies in OCSVM was set to 0.05 and the Gaussian kernel parameter was tuned using a technique proposed by (Anaissi et al., 2018a).
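For reference, the F-Score can be computed from the confusion counts as in the short sketch below. The counts used in the example calls are hypothetical and are not results reported in this paper.

```python
def f_score(tp, fp, fn):
    """F-Score from true positive, false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# Hypothetical counts for illustration only.
perfect = f_score(tp=50, fp=0, fn=0)   # every positive found, no false alarms
partial = f_score(tp=8, fp=2, fn=2)    # precision = recall = 0.8
```

The guards against empty denominators are a convenience of this sketch so that degenerate cases return 0 instead of raising an error.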

4.2.1. The Cable-Stayed Bridge

Our MVA method was initially validated using vibration data collected from the cable-stayed bridge described in Section 4.1.1. We used 24 uni-axial accelerometers to generate 262 samples (a.k.a. events), each consisting of acceleration data for a period of 2 seconds at a sampling rate of 600 Hz.

For each reading of the uni-axial accelerometer, we normalized its magnitude to have zero mean and unit standard deviation. The fast Fourier transform (FFT) is then used to represent the generated data in the frequency domain. Each event now has a feature vector of 600 attributes representing its frequencies. The resultant three-way data has a structure of 24 sensors × 600 features × 262 events. We separated the 262 data instances into two groups: 125 samples related to the healthy state and 137 samples for the damage state. The 137 damage examples were further divided into two different damage cases: the "Car-Damage" samples (107), generated when a stationary car was placed on the bridge, and the "Bus-Damage" samples (30), emulated by the stationary bus.
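The preprocessing of a single event (standardization followed by an FFT) might look like the following sketch. The synthetic test signal and the decision to drop the DC bin and keep the next 600 magnitude bins are assumptions; the paper only states that each 2-second event sampled at 600 Hz yields 600 frequency features.

```python
import numpy as np

def event_features(signal, n_features):
    """Standardize one acceleration record and return FFT magnitude features."""
    standardized = (signal - signal.mean()) / signal.std()  # zero mean, unit std
    spectrum = np.abs(np.fft.rfft(standardized))            # one-sided spectrum
    return spectrum[1:n_features + 1]                       # skip the DC bin

# Synthetic 2-second record at 600 Hz: a 50 Hz tone plus a little noise.
fs, duration = 600, 2
t = np.arange(fs * duration) / fs
noise = 0.1 * np.random.default_rng(0).normal(size=t.size)
signal = np.sin(2.0 * np.pi * 50.0 * t) + noise

features = event_features(signal, n_features=600)
peak_hz = (int(np.argmax(features)) + 1) * fs / t.size      # bin index to frequency
```

With 1200 samples the frequency resolution is 0.5 Hz, so the dominant feature recovers the 50 Hz tone of the synthetic record.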

We randomly selected eighty percent of the healthy events (100 samples) from each sensor to form the training multi-way data of 24 sensors × 600 features × 100 events (i.e. training set). The 137 examples related to the two damage cases were added to the remaining 20% of the healthy data to form a testing set, which was later used for the model evaluation. Our probabilistic anomaly detection algorithm was able to successfully detect all the healthy and damage events in the testing data set, and achieved an F-Score of 100%. Moreover, this model was able to assess the progress of damage severity in the structure based on the obtained reconstruction probabilities. To illustrate that, we plotted the reconstruction probability values for all test samples, as shown in Figure 6. The horizontal axis indicates the index of the test samples and the vertical axis indicates the magnitude of the reconstruction probability values. A value above the horizontal dashed line (as shown in Figure 6) indicates a sample classified as healthy, whereas a value below that line indicates an event classified as damage. This line represents an anomaly threshold value which was used to identify whether a new event belongs to the healthy or damage state.

The first 25 healthy events, denoted by green dots, were all correctly classified as healthy samples with a reconstruction probability below the anomaly threshold value of 3% (97% confidence interval). All the damage samples, denoted by yellow and orange dots for "Car-Damage" and "Bus-Damage", respectively, generate high reconstruction probability values above the anomaly threshold, and are thus identified as damage. We further calculated the mean of all the reconstruction probability values for each state to illustrate how the MVA model was also able to assess the severity of the identified damage. Figure 6 shows a solid black line which was drawn to connect the mean values. It can be clearly observed that the MVA model was able to separate the two damage cases ("Car-Damage" and "Bus-Damage"), where the reconstruction probability values were further increased for the samples related to the more severe damage case, "Bus-Damage".

The last step in the MVA model was to localize the position of the detected damage by analyzing the identity matrix, where each row captures meaningful information for one sensor location. We calculated the average distance from each row in this matrix to its $k$ nearest neighbours. The resultant $k$-nn score for each sensor is presented in Figure 7, which clearly shows the capability of MVA for damage localization. As expected, sensors A10 and A14, related to "Car-Damage" and "Bus-Damage", respectively, behaved significantly differently from all the other sensors located away from the position of the emulated damage.

Figure 6. Damage estimation using reconstruction probability values obtained by MVA applied on the cable-stayed bridge dataset.
Figure 7. Location anomaly score in the localization layer applied on the cable-stayed bridge dataset.

The next experiment was to compare our obtained results with the state-of-the-art method OCSVM. The same training data set as above was used to construct an OCSVM model, and the same testing data set was used to evaluate the classification performance of OCSVM. The F-score accuracy of OCSVM was recorded at 95%. However, the OCSVM decision values were not able to clearly assess the progress of the damage severity in the structure, as illustrated in Figure 8. Moreover, OCSVM lacks the capability to implement a method for damage localization, since only one single anomaly score for each event is generated by the OCSVM model using the inputs from all sensors.

Figure 8. Damage identification results using OCSVM on the cable-stayed bridge dataset.

4.2.2. Building Data

Our second set of experiments was conducted using the acceleration data acquired from 24 sensors instrumented on the three-story building, as described in Section 4.1.2. Similar to the previous experiments, we normalized the accelerometer data to have zero mean and unit variance. Then we applied the FFT method to represent the data in the frequency domain. For each two adjacent accelerometers at a location, we used the difference between their signals as variables, and only the first 150 Hz were selected as input features to our MVA model. The resultant three-way data has a structure of 12 locations × 768 features × 240 events.

We randomly selected 80% of the healthy events (120 samples) from the 12 locations as the training multi-way data (i.e. training set). The remaining 20% of the healthy data and the data obtained from the two damage cases were used for testing (i.e. testing set).

Figure 9. Damage estimation using reconstruction probability values obtained by MVA applied on the Building dataset.

Our constructed MVA model achieved an F-score of 97%. The false alarm rate was equal to zero, as all the healthy samples in the testing data set were correctly detected. Figure 9 shows the plot of the reconstruction probability values generated by MVA. It can be clearly observed from Figure 9 that the more severe damage test data related to locations 1A and 3C deviated more from the training data, with high reconstruction probability values.

Similar to the last case study, we further propagated the reconstruction probability values obtained by the output layer into the localization layer to construct the identity matrix. Then we computed the $k$-nn score for each sensor based on the average distance between each row of this matrix and its $k$ nearest neighbours. Figure 10 shows the resultant $k$-nn score for each sensor. It clearly shows that the MVA method correctly captures the damage locations. As expected, sensors 1A and 3C produced very high $k$-nn scores due to the damage introduced at these two locations. The $k$-nn score of 3C was higher than that of 1A because damage was introduced in both locations 1A and 3C at the same time.

Figure 10. Location anomaly score in the localization layer on the Building dataset.
Figure 11. Damage estimation using decision values generated by OCSVM applied on the Building dataset.

The last experiment was to compare our obtained results with OCSVM. The F-score accuracy of OCSVM was recorded at 86%, with no clear separation between the different levels of damage, as illustrated in Figure 11. Again, OCSVM does not have the capability to implement a method for damage localization, since only one single anomaly score for each event is generated by the OCSVM model using the input data from all sensors.

5. Conclusion

Multi-way data analysis has gained a lot of interest in many fields where standard two-way analysis lacks the capability to learn the underlying structure of multi-way data. We proposed a multi-objective variational autoencoder method for damage detection, localization and severity assessment in multi-way structural data based on the reconstruction probability of the autoencoder deep neural network. The proposed method performs data fusion by taking input features from networked sensors attached to a structure. The stochastic gradient descent algorithm is then used to learn reconstructions that are close to the original input slice, followed by constructing a sensor identity matrix which is used for damage localization. For each new incoming data slice we calculate its anomaly score based on the reconstruction probability, and we use the obtained reconstruction probability values for damage assessment. The sensor identity matrix is finally utilized to locate the identified damage.

We evaluated our method on multi-way datasets in the area of structural health monitoring for damage detection purposes. The data was collected from our deployed data acquisition system on a cable-stayed bridge in Western Sydney and from a laboratory-based building structure obtained from Los Alamos National Laboratory (LANL). Experimental results showed that our approach succeeded at detecting the damage events with an F-score of 95% and higher for all datasets. Moreover, our model demonstrated the capability to work very well in localizing damage and estimating different levels of damage severity in an unsupervised manner. Compared to the state-of-the-art approaches, our proposed method shows better performance in terms of damage detection and localization.

    Acknowledgements.
    The authors would like to thank Western Sydney University and the University of New South Wales for facilitating the field tests and data collection process.

    References

    • E. Acar and B. Yener (2009) Unsupervised multiway data analysis: a literature survey. IEEE Transactions on Knowledge and Data Engineering 21 (1), pp. 6–20. Cited by: §3.1.
    • J. An and S. Cho (2015) Variational autoencoder based anomaly detection using reconstruction probability. Special Lecture on IE 2, pp. 1–18. Cited by: §1.
    • A. Anaissi, A. Braytee, and M. Naji (2018a) Gaussian kernel parameter optimization in one-class support vector machines. In 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. Cited by: §4.2.
    • A. Anaissi, N. L. D. Khoa, M. M. Alamdari, A. Braytee, Y. Wang, S. Mustapha, and F. Chen (2017a) Adaptive One-Class Support Vector Machine for Damage Detection in Structural Health Monitoring. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 459–471. Cited by: §1.
    • A. Anaissi, N. L. D. Khoa, T. Rakotoarivelo, M. M. Alamdari, and Y. Wang (2017b) Self-advised incremental one-class support vector machines: an application in structural health monitoring. In International Conference on Neural Information Processing, pp. 484–496. Cited by: §0.1.
    • A. Anaissi, N. L. D. Khoa, T. Rakotoarivelo, M. M. Alamdari, and Y. Wang (2018b) Adaptive online one-class support vector machines with applications in structural health monitoring. ACM Transactions on Intelligent Systems and Technology (TIST) 9 (6), pp. 64. Cited by: §1.
    • A. Anaissi, N. L. D. Khoa, and Y. Wang (2018c) Automated parameter tuning in one-class support vector machine: an application for damage detection. International Journal of Data Science and Analytics, pp. 1–15. Cited by: §1.
    • A. Anaissi, M. Makki Alamdari, T. Rakotoarivelo, and N. L. D. Khoa (2018d) A tensor-based structural damage identification and severity assessment. Sensors 18 (1), pp. 111. Cited by: §1.
    • A. Anaissi and S. M. Zandavi (2019) Multi-objective autoencoder for fault detection and diagnosis in higher-order data. In 2019 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. Cited by: §0.1, §0.1.
    • S. C. AP, S. Lauly, H. Larochelle, M. Khapra, B. Ravindran, V. C. Raykar, and A. Saha (2014) An autoencoder approach to learning bilingual word representations. In Advances in Neural Information Processing Systems, pp. 1853–1861. Cited by: §1.
    • Y. S. Chong and Y. H. Tay (2017) Abnormal event detection in videos using spatiotemporal autoencoder. In International Symposium on Neural Networks, pp. 189–196. Cited by: §1.
    • C. R. Farrar and K. Worden (2006) An introduction to structural health monitoring. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 365 (1851), pp. 303–315. Cited by: §0.1.
    • M. Germain, K. Gregor, I. Murray, and H. Larochelle (2015) Made: masked autoencoder for distribution estimation. In International Conference on Machine Learning, pp. 881–889. Cited by: §1.
    • G. Hinton, L. Deng, D. Yu, G. E. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, et al. (2012) Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal processing magazine 29 (6), pp. 82–97. Cited by: §0.1.
    • C. Hong, J. Yu, J. Wan, D. Tao, and M. Wang (2015) Multimodal deep autoencoder for human pose recovery. IEEE Transactions on Image Processing 24 (12), pp. 5659–5670. Cited by: §1.
    • S. Jiang, C. Zhang, and C. Koh (2006) Structural damage detection by integrating data fusion and probabilistic neural network. Advances in Structural Engineering 9 (4), pp. 445–458. Cited by: §1.
    • N. L. D. Khoa, M. M. Alamdari, T. Rakotoarivelo, A. Anaissi, and Y. Wang (2018) Structural health monitoring using machine learning techniques and domain knowledge based features. In Human and Machine Learning, pp. 409–435. Cited by: §1.
    • N. L. D. Khoa, A. Anaissi, and Y. Wang (2017) Smart infrastructure maintenance using incremental tensor analysis. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pp. 959–967. Cited by: §1.
    • D. P. Kingma and M. Welling (2013) Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114. Cited by: §3.2.
    • A. Krizhevsky and G. E. Hinton (2011) Using very deep autoencoders for content-based image retrieval. In ESANN. Cited by: §0.1.
    • A. C. Larson and R. Von Dreele (1987) Los alamos national laboratory report no. Technical report LA-UR-86-748. Cited by: Figure 5, §4.1.2, §4.1.
    • K. Leung and C. Leckie (2005) Unsupervised anomaly detection in network intrusion detection using clusters. In Proceedings of the Twenty-eighth Australasian conference on Computer Science-Volume 38, pp. 333–342. Cited by: §1.
    • Y. Lu and J. E. Michaels (2009) Feature extraction and sensor fusion for ultrasonic structural health monitoring under changing environmental conditions. IEEE Sensors Journal 9 (11), pp. 1462–1471. Cited by: §1.
    • S. Mahadevan and S. L. Shah (2009) Fault detection and diagnosis in process data using one-class support vector machines. Journal of Process Control 19 (10), pp. 1627–1639. Cited by: §1.
    • S. Mukkamala, G. Janoski, and A. Sung (2002) Intrusion detection using neural networks and support vector machines. In Neural Networks, 2002. IJCNN’02. Proceedings of the 2002 International Joint Conference on, Vol. 2, pp. 1702–1707. Cited by: §1.
    • A. Rytter (1993) Vibrational based inspection of civil engineering structures. Ph.D. Thesis, Dept. of Building Technology and Structural Engineering, Aalborg University. Cited by: §0.1.
    • A. Sophian, G. Y. Tian, D. Taylor, and J. Rudlin (2003) A feature extraction technique based on principal component analysis for pulsed eddy current NDT. NDT & E International 36 (1), pp. 37–41. Cited by: §1.
    • I. Sutskever, O. Vinyals, and Q. V. Le (2014) Sequence to sequence learning with neural networks. In Advances in neural information processing systems, pp. 3104–3112. Cited by: §0.1.
    • W. Yan and L. Yu (2015) On accurate and reliable anomaly detection for gas turbine combustors: a deep learning approach. In Proceedings of the annual conference of the prognostics and health management society, Cited by: §1.
    • S. Yin, X. Zhu, and C. Jing (2014) Fault detection based on a robust one class support vector machine. Neurocomputing 145, pp. 263–268. Cited by: §1.