There is a growing need to monitor underwater facilities and structures, such as communication lines, wind farms and oil rigs. Autonomous Underwater Vehicles (AUVs) can perform this work with less risk to human lives, leading to an increased interest in sub-sea resident technology. These systems are meant to operate in a harsh environment with little to no access by operators, and are thus required to perform several self-inspection tasks autonomously. This work is part of the development and evaluation of methods to enable autonomous Fault Detection and Diagnosis (FDD) to enhance the self-awareness of sub-sea resident AUVs.
Faults can be divided into two major categories: Hard-faults and Soft-faults. Hard-faults happen abruptly, such as a propeller blade breaking or a rotor blocking. In contrast, Soft-faults are characterized by a continuous deviation of a property from normal conditions and usually build up over time, leading to a performance degradation of the affected component. Such deviations are hard to detect at early stages and may not be critical for the normal operation of the system, but preemptive monitoring of their development is needed. Propeller biofouling and corrosion due to sea water are examples of processes that induce Soft-faults.
Several methods have been proposed for soft-fault detection and compensation, and most of the approaches depend on explicit physical models of the plant and the faults. Observers, Extended Kalman Filters (EKF) and parameter estimation are examples of such methods. One of the main advantages of data-driven approaches is the ability to learn specific features of the signals generated by the plant, requiring less knowledge of the exact nature or extent of the fault. In the specific case of underwater thrusters, their non-linear dynamic nature might require complex models that are hard to identify with classical methods. Estimating component condition to diagnose thruster soft-faults and noncritical failures is challenging, due to the number and nature of external and internal factors that affect thruster performance. However, an accurate diagnosis is required to enable a better response to the occurrence of these faults in a real scenario.
Neural Networks are a well-proven method for nonlinear dynamic system identification, being a computationally efficient method for nonlinear regression. Several configurations were evaluated for this task (Narendra and Parthasarathy, 1990; Chen et al., 1990) and have been used in many industrial applications, including control and fault diagnosis. Samy et al. (2011) present a Neural Network based fault detection and isolation system for an aerial vehicle. More recently, the works of Ogunmolu et al. (2016) and Wang (2017)
In a previous work (Nascimento et al., 2017), we evaluated several regressors for online modeling of thrusters under changing conditions. This approach uses an initial model as a reference for the nominal operation of the component and an adaptive model able to learn non-critical performance deviations. The rotational speed and current signals are used for fault detection, in an approach similar to that presented in the work of Caccia et al. (2001). In this paper, we extend this evaluation to the diagnosis of non-critical soft-faults using a fault classifier.
2 RNN-based Modeling
In this section, we briefly review some of the Recurrent Neural Network (RNN) topologies that are used in this work. Nonlinear Autoregressive Exogenous Model (NARX) networks are a classic scheme to identify dynamic systems. The Long Short-term Memory (LSTM) cell was introduced to solve the vanishing gradient problem in standard RNNs, and the Gated Recurrent Unit (GRU) was introduced more recently as a smaller and simpler recurrent cell.
2.1 Nonlinear Autoregressive Exogenous Model
A Nonlinear Autoregressive Exogenous Model derives from expanding the past values of the inputs and outputs (targets) of an estimator and using them as extra features. This expansion creates a time dependency and enables the learning of certain dynamic behavior.
Given the input vector and the output vector, a NARX model output is defined by:
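In its commonly used textbook form, the model can be written as below. The symbols are assumptions introduced here for illustration: $u(t)$ is the input vector, $y(t)$ the output vector, $n_u$ and $n_y$ the numbers of past input and output samples, and $f$ the nonlinear mapping learned by the estimator.

```latex
y(t) = f\bigl(y(t-1), \dots, y(t-n_y),\; u(t), u(t-1), \dots, u(t-n_u)\bigr)
```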
Applying such a scheme to NNs, it is possible to estimate nonlinear relations based on current and previous values. NARX networks can learn dynamic behavior and preserve long-term dependencies longer than a standard RNN.
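The expansion of past values into extra features can be sketched as follows. This is a minimal illustration for scalar series, not the implementation used in this work; the function name `narx_features` and the delay orders `n_u` and `n_y` are hypothetical.

```python
def narx_features(u, y, n_u, n_y):
    """Build NARX regression pairs from an input series u and an output series y.

    Each feature row holds [y(t-1), ..., y(t-n_y), u(t), u(t-1), ..., u(t-n_u)]
    and the matching target is y(t). Names and orders are illustrative only.
    """
    if len(u) != len(y):
        raise ValueError("u and y must have the same length")
    start = max(n_u, n_y)  # first index with a full history available
    rows, targets = [], []
    for t in range(start, len(y)):
        past_y = [y[t - k] for k in range(1, n_y + 1)]
        past_u = [u[t - k] for k in range(0, n_u + 1)]
        rows.append(past_y + past_u)
        targets.append(y[t])
    return rows, targets


u = [0.0, 0.1, 0.2, 0.3, 0.4]
y = [0.0, 1.0, 2.0, 3.0, 4.0]
X, T = narx_features(u, y, n_u=1, n_y=2)
```

The resulting rows can be fed to any static regressor, which is what gives the estimator its time dependency.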
2.2 Deep Learning Recurrent Networks
Long Short-term Memory networks are a type of recurrent neural network proposed by Hochreiter and Schmidhuber (1997). With the introduction of a memory cell and the input, output and forget gates, such networks do not suffer from the vanishing gradient problem and are able to preserve information for longer periods.
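For reference, a commonly used formulation of an LSTM cell with a forget gate is sketched below. The notation is an assumption made here: $c_t$ denotes the cell state, $\sigma$ the logistic sigmoid, $\circ$ the element-wise product, and the subscripts on $W$, $U$ and $b$ identify the gate they belong to.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
c_t &= f_t \circ c_{t-1} + i_t \circ \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
h_t &= o_t \circ \tanh(c_t)
\end{aligned}
```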
In the above described cell, $x_t$ is the input vector, $f_t$ is the forget gate activation vector, $i_t$ represents the input gate activation vector, $o_t$ is the output gate activation vector and $h_t$ is the output vector. $W$, $U$ and $b$ represent, respectively, the weight matrices and the bias vector.
3 Data-driven Thruster Modeling
3.1 Data collection
Data was collected with several Bluerobotics T100 thrusters. This model consists of an electric brushless motor with a speed range of 300 to 4200 rpm, up to 130 W of output power and 2.36 kgf of nominal thrust (Figure (a)). The thrusters were driven by a Graupner T35 electronic speed controller (ESC), which is rated for 35 A and provides rotational speed, current, voltage and temperature signals (Figure (b)). An ATMEGA328 microcontroller was used as the interface between the supervisory software running on a computer and the ESC.
A sequence of experiments was performed to model the thruster behavior under several conditions and to evaluate a classification-based diagnosis approach. The dataset was obtained with a testbed using Bluerobotics T100 thrusters and controllers that report rotational speed, voltage and current consumption. Although the relation between thrust and rotational speed is given by the manufacturer, we did not consider this indirect thrust measurement as an output variable of the model.
The open-loop responses of the controller-thruster assembly to step and sinusoidal signals were used to collect data for model training. These signals represent a pulse width modulation signal with a width varying from 1.0 to 2.0 milliseconds, mapped to the -1.0 to 1.0 range to represent direction. For the step response characterization, every step of the applied input had 0.25 of the maximum possible amplitude, as shown in Figure 4. For the general model, data was obtained using sinusoidal input signals with frequencies of 0.01, 0.02, 0.03 and 0.04 Hz. Figures 5, 6 and 7 show the characteristic open-loop response of the thruster, with a saturated curve for the rotational speed and a quadratic curve for the current.
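The mapping of pulse widths to the signed command range and the generation of a sinusoidal excitation can be sketched as below. The linear mapping with its neutral point at 1.5 ms and the sample rate passed to `sine_command` are assumptions for illustration; neither is stated in the text, and both function names are hypothetical.

```python
import math


def pwm_to_command(width_ms):
    """Map a pulse width in [1.0, 2.0] ms linearly to [-1.0, 1.0].

    Assumes a neutral point at 1.5 ms (illustrative, not from the source).
    """
    if not 1.0 <= width_ms <= 2.0:
        raise ValueError("pulse width outside the 1.0-2.0 ms range")
    return (width_ms - 1.5) / 0.5


def sine_command(freq_hz, sample_rate_hz, n_samples, amplitude=1.0):
    """Sample a sinusoidal command signal; the sample rate is an assumed value."""
    return [
        amplitude * math.sin(2.0 * math.pi * freq_hz * k / sample_rate_hz)
        for k in range(n_samples)
    ]


# One of the excitation frequencies from the text, at an assumed 10 Hz rate.
excitation = sine_command(freq_hz=0.01, sample_rate_hz=10.0, n_samples=2000)
```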
The open-loop step response used for identification, depicted in Fig. 4, showed a second-order overdamped system with some dead-time. The averaged measurements were 2.95 seconds for the settling time and 0.59 seconds for the dead-time. Also, according to the manufacturer, the thruster presents a dead band of 25 µs, which was considered for further modeling. Due to the effect of the dead-time and dead band, the system presents an observable hysteresis that may affect the choice of signal frequency used to model the thruster, as shown in Fig. 7.
3.2 Nominal Model Estimation
Four NN-based methods were evaluated for identifying the nominal model: a Multilayer Perceptron (MLP) as a baseline regressor, a NARX Neural Network, an LSTM-based network and a GRU-based network. The inputs to each model are the control signal and voltage, while the outputs are current and rotational speed.
Since purely static regressors are not able to model the characteristic dead-time of the thruster, the known dead-time delay and the dead band non-linearity were added to the MLP estimator in a Hammerstein-Wiener model fashion. For the NARX network, only the dead band was added. The performance of both configurations was then evaluated.
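The two elements placed in front of the static regressor can be sketched as a dead band non-linearity followed by a pure delay. This is an illustrative sketch only; the function names, the normalized dead band half-width and the delay expressed in samples are hypothetical values, not the ones used in the experiments.

```python
def apply_deadband(command, half_width):
    """Return 0.0 for commands inside the dead band, else the command itself."""
    return 0.0 if abs(command) <= half_width else command


def apply_dead_time(signal, delay_samples, fill=0.0):
    """Delay a sampled signal by a fixed number of samples, keeping its length."""
    if delay_samples <= 0:
        return list(signal)
    return ([fill] * delay_samples + list(signal))[: len(signal)]


# Illustrative values only: dead band half-width and delay are assumptions.
commands = [0.0, 0.03, 0.25, 0.5, -0.25]
shaped = [apply_deadband(c, half_width=0.05) for c in commands]
delayed = apply_dead_time(shaped, delay_samples=2)
```

The delayed, shaped signal would then be the input of the static regressor, so that the network itself only has to learn the remaining memoryless mapping.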
Scaling the data vectors was required for training the networks, since this strongly affects the performance of the regressors. Inputs and targets were scaled by the Interquartile Range (IQR), which allows some outlier robustness. Grid search was used to tune the parameters of each regressor, as shown in Table 1. Besides the number of neurons in every configuration, the batch size and the number of past time steps used in training (Lookback) were adjusted. For the NARX networks, every input and target feature was considered to have independent tapped delay line operators. The training set consisted of 2000 samples of the sinusoidal signal at 0.01 Hz. The regressors were trained with a 2-fold time series cross-validation scheme, in which the validation set is always taken from a section later than the training set. The regressors were compared on a 1000-sample dataset for each frequency analyzed (0.01, 0.02, 0.03 and 0.04 Hz).
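The scaling and the cross-validation scheme can be sketched as follows. Two points are assumptions: the scaling is centered on the median before dividing by the IQR, and the folds follow an expanding-window layout in which the training section grows and the validation section always comes after it. The exact variants used in the experiments are not specified, and the function names are hypothetical.

```python
from statistics import quantiles


def iqr_scale(values):
    """Center on the median and divide by the interquartile range."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero; the signal cannot be scaled")
    return [(v - median) / iqr for v in values]


def time_series_splits(n_samples, n_folds):
    """Yield (train, validation) index ranges with validation after training."""
    fold = n_samples // (n_folds + 1)
    if fold == 0:
        raise ValueError("not enough samples for the requested folds")
    for i in range(n_folds):
        train_end = n_samples - (n_folds - i) * fold
        yield range(0, train_end), range(train_end, train_end + fold)


scaled = iqr_scale([1.0, 2.0, 3.0, 4.0, 5.0])
splits = list(time_series_splits(n_samples=2000, n_folds=2))
```

Because every validation range starts where its training range ends, no future sample leaks into training, which is the property the scheme described above relies on.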
Table 1: Regressors - Configurations
| Method | Hidden Layers | Batch Size | Lookback |
| MLP + TDL + DB | 8P, 4P | 10 | - |
| NARX NN + DB | 32P, 4P | 5 | (20,0), (2,0) |
The mean coefficients of determination ($R^2$ scores) over the current and rotational speed outputs, for every frequency, are shown in Table 2. The scatter plot for a 1000-sample test set of a 0.01 Hz sinusoid is shown in Figure 8. The results over time for a 300-sample test set at the different frequencies analysed are shown in Figure 9.
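As a reminder of how such a score is obtained, the sketch below computes the coefficient of determination for one output and averages it over several outputs. The function names and the toy values are illustrative and do not reproduce the reported results.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for a constant target")
    return 1.0 - ss_res / ss_tot


def mean_r2(true_by_output, pred_by_output):
    """Average the R^2 score over several outputs, e.g. current and speed."""
    scores = [r2_score(t, p) for t, p in zip(true_by_output, pred_by_output)]
    return sum(scores) / len(scores)


# Toy values: the first output is predicted perfectly, the second by its mean.
truth = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
preds = [[1.0, 2.0, 3.0], [2.0, 2.0, 2.0]]
score = mean_r2(truth, preds)
```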
Table 2: Regression Scores - Frequency
| 0.01 Hz | 0.02 Hz | 0.03 Hz | 0.04 Hz |
The results in Table 2 and Figures 8 and 9 show that the hysteresis caused by the time delay was better captured by the NARX network. This network structure is also able to robustly predict the system response to other input signal frequencies. This regressor was then used as the reference nominal model for the fault classification task (see Section 4).
4 Soft-Fault Classification
For the task of fault diagnosis, data was collected and labeled with a signal of 0.01 Hz for the following operational conditions: nominal operation at 15.0 V, 13.0 V and 11.8 V, operation with one and with two broken propeller blades, and operation with a propeller impregnated with silicone to simulate biofouling (see Figure 10). Distinguishing model changes caused by healthy low-voltage states from those caused by faulty states may lead to better fault-handling decisions. These conditions are represented by six classes in our fault classifier. The scatter plots for each condition are depicted in Figure 11. The residuals calculated by comparison with the previously obtained nominal model (NARX) were used as features for the fault classification task and are shown in Figure 12.
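The residual features can be sketched as the sample-wise difference between the measured signals and the nominal model prediction. The sign convention (measured minus predicted), the function name and the numbers below are assumptions for illustration.

```python
def residual_features(measured, predicted):
    """Return (speed residual, current residual) pairs.

    Each sample is a (rotational_speed, current) tuple; the residual is taken
    as measured minus predicted, which is an assumed convention.
    """
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    return [
        (m_speed - p_speed, m_current - p_current)
        for (m_speed, m_current), (p_speed, p_current) in zip(measured, predicted)
    ]


# Hypothetical samples: measurements against nominal model outputs.
measured = [(1000.0, 2.0), (1500.0, 3.5)]
nominal = [(900.0, 1.5), (1500.0, 3.0)]
residuals = residual_features(measured, nominal)
```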
Several classifiers were trained with each condition as a class, in two configurations. The first configuration uses the thruster input, voltage, rotational speed and current signals as features, with 2000 samples for each class. The second configuration uses the rotational speed and current residuals as features, with 2000 residual samples each. Two methods were evaluated for diagnosis: multilayer perceptron and LSTM-based classifiers, in the configurations listed in Table 3. A 3-fold time series split cross-validation scheme and a cross-entropy loss function were used to train the classifiers. A test set of 11200 labeled samples was used to evaluate the overall performance of the classifiers.
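For a single sample, the cross-entropy loss of a six-class classifier can be sketched as below. The softmax output layer is an assumption, since the output activation of the classifiers is not stated, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, true_class):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[true_class])


# An uninformative classifier over six classes yields a loss of log(6).
uniform_loss = cross_entropy([0.0] * 6, true_class=0)
confident_loss = cross_entropy([10.0, 0.0, 0.0, 0.0, 0.0, 0.0], true_class=0)
```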
Table 3 shows that the use of computed residuals (rotational speed and current) as features for classification leads to improved results compared to a full four-feature vector. This can be explained by the additional information introduced by the nominal model, which provides a compact measure of the difference between what is expected and what is actually being measured, leading to a better classification performance.
Table 3: Classifiers Configurations - Diagnosis
| Method | Batch Size | Hidden Layers | Avg. Accuracy |
The confusion matrix for the best performing classifier (MLP with residuals) is shown in Table 4. Our results show that nominal operation is frequently confused with a broken propeller (77 % of the nominal samples), while a single broken propeller blade is identified well (94 %). Two broken blades are mostly classified as one broken blade (82 %). Biofouling is also hard to identify, and is typically confused with nominal operation (31 %). While we only achieve 78 % accuracy on our dataset, these results also show that the problem of classifying soft-faults is hard, and there is considerable room for improvement.
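Table 4: Confusion Matrix - MLP with Residuals (rows: true class, columns: predicted class)
| True \ Predicted | Nominal | 13.0 V | 11.8 V | Br. Prop. | 2 Br. Prop. | Biofouling |
| Nominal | 23 % | 0 % | 0 % | 77 % | 0 % | 0 % |
| 13.0 V | 0 % | 100 % | 0 % | 0 % | 0 % | 0 % |
| 11.8 V | 0 % | 0 % | 100 % | 0 % | 0 % | 0 % |
| Br. Prop. | 6 % | 0 % | 0 % | 94 % | 0 % | 0 % |
| 2 Br. Prop. | 4 % | 0 % | 0 % | 82 % | 14 % | 0 % |
| Biofouling | 31 % | 0 % | 0 % | 5 % | 0 % | 64 % |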
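The per-class figures discussed above can be read off the matrix programmatically, as in the sketch below. It assumes that rows are true classes and that the columns follow the same order as the rows; the function names are hypothetical.

```python
# Row percentages from the confusion matrix; the column order is assumed
# to match the row order (rows: true class, columns: predicted class).
LABELS = ["Nominal", "13.0 V", "11.8 V", "Br. Prop.", "2 Br. Prop.", "Biofouling"]
MATRIX = [
    [23, 0, 0, 77, 0, 0],
    [0, 100, 0, 0, 0, 0],
    [0, 0, 100, 0, 0, 0],
    [6, 0, 0, 94, 0, 0],
    [4, 0, 0, 82, 14, 0],
    [31, 0, 0, 5, 0, 64],
]


def per_class_recall(matrix):
    """Fraction of each true class that was classified correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


def top_confusion(matrix, labels):
    """For each true class, the wrong class that receives the largest share."""
    result = {}
    for i, row in enumerate(matrix):
        others = [(share, j) for j, share in enumerate(row) if j != i and share > 0]
        if others:
            share, j = max(others)
            result[labels[i]] = (labels[j], share)
        else:
            result[labels[i]] = None
    return result


recalls = per_class_recall(MATRIX)
confusions = top_confusion(MATRIX, LABELS)
```

Per-class recall is not the same quantity as overall accuracy, which also depends on how many test samples each class contributes.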
5 Conclusions and Future Work
In this paper, we evaluated RNNs against the classic MLP for modeling and soft-fault classification of underwater thrusters using empirical data. As an estimator, the NARX network identified the time dependencies presented by the thrusters more accurately. For soft-fault classification, we also evaluated LSTMs and MLPs. Noncritical thruster failure conditions produce signal features that can be diagnosed by a classifier-based approach, if the faulty behavior is previously known. In our experiments, a maximum average accuracy of 78% was obtained using the residuals computed from a nominal model. This result shows that modeling soft-faults is hard, as there are large confusions with other classes, indicating the need for further work.
Future work will include evaluation of this approach with data from real missions, and we will consider unbalanced datasets that are closer to real operations. We also expect that time-series models based on other kinds of features can improve our classification results.
- Caccia et al. (2001) Caccia, M., Bono, R., Bruzzone, G., Bruzzone, G., Spirandelli, E., and Veruggio, G. (2001). Experiences on Actuator Fault Detection, Diagnosis and Accommodation for ROVs. In 12th International Symposium on Unmanned Untethered Submersible Technology.
- Chen et al. (1990) Chen, S., Billings, S.A., and Grant, P.M. (1990). Non-linear system identification using neural networks. International Journal of Control, 51(6), 1191–1214. doi:10.1080/00207179008934126.
- Cho et al. (2014) Cho, K., van Merrienboer, B., Bahdanau, D., and Bengio, Y. (2014). On the Properties of Neural Machine Translation: Encoder-Decoder Approaches. doi:10.3115/v1/W14-4012. URL http://arxiv.org/abs/1409.1259.
- Goodfellow et al. (2017) Goodfellow, I.J., Bengio, Y., and Courville, A. (2017). Deep Learning. doi:10.1038/nmeth.3707.
- Graves (2012) Graves, A. (2012). Supervised Sequence Labelling with Recurrent Neural Networks. 385. doi:10.1007/978-3-642-24797-2.
- Hochreiter and Schmidhuber (1997) Hochreiter, S. and Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735–1780. doi:10.1162/neco.1997.9.8.1735.
- Narendra and Parthasarathy (1990) Narendra, K. and Parthasarathy, K. (1990). Identification and control of dynamical systems using neural networks. IEEE Transactions on Neural Networks, 1(1), 4–27. doi:10.1109/72.80202. URL http://ieeexplore.ieee.org/document/80202/.
- Nascimento et al. (2017) Nascimento, S., Kim, S.K., and Kirchner, F. (2017). Online Adaptive Modeling for Fault Detection of Underwater Thrusters. In 14th European Workshop on Advanced Control and Diagnosis.
- Ogunmolu et al. (2016) Ogunmolu, O., Gu, X., Jiang, S., and Gans, N. (2016). Nonlinear Systems Identification Using Deep Dynamic Neural Networks. URL http://arxiv.org/abs/1610.01439.
- Samy et al. (2011) Samy, I., Postlethwaite, I., and Gu, D.W. (2011). Survey and Application of Sensor Fault Detection and Isolation Schemes. Control Engineering Practice, 19(7), 658–674. doi:10.1016/j.conengprac.2011.03.002.
- Wang (2017) Wang, Y. (2017). A New Concept using LSTM Neural Networks for Dynamic System Identification. 2017 American Control Conference (ACC), 5324–5329. doi:10.23919/ACC.2017.7963782.