I Prognostic Health Management in Aerospace Industry
The aerospace industry is one of the most heavily regulated industries. With such an emphasis on safety and product quality, and with such severe economic and health consequences of equipment failures, an effective maintenance process plays a crucial role in product success. That is why it is very important to predict and prevent possible failures, reduce repair costs and increase fleet availability while adhering to the rules and procedures set out by the regulatory bodies.
This often leads maintenance departments to perform more preventive work than necessary in order to increase assurance of equipment reliability, even if these extra precautions do not always provide additional benefits.
Moreover, specific maintenance policies are developed during the certification process of aerospace equipment. Airlines usually stick to these policies and take no action to improve them.
In recent years, large aircraft manufacturers and airlines have stated that Aircraft Prognostic Health Management (PHM) converts aircraft data into actionable information by leveraging deep engineering knowledge and in-service fleet experience, and makes it possible to
Determine the operational status of the equipment,
Evaluate present condition of the equipment,
Detect abnormal conditions in a timely manner,
Initiate actions to prevent possible outages.
The important features of PHM applications are as follows:
Aircraft (A/C) data has a very complex structure:
high-dimensional time-series (dimension is usually more than several hundreds);
measurement rate could be very high (up to tens of thousands of observations for each flight); at the same time the measurement rate could be different for different parameters;
big volumes of data (a typical historical data sample is measured in terabytes);
missing values, non-stationary noise;
complex hierarchical structure of a nomenclature of failure types;
complex structure and distributed nature of the corresponding data storage;
failures are rare events with adverse effects; at the same time classical statistical predictive models are ineffective for such events because of their rarity;
when predicting failures we have to provide a significant coverage of accurately predicted failures and at the same time have a very low false alarm rate.
As a result, developing a fully automated support system for early warning of possible costly faults and for failure prediction is a very challenging task. That is why many applications in the field of A/C predictive maintenance are based on simple “threshold” monitoring rules that can detect only simple faults and have high false alarm rates. However, this is not enough to anticipate costly failures.
For example, during 2012 each of Finnair’s eight aircraft (A330 and A340) accumulated about 20 hours of delays due to problems with the bleed system, and delays cost about 100 euros per minute. Not to pick on Airbus, but the manufacturer’s Airman aircraft monitoring system either provided warnings rather late or did not provide warnings at all. One of the reasons is a lack of efficient methods for failure prediction. Only after Finnair decided to put its faith in math, and asked an engineering company that develops mathematical algorithms for improving industrial production to attack the problem, did it get reliable service and improve its fleet availability.
In this work we develop a methodology for building a predictive maintenance policy for complex systems such as aircraft engines. Using examples from real aircraft operations, we demonstrate that the developed methodology can efficiently provide a failure anticipation and warning monitoring function, deciding whether an operability-related failure is developing in the aircraft before a fault actually occurs.
The paper is structured as follows. In Section II we describe the developed methodology. In Section III we describe the event matching algorithm, which is one of the important parts of the proposed approach to failure prediction. In Section IV we describe the aircraft data. In Sections V and VI we describe the results of applying the methodology to two use cases. We draw conclusions in Section VII.
II General Methodology
Engineering equipment (say, an aircraft engine) typically enters a pre-failure state starting with some minor flaws, e.g. cracks or leaks, that evolve over time and can lead to critical failure events such as complete engine destruction. Maintenance engineers naturally need to identify these flaws (anomalies) as early as possible, and thus to prevent critical events or at least to prepare for them in time.
In some cases, real-time sensor observations make it possible to identify indirectly the anomalies in system behavior related to minor problems, since the monitored observations change their distributions in response to a change in the environment or, more generally, to changes in certain patterns. This is where accurate and reliable mathematical models and tools come into play.
In some industries, e.g. aviation, it is crucial to have models and decision-making strategies with maximum predictive power and a strictly limited false alarm rate. Moreover, it is important to decompose a black-box predictive model in order to explain the obtained prediction to an engineer, giving her hints on how to act further.
Let us describe the main steps of the methodology, which is grounded in the natural considerations about failure precursors discussed above.
Step 1. Data filtering and normalization.
Step 2. System decomposition: partition of all the measured parameters into groups, such that the parameters within a group are the most dependent (for example, correlated), while the groups themselves are not significantly dependent on each other. Usually such a decomposition corresponds to a physical partitioning of the engineering system into weakly dependent parts corresponding to specific nodes of the system. For the decomposition we can use methods for clustering and community detection in networks, and graph embedding approaches.
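A minimal sketch of how Step 2 could look in its simplest form: parameters are linked whenever the absolute Pearson correlation of their series reaches a threshold, and the groups are the connected components of the resulting graph. The function names, the threshold value 0.8 and the toy telemetry are assumptions made for illustration only; the community detection and graph embedding methods mentioned above are considerably more elaborate.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def group_parameters(series, threshold=0.8):
    """Connected components of the graph linking parameters with |corr| >= threshold."""
    names = sorted(series)
    parent = {name: name for name in names}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(series[a], series[b])) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


# Hypothetical toy telemetry: two oil-related and two vibration-related channels.
telemetry = {
    "oil_pressure": [1.0, 2.0, 3.0, 4.0, 5.0],
    "oil_temp": [2.1, 3.9, 6.2, 8.0, 9.9],
    "vib_front": [5.0, 1.0, 4.0, 2.0, 3.0],
    "vib_rear": [10.1, 2.2, 7.9, 4.1, 6.0],
}
groups = group_parameters(telemetry)
```

On this toy input the oil channels and the vibration channels end up in two separate groups, mirroring the idea that groups correspond to weakly dependent physical nodes.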
Step 3. Detection and classification of various types of anomalies in combinations of observed physical parameters within each of the clustered groups of dependent parameters. The occurrence of an anomaly within a group of dependent parameters indicates a change in the dependencies between these parameters, which in turn means a change in the mode of operation of the part of the engineering system described by this group. Thus such an anomaly can be a precursor of a future failure of the entire system. Due to the wide variety of data types, it is necessary to use various methods for anomaly detection:
some sensor data is represented in time-series format, so we can detect sequences of anomalies in streams of sensor data using [9, 10, 11, 12, 13, 14, 15], and then we can construct ensembles for rare events prediction [16, 17, 18, 19, 20] using detected anomalies and their features as precursors of major failures to optimize specific detection metrics similar to the one used in ;
historical sensor data has a kind of spatial dimension, since different time-series components correspond to different nodes of the engineering system; thus a graph of dependencies between streams of data, registered by different sensors, can be constructed, and modern methods for graph feature learning [8, 25] and panel time-series feature extraction [26, 27, 28] can be applied to enrich the set of input features used for predictive model construction.
Step 4. Associating the detected anomalies with the subsequent (in future flights) failures of specific A/C subsystems in the available historical data. At this stage a stream of historical telemetry data is represented by a stream of events (anomalies) detected in each of the groups of dependent parameters extracted in Step 2. The hypothesis is that particular combinations of anomalies in some of the selected groups of dependent parameters manifest changes in the operating modes of specific components of the engineering system, which in turn lead to a failure in the near future. This hypothesis is tested on historical data, which should contain examples of the failures to be predicted. To test the hypothesis we can use methods of imbalanced classification [29, 30], as well as greedy event matching algorithms. The purpose of these algorithms is to identify subsets of events (anomalies in our case) whose detection reliably predicts some future failures.
Step 5. Build the final model for predicting failures, consisting of several decision rules:
For each group of parameters extracted in Step 2, we apply the selected set of anomaly detection methods;
For the obtained set of anomalies, we check whether the detected anomalies contain a subsequence that precedes a failure with high probability (according to the historical data).
We note that such a model allows us to explain the “cause” of a particular forecast, i.e. to identify the input parameters that most affected it. Indeed, a failure is predicted when a certain combination of anomalies is detected; these anomalies correspond to specific groups of parameters, which in turn can be associated with specific nodes of the engineering system.
Step 6. Verification of the constructed decision rules based on the cross-validation technique:
The available historical sample of observations is divided into parts according to the aircraft from which the measurements come;
The predictive model is trained on all data except the data corresponding to one of the aircraft;
The accuracy of the trained model is estimated on the held-out data that was not used for training;
The training and evaluation actions above are repeated once for each aircraft from which the historical data was collected;
The values of the prediction accuracy metrics are aggregated (for example, averaged).
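The verification scheme of Step 6 amounts to leave-one-aircraft-out cross-validation, which can be sketched as follows. The record layout (a dictionary with an `aircraft` key), the callables `train_fn` and `score_fn`, and the toy threshold model are hypothetical placeholders, not the models used in this work.

```python
def leave_one_aircraft_out(flights, train_fn, score_fn):
    """Train on all aircraft but one, score on the held-out one, then average."""
    aircraft_ids = sorted({f["aircraft"] for f in flights})
    scores = {}
    for held_out in aircraft_ids:
        train = [f for f in flights if f["aircraft"] != held_out]
        test = [f for f in flights if f["aircraft"] == held_out]
        model = train_fn(train)
        scores[held_out] = score_fn(model, test)
    mean_score = sum(scores.values()) / len(scores)
    return mean_score, scores


# Toy stand-ins: the "model" is a threshold on a single made-up feature x.
def train_threshold(train):
    return sum(f["x"] for f in train) / len(train)


def accuracy(threshold, test):
    hits = sum(1 for f in test if (f["x"] > threshold) == bool(f["failure"]))
    return hits / len(test)


flights = [
    {"aircraft": "A", "x": 0.1, "failure": 0},
    {"aircraft": "A", "x": 0.9, "failure": 1},
    {"aircraft": "B", "x": 0.2, "failure": 0},
    {"aircraft": "B", "x": 0.8, "failure": 1},
    {"aircraft": "C", "x": 0.3, "failure": 0},
    {"aircraft": "C", "x": 0.7, "failure": 1},
]
mean_score, scores = leave_one_aircraft_out(flights, train_threshold, accuracy)
```

Splitting by aircraft rather than by flight keeps all flights of one airframe on the same side of the split, so the estimate reflects performance on an aircraft the model has never seen.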
This methodology is quite universal and can be applied to various engineering systems.
III Event Matching
The goal of an event matching algorithm is to find precursors to failure events of interest in the form of sequences of anomalies.
Let us describe the proposed event matching algorithm. We denote by
— alarms (based on anomalies) (), — their firing times, — failures,
— predictive window,
— horizon (how many moments prior to the onset of an event an alarm is considered anticipatory),— maintenance action effect delay.
To construct alarms we apply different anomaly detection algorithms, see Step 3 in Section II. Then for the alarm with respect to the failure event in we count
true signals: they fire timely, i.e. ;
irrelevant signals due to the event onset or maintenance: they fire too late, i.e. ;
false signals: they fire too early, i.e. .
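The three categories above can be illustrated with a small sketch. Since the exact inequalities are not reproduced here, the code assumes one plausible reading: a firing is true if it falls inside the predictive window that ends a horizon before the event, irrelevant if it falls between the horizon and the end of the maintenance effect delay, and false otherwise. The function name, the argument names and the boundary conventions are all assumptions.

```python
def classify_firing(t, event_time, horizon, window, maintenance_delay):
    """Label one alarm firing at flight t relative to a failure at event_time.

    Assumed convention (not taken from the paper's formulas):
      true        horizon <= event_time - t <= horizon + window
      irrelevant  -maintenance_delay <= event_time - t < horizon
      false       any other firing, in particular one that is too early
    """
    lead = event_time - t  # number of flights between the firing and the event
    if horizon <= lead <= horizon + window:
        return "true"
    if -maintenance_delay <= lead < horizon:
        return "irrelevant"
    return "false"


# Hypothetical settings: horizon of 2 flights, window of 20, maintenance delay of 5.
labels = [
    classify_firing(t, event_time=100, horizon=2, window=20, maintenance_delay=5)
    for t in (50, 90, 99, 103)
]
```

Under these assumed settings the firing at flight 50 is too early and counts as false, the one at flight 90 is timely, and those at flights 99 and 103 are irrelevant.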
We estimate the predictive performance using the following quantities:
and — true and false alarm periods of the target event respectively;
and — total number of true and false alarm firings;
and — unique true and false alarm firings.
Thus as efficiency metrics of an early warning system we use
false alarm rate (precision) ;
ratio of covered events (sensitivity) ;
the false alarm ratio .
Note that we use the total false alarm count and the unique true alarm count. Fig. 1 illustrates these metrics.
Now let us describe the proposed selection strategy, used to extract sequences of anomalies (predictive anomalies) that can be utilized as sufficiently accurate precursors of the failure of interest.
We use two approaches, namely, hard filter and soft filter:
First, we perform t-test: we consider some alarm to be promising for the prediction if and the p-value of the hypothesis vs. the hypothesis is below the significance level with , ;
Hard filter: we select the alarm to be used to predict a failure whenever and for a threshold controlling the support of the hypothesis, i.e. we define the implication ;
Soft filter: we select the alarm to be used to predict the failure if for a threshold controlling the false-to-covered ratio.
We generate final alarm signals in the following way. For each target event we get pairs of predictive anomalies . Here if and only if is considered to be predictable by the signal . In some cases we consider not pairs of predictive anomalies but triples. Alarm signals for the event are synthesized with
since pooling predictive anomalies with low false alarm rate increases the chances of successfully anticipating an event. Alarm signals in (1) play a role of precursors for the failure of interest.
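As an illustration of the pooling idea only, the sketch below fires the synthesized alarm on a flight whenever both anomalies of at least one selected pair are present on that flight, i.e. a logical OR over pairs of a logical AND within each pair. This is an assumed form of the synthesis rule; the data layout and all names are hypothetical.

```python
def synthesize_alarm(anomalies, predictive_pairs):
    """Pool pairs of anomaly indicators into a single alarm signal.

    anomalies: dict mapping an anomaly name to a list of 0/1 flags, one per flight.
    predictive_pairs: list of (name_a, name_b) pairs selected for the target event.
    Assumption: a pair fires when both of its anomalies occur on the same flight.
    """
    n_flights = len(next(iter(anomalies.values())))
    alarm = [0] * n_flights
    for name_a, name_b in predictive_pairs:
        for t in range(n_flights):
            if anomalies[name_a][t] and anomalies[name_b][t]:
                alarm[t] = 1
    return alarm


# Made-up anomaly indicators for three parameter groups over five flights.
anomalies = {
    "g1": [0, 1, 1, 0, 0],
    "g2": [0, 1, 0, 0, 1],
    "g3": [0, 0, 0, 0, 1],
}
alarm = synthesize_alarm(anomalies, [("g1", "g2"), ("g2", "g3")])
```

Requiring a conjunction within a pair keeps each component selective, while the disjunction over pairs raises the chance that at least one of them anticipates the event.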
Finally, we perform alarm signal synthesis:
Group parameters with respect to their semantics, or dependence graph. As a dependence measure we use a simple Pearson correlation, or more complex non-linear measures like mutual information;
For each group of parameters we use anomaly detection algorithms to detect anomalies:
The most typical anomaly detection algorithms are based on manifold modeling approaches [31, 32, 33, 34]; yet another approach could be to construct a surrogate model [35, 36, 37, 38, 39] in order to approximate dependencies between the observed parameters and then detect anomalies based on a predictive error with a non-parametric confidence measure [40, 41] as the diagnostic indicator;
In a linear case we can use the low rank linear PCA reconstruction error as the diagnostic time-series;
Observations whose errors exceed the 90%-95% empirical quantile are considered anomalies.
Use the event matching algorithm to find pairs/triplets of predictive anomalies with adequate coverage and false alarm rates.
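For the linear case mentioned in the list above, the sketch below computes a rank-1 PCA reconstruction error for one group of parameters and flags observations whose error exceeds an empirical quantile. It uses plain power iteration and a nearest-rank quantile purely to stay dependency-free; both choices, the function names and the toy data are assumptions, and a numerical library would normally be used instead.

```python
from math import sqrt


def rank1_reconstruction_errors(rows, n_iter=100):
    """Squared error of reconstructing each centered row from the first principal axis."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    v = [1.0 / sqrt(d)] * d  # initial direction for power iteration on X^T X
    for _ in range(n_iter):
        proj = [sum(x[j] * v[j] for j in range(d)) for x in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    errors = []
    for x in centered:
        score = sum(x[j] * v[j] for j in range(d))
        errors.append(sum((x[j] - score * v[j]) ** 2 for j in range(d)))
    return errors


def detect_anomalies(rows, q=0.9):
    """Flag rows whose error exceeds the nearest-rank empirical q-quantile."""
    errors = rank1_reconstruction_errors(rows)
    ordered = sorted(errors)
    index = min(len(ordered) - 1, max(0, round(q * len(ordered)) - 1))
    threshold = ordered[index]
    return [e > threshold for e in errors]


# Nine made-up observations on the line y = 2x plus one that breaks the coupling.
rows = [[float(i), 2.0 * i] for i in range(9)] + [[4.0, 11.0]]
flags = detect_anomalies(rows, q=0.9)
```

Only the last observation, which violates the linear coupling between the two parameters, is flagged in this toy example.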
IV Aircraft data
We test our methodology using aircraft telemetry. The data includes:
Multiple telemetry snapshots taken only under certain conditions,
Engine related ACMS reports: “Takeoff” (4), “Climb” (3), and “Cruise” (1, 2), including parameters
EGTT: Exhaust Gas Temperature trimmed;
OPU: Engine oil pressure;
QDMCNT: Amount of captured abrasive particles in the oil flow;
Nacelle/turbine vibration, HP/LP turbine exit pressure, Exit thrust, ;
rep. have , , , and parameters resp.;
span a year and a half of operations, flights per year.
This data corresponds to the aircraft flight phases, depicted in Fig. 2.
The Engine Central Maintenance System logs track various events:
Fault codes: low-level indicators of unusual conditions in a circuit component or its subsystem;
Warnings: high-level indication of failures, time-outs, etc.:
7100w4X0: Engine stall (shutdown);
7400wXX0: Engine igniter A/B fault;
7830wXX0: Engine thrust reverser inhibited/unlocked;
4962W0X0: Auxiliary Power Unit Fault.
ATA / JASC code grouping (Joint Aircraft System/Component (JASC) Code Tables, Air Transport Association of America (ATA)):
7XXX – Turbine engine
73XX – engine fuel and control;
74XX – ignition,
77XX – oil filter clogging.
The data is highly imbalanced: most flights experience no warnings.
V Engine Shutdown prediction
This use case concentrates on the unexpected engine shutdown failure. Engine shutdown is an in-flight failure reported as CMS messages with codes 7100W310, 7100W320, 7100W330, and 7100W340. In these codes, the second-to-last digit is the number of the failed engine.
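The coding convention just described can be made concrete with a small helper that extracts the engine number from a shutdown message code. The function name and the use of a regular expression are illustrative choices; only the four codes listed above are treated as shutdown messages.

```python
import re

# Matches 7100W310 ... 7100W340; the captured digit is the engine number.
SHUTDOWN_CODE = re.compile(r"^7100W3([1-4])0$", re.IGNORECASE)


def failed_engine(code):
    """Return the engine number encoded in a shutdown code, or None for other codes."""
    match = SHUTDOWN_CODE.match(code.strip())
    return int(match.group(1)) if match else None


engines = [failed_engine(c) for c in ("7100W310", "7100W340", "7400W120")]
```

Here the first two codes map to engines 1 and 4, while the third is not a shutdown code and yields `None`.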
The goal is to predict future occurrences of the critical engine shutdown failure using historical data from the ACMS and CMS reports of 32 aircraft. For this study we used the set of reports (1–4); here it is important to distinguish between different flight phases. Only engine related features were employed (refer to Step 2 in Sec. II and Sec. IV).
Fortunately, unexpected engine shutdowns are exceptionally rare: during only flights out of shutdown events were encountered, and for only aircrafts out of .
We set the parameters of the early warning system (see Sec. III) as follows: horizon , window flights, and the maintenance effect .
Due to the extreme rarity of the analyzed events, it was impossible to employ automatic anomaly detection (see Step 3 in Sec. II) and automated selection of predictive alarms (see Sec. III). Also, due to the severity of the analyzed failure, the restrictions on the false alarm rate were less stringent.
For low-frequency events the probability estimates used in Sec. III are unreliable. Thus we cannot apply the method of Sec. III directly, and the features had to be hand-picked taking into account the prediction quality metric. As a result, at phase “7.1” we obtained the following features: “R3::FFDP_B1x”, “R3::P25_H1x”, and “R3::ZTFEFA_K1x” from report “3”.
To perform anomaly extraction in this case, a lower-dimensional linear data manifold was learnt over the sample of flights that were sufficiently separated from shutdown events. This sample, the so-called “normal” regime, consists of
all historical flights of aircraft that never encountered a shutdown event;
all flights of aircrafts, which did encounter the event, that are at least moments before or moments after the shutdowns.
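The selection of the “normal” sample can be sketched as a simple filter over flights. The separation margins are left as the parameters `before` and `after` because their values are not given here; the data layout (pairs of aircraft identifier and flight index) and the function name are assumptions for illustration.

```python
def normal_flights(flights, shutdowns, before, after):
    """Keep flights that are far enough from every shutdown of the same aircraft.

    flights: list of (aircraft_id, flight_index) pairs.
    shutdowns: dict mapping aircraft_id to the flight indices of its shutdown events.
    A flight is kept if, for each shutdown e of its aircraft, it lies at least
    `before` flights earlier or at least `after` flights later than e.
    Aircraft without shutdowns keep all of their flights.
    """
    selected = []
    for aircraft, idx in flights:
        events = shutdowns.get(aircraft, [])
        if all(idx <= e - before or idx >= e + after for e in events):
            selected.append((aircraft, idx))
    return selected


# Made-up fleet: aircraft "A" never had a shutdown, "B" had one on flight 5.
flights = [("A", i) for i in range(5)] + [("B", i) for i in range(10)]
normal = normal_flights(flights, {"B": [5]}, before=3, after=2)
```

Both bullet points above are covered by the same rule, since an aircraft with no recorded shutdown trivially satisfies the separation condition.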
This sample represents the normal regime of coupling of parameters between engines, which is why it is used to estimate a rank-1 approximation of the parameter group’s intrinsic linear manifold. The abnormality scores are calculated as the reconstruction error of the within-group measurements recorded over the complete sample of flights of all aircraft, see Figure 3.
|Phase||Feature (report 3)||Threshold|
VI Oil filter clogging prediction
The problem is to predict oil filter clogging in order to optimize oil filter maintenance (minimizing the number of inspections and the cost of supplies). To do this we have to automatically extract the parameters tied to the oil filter clogging event and to construct a model that predicts filter clogging from the observed data. The prediction problem is imbalanced, as the number of failures (filter cloggings) is small compared to the number of examples of the normal regime.
We considered a subset of parameters (those that could be the most relevant to the failure) and applied the automatic parameter selection methodology together with imbalanced classification [30, 29]. Successive elimination of parameters finally left only several parameters, including OPU (oil pressure), which turned out to be the most related to the failure.
At the next step we divided the data into train and test sets and applied the failure prediction procedure, which accounts for the autoregression depth of the parameter values used when making predictions; we tuned the depth based on the prediction error. It turned out that the history up to flights back provides the most accurate predictions.
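To make the notion of autoregression depth concrete, the sketch below builds lagged feature rows from a single parameter series and picks the depth with the smallest validation error. The series values, the candidate depths and the error figures are made up, and the helper names are hypothetical; the actual predictive model is not reproduced.

```python
def lagged_features(values, depth):
    """Rows [x_(t-depth), ..., x_t] for every flight t that has a full history."""
    return [values[t - depth:t + 1] for t in range(depth, len(values))]


def select_depth(candidate_depths, validation_error):
    """Return the candidate depth with the smallest validation error."""
    return min(candidate_depths, key=validation_error)


# Made-up oil pressure readings over five consecutive flights.
opu = [1.0, 2.0, 3.0, 4.0, 5.0]
rows = lagged_features(opu, depth=2)

# Hypothetical validation errors obtained for each candidate depth.
errors_by_depth = {1: 0.30, 2: 0.22, 3: 0.25}
best_depth = select_depth(errors_by_depth, errors_by_depth.get)
```

Each row holds the current value together with the values from the preceding `depth` flights, so a larger depth gives the model more history at the cost of fewer usable rows.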
To understand the benefits of the proposed approach, we compared it with a simple thresholding algorithm, which predicts failures by comparing the level of OPU with a predefined threshold. The results for different prediction horizons, along with the simple thresholding prediction based on the OPU level, are presented in Figures 5 and 6. The proposed approach achieves better metrics than simple thresholding, which makes it possible to predict failure events earlier at the same level of false alarms.
In practice, a failure predicted in the vicinity of a real one is also acceptable and is not considered a false alarm. The results for failures predicted in the range of flights near the real faults are presented in Figure 7.
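This tolerance-based scoring can be sketched as follows: a predicted failure counts as a false alarm only if no real failure lies within a given number of flights of it. The tolerance is a free parameter here because its value is not stated above; the function name and the flight indices are made up.

```python
def count_hits_and_false_alarms(predicted, failures, tolerance):
    """Score predictions with a symmetric tolerance window around real failures.

    predicted, failures: lists of flight indices.
    Returns (number of failures with a prediction within `tolerance` flights,
             number of predictions with no failure within `tolerance` flights).
    """
    hits = sum(
        1 for f in failures if any(abs(p - f) <= tolerance for p in predicted)
    )
    false_alarms = sum(
        1 for p in predicted if all(abs(p - f) > tolerance for f in failures)
    )
    return hits, false_alarms


# Made-up example: three predictions, two real failures.
strict = count_hits_and_false_alarms([10, 48, 90], [50, 120], tolerance=0)
relaxed = count_hits_and_false_alarms([10, 48, 90], [50, 120], tolerance=3)
```

With zero tolerance the prediction at flight 48 is a false alarm; with a tolerance of three flights it is credited to the failure at flight 50.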
In this paper we proposed a data-driven approach to the rare failure prediction problem. Using the developed approach, we were able to predict some possible aircraft equipment failures. The next steps are to automate the methodology, since it still requires a manual choice of some hyperparameters (various thresholds), and to consider more use cases in order to extend it.
E. Burnaev would like to thank I. Nazarov and P. Erofeev for help with data processing, and the company Datadvance llc. for the problem statement.
-  Maintenance optimization. Airplane health management, 2015. [Online]. Available: http://www.boeing.com/resources/boeingdotcom/commercial/services/assets/brochure/airplanehealthmanagement.pdf
-  T. Shen, F. Wan, W. Cui, and B. Son, “Application of prognostic and health management technology on aircraft fuel system,” in IEEE Proceedings of 2010 Prognostics and System Health Management Conference. Macao: IEEE, 12–14 Jan 2010, pp. 1–7.
-  J. Dai and H. Wang, “Evolution of aircraft maintenance and logistics based on prognostic and health management technology,” in Lecture Notes in Electrical Engineering. Proceedings of the First Symposium on Aviation Maintenance and Management-Volume II, vol. 297. Springer, 2014, pp. 665–672.
-  S. Alestra, C. Bordry, C. Brand, E. Burnaev, P. Erofeev, A. Papanov, and C. Silveira-Freixo, “Application of rare event anticipation techniques to aircraft health management,” Advanced Materials Research, vol. 1016, pp. 413–417, 2014.
-  ——, “Rare event anticipation and degradation trending for aircraft predictive maintenance,” in Proceedings of the joint WCCM – ECCM – ECFD 2014 Congress, 20-25 July, Barcelona, Spain, 2014, pp. 1–12.
-  L. Tegtmeier, “Math and maintenance,” Aviation Week and Space Technology, vol. 174, no. 39, 2012.
-  B. Saha, A. Mandal, S. Tripathy, and D. Mukherjee, “Complex networks, communities and clustering: A survey,” CoRR, vol. abs/1503.06277, 2015.
-  S. Ivanov and E. Burnaev, “Anonymous walk embeddings,” in Proc. of the 35th ICML, vol. 80. PMLR, 2018, pp. 2186–2195.
-  E. V. Burnaev and G. K. Golubev, “On one problem in multichannel signal detection,” Problems of Information Transmission, vol. 53, no. 4, pp. 368–380, 2017.
-  A. Artemov, E. Burnaev, and A. Lokot, “Nonparametric decomposition of quasi-periodic time series for change-point detection,” in Proc. SPIE, vol. 9875, 2015, pp. 9875–9875–5.
-  A. Artemov and E. Burnaev, “Optimal estimation of a signal perturbed by a fractional brownian noise,” Theory of Probability & Its Applications, vol. 60, no. 1, pp. 126–134, 2016.
-  ——, “Detecting performance degradation of software-intensive systems in the presence of trends and long-range dependence,” in 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW), 2016, pp. 29–36.
-  D. Volkhonskiy, E. Burnaev, I. Nouretdinov, A. Gammerman, and V. Vovk, “Inductive conformal martingales for change-point detection,” in Proceedings of the Sixth Workshop on Conformal and Probabilistic Prediction and Applications, vol. 60. PMLR, 2017, pp. 132–153.
-  A. Safin and E. Burnaev, “Conformal kernel expected similarity for anomaly detection in time-series data,” Advances in Systems Science and Applications, vol. 17, no. 3, pp. 22–33, 2017.
-  V. Ishimtsev, A. Bernstein, E. Burnaev, and I. Nazarov, “Conformal k-nn anomaly detector for univariate data streams,” in Proceedings of the Sixth Workshop on Conformal and Probabilistic Prediction and Applications, vol. 60. PMLR, 2017, pp. 213–227.
-  A. Artemov and E. Burnaev, “Ensembles of detectors for online detection of transient changes,” in Proc. SPIE, vol. 9875, 2015, pp. 9875 – 9875 – 5.
-  D. Smolyakov, N. Sviridenko, V. Ishimtsev, E. Burikov, and E. Burnaev, “Learning Ensembles of Anomaly Detectors on Synthetic Data,” arXiv e-prints, vol. abs/1905.07892, 2019.
-  A. Korotin, V. V’yugin, and E. Burnaev, “Aggregating strategies for long-term forecasting,” in Proceedings of the Seventh Workshop on Conformal and Probabilistic Prediction and Applications, vol. 91. PMLR, 2018, pp. 63–82.
-  A. Korotin, V. V’yugin, and E. Burnaev, “Adaptive Hedging under Delayed Feedback,” arXiv e-prints, vol. abs/1902.10433, 2019.
-  ——, “Long-Term Online Smoothing Prediction Using Expert Advice,” arXiv e-prints, vol. abs/1711.03194, 2017.
-  E. Burnaev, I. Koptelov, G. Novikov, and T. Khanipov, “Automatic construction of a recurrent neural network based classifier for vehicle passage detection,” in Proc. SPIE, vol. 10341, 2017, pp. 10 341–10 341–6.
-  E. Burnaev and D. Smolyakov, “One-class svm with privileged information and its application to malware detection,” in 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW), 2016, pp. 273–280.
-  D. Smolyakov, N. Sviridenko, E. Burikov, and E. Burnaev, “Anomaly pattern recognition with privileged information for sensor fault detection,” in Artificial Neural Networks in Pattern Recognition. Springer, 2018, pp. 320–332.
-  E. Burnaev, P. Erofeev, and D. Smolyakov, “Model selection for anomaly detection,” in Proc. SPIE, vol. 9875, 2015, pp. 9875 – 9875 – 6.
-  S. Ivanov, N. Durasov, and E. Burnaev, “Learning node embeddings for influence set completion,” in Proc. of IEEE International Conference on Data Mining Workshops (ICDMW), 2018, pp. 1034–1037.
-  R. Rivera, P. Pilyugina, A. Pletnev, I. Maksimov, W. Wyz, and E. Burnaev, “Topological data analysis of time series data for b2b customer relationship management,” in Proc. of Industrial Marketing and Purchasing Group Conference (IMP19), ser. The IMP Journal, 2019.
-  R. Rivera-Castro, I. Nazarov, Y. Xiang, A. Pletneev, I. Maksimov, and E. Burnaev, “Demand forecasting techniques for build-to-order lean manufacturing supply chains,” arXiv e-prints, vol. abs/1905.07902, 2019.
-  R. Rivera, I. Nazarov, and E. Burnaev, “Towards forecast techniques for business analysts of large commercial data sets using matrix factorization methods,” Journal of Physics: Conference Series, vol. 1117, no. 1, p. 012010, 2018.
-  D. Smolyakov, A. Korotin, P. Erofeev, A. Papanov, and E. Burnaev, “Meta-learning for resampling recommendation systems,” in Proc. SPIE 11041, Eleventh International Conference on Machine Vision (ICMV 2018), 110411S (15 March 2019), 2019.
-  E. Burnaev, P. Erofeev, and A. Papanov, “Influence of resampling on accuracy of imbalanced classification,” in Proc. SPIE, vol. 9875, 2015, pp. 9875–9875–5.
-  A. Kuleshov, A. Bernstein, E. Burnaev, and Y. Yanovich, “Machine learning in appearance-based robot self-localization,” in 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), 2017, pp. 106–112.
-  A. Kuleshov, A. Bernstein, and E. Burnaev, “Conformal prediction in manifold learning,” in Proceedings of the Seventh Workshop on Conformal and Probabilistic Prediction and Applications, vol. 91. PMLR, 2018, pp. 234–253.
-  ——, “Manifold learning regression with non-stationary kernels,” in Artificial Neural Networks in Pattern Recognition. Springer, 2018, pp. 152–164.
-  A. Kuleshov, A. Bernstein, and E. Burnaev, “Kernel regression on manifold valued data,” in Proceedings of IEEE 5th International Conference on Data Science and Advanced Analytics, 2018, pp. 120–129.
-  M. Belyaev, E. Burnaev, E. Kapushev, M. Panov, P. Prikhodko, D. Vetrov, and D. Yarotsky, “Gtapprox: Surrogate modeling for industrial design,” Advances in Engineering Software, vol. 102, pp. 29–39, 2016.
-  E. V. Burnaev and P. V. Prikhod’ko, “On a method for constructing ensembles of regression models,” Automation and Remote Control, vol. 74, no. 10, pp. 1630–1644, 2013.
-  M. G. Belyaev and E. V. Burnaev, “Approximation of a multidimensional dependency based on a linear expansion in a dictionary of parametric functions,” Informatics and its Applications, vol. 7, no. 3, pp. 114–125, 2013.
-  E. Burnaev and A. Zaytsev, “Surrogate modeling of multifidelity data for large samples,” Journal of Communications Technology and Electronics, vol. 60, no. 12, pp. 1348–1355, 2015.
-  A. Zaytsev and E. Burnaev, “Large scale variable fidelity surrogate modeling,” Annals of Mathematics and Artificial Intelligence, vol. 81, no. 1, pp. 167–186, 2017.
-  E. Burnaev and I. Nazarov, “Conformalized kernel ridge regression,” in 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), 2016, pp. 45–52.
-  E. Burnaev and V. Vovk, “Efficiency of conformalized ridge regression,” in Proceedings of The 27th Conference on Learning Theory, vol. 35. PMLR, 2014, pp. 605–622.
-  E. Burnaev and S. Chernova, “On an iterative algorithm for calculating weighted principal components,” Journal of Communications Technology and Electronics, vol. 60, no. 6, pp. 619–624, 2015.