EGFC: Evolving Gaussian Fuzzy Classifier from Never-Ending Semi-Supervised Data Streams – With Application to Power Quality Disturbance Detection and Classification

Power-quality disturbances lead to several drawbacks, such as limitation of production capacity, increased line and equipment currents and consequent ohmic losses, higher operating temperatures, premature faults, reduced life expectancy of machines, equipment malfunction, and unplanned outages. Real-time detection and classification of disturbances are deemed essential to industry standards. We propose an Evolving Gaussian Fuzzy Classification (EGFC) framework for semi-supervised disturbance detection and classification combined with a hybrid Hodrick-Prescott and Discrete-Fourier-Transform attribute-extraction method applied over a landmark window of voltage waveforms. Disturbances such as spikes, notching, harmonics, and oscillatory transients are considered. Unlike other monitoring systems, which require offline training of models based on a limited amount of data and occurrences, the proposed online data-stream-based EGFC method is able to learn disturbance patterns autonomously from never-ending data streams by adapting the parameters and structure of a fuzzy rule base on the fly. Moreover, the fuzzy model obtained is linguistically interpretable, which improves model acceptability. We show encouraging classification results.




I Introduction

Power system disturbance detection and classification are broad and difficult research issues. Detection and classification are generally handled by using machine learning and computational intelligence methods [1, 2, 3]. The number of correlated attributes potentially carrying important information about disturbances is high. As real-time monitoring of all attributes is infeasible, a method to extract the most prominent ones to assist detection is fundamental. A further issue concerns the frequent occurrence of new situations. The emergence of new patterns in power system data affects the performance of classification systems. Novelties may arise since the involved signals are mutually related and time-varying. Moreover, the superposition of different events in different intensities may generate never-before-seen data, which tends to confuse conventional offline-trained classification systems [4, 5].

Online disturbance-detection methods deal with the occurrence of new patterns in data streams. New behaviors should be captured by adaptive or evolving models, namely, models supplied with incremental learning algorithms [6, 7, 8, 9]. Generally, the amount of data in power applications is large. Therefore, storing the data for further offline development of equation-based or intelligent models, and for statistical analyses, is quite often time-consuming, impractical, or even impossible. Adaptive and evolving modeling from online data streams are distinct concepts. Adaptive models (parametrically adaptive models from control theory) are suitable to cope with smooth, gradual changes of system parameters and statistical properties of data (concept drift). However, when an adaptive model is changed to learn a new behavior, the knowledge about some previous behaviors tends to be partially lost. This phenomenon is known as catastrophic forgetting [10]. Abrupt changes of the values of a variable or parameter (concept shift) require both parametric and structural adaptation of models. Such a higher level of model flexibility, which includes the incremental update of the model structure, outlines a broad research area known as evolving intelligence [11, 12, 13, 14].

Evolving fuzzy rule-based models are appropriate for online detection and classification in nonstationary data stream environments, such as those found in power systems [5, 6, 8]. Incremental fuzzy clustering algorithms have been used for constructing rule-based evolving classifiers. These algorithms are capable of determining the classifier structure and parameters from scratch based on online data. Development and use of rule-based evolving systems have grown in the last decade. These systems have been applied successfully to complex real-world problems, including control, prediction, classification, identification, and function approximation [4, 11, 12, 13, 14, 15]. A further advantage of evolving fuzzy systems is that they may provide linguistically-appealing granular information [13, 14, 16], that is, these systems may explain their results or actions. Online structural adaptation of a fuzzy model to handle nonstationarities is pursued by adding, merging, and removing rules from a knowledge base [4].

This paper addresses a new fuzzy modeling framework we call the Evolving Gaussian Fuzzy Classification (EGFC) framework. EGFC is a semi-supervised variation of the evolving granular rule-based approach [13, 17] for the construction of nonlinear and time-varying classifiers, with unsupervised and supervised learning as the boundary cases. We aim to detect and classify anomalies, broadly speaking, and power quality disturbances as a particular application example. An EGFC model employs Gaussian membership functions to associate numerical input data with classes. Its incremental learning algorithm provides a dynamic classifier with simple math and linguistic rules describing its decisions. The set of EGFC rules represents the essence of a data stream. From one point of view, the EGFC approach consists in looking at stream data, and deciding between coarser or more detailed granules to achieve a better classification accuracy and provide decision-making support. Spikes, notching, harmonics, and oscillatory-transient types of disturbances in power systems are taken into consideration in the present study.

A hybrid method to extract highly discriminative attributes to be used as inputs of the EGFC model is also addressed. The method combines a Hodrick-Prescott (HP) filter [18] and Discrete Fourier Transform (DFT) [19] applied over a landmark window of voltage data to provide attributes that help disturbance classification. In particular, the method is different from those addressed in related power-system literature as HP analysis provides a smooth nonlinear representation that is sensitive to long and short-term changes. In other words, a series of smooth nonlinear trends are obtained, whereas the information removed from the original data is maintained separately and can be accessed for analysis. In comparison to S-transform-based attribute extraction methods [2, 20], often used in related literature, the proposed hybrid HP-DFT method has proven to be faster and effective – important characteristics for high-frequency data-stream processing.

The rest of this paper is organized as follows. Section II describes the HP and DFT methods for attribute extraction. Section III presents the data-stream-based semi-supervised learning algorithm and the Evolving Gaussian Fuzzy Classifier, EGFC, proposed for a broad class of classification and online anomaly-detection problems from numerical data. Section IV describes the methodology for generating the data and developing the classifier. Results are given in Section V. The conclusion is outlined in Section VI.

II Attribute Extraction

We describe an approach to extract attributes that indicate the occurrence of disturbances in voltage data. Fundamentally, an HP filter and the DFT are applied to raw voltage data within a time window. A more discriminative set of attributes facilitates model interpretation, reduces data overfitting, and may produce better results due to the elimination of attributes and noise that may mislead online systems [16, 13].

II-A Hodrick-Prescott Filter

An HP filter decomposes a signal, $y_t$, into its trend, $\tau_t$, and cyclical and random components, $c_t$, such that $y_t = \tau_t + c_t$. It is equivalent to a cubic spline smoother, with the smoothed portion in $\tau_t$ [18]. In essence, low-frequency fluctuations are separated from the original data. The separation hypothesis is that the low-frequency variability represents the long-term trend, whereas the high-frequency variability represents random phenomena. The HP filter is used for the first time for power system disturbance classification in this paper.

The HP filter extracts the trend, which is stochastic, but with smooth variations over time that are uncorrelated with other variations. The idea is to minimize the functional

$$ J = \sum_{t=1}^{T} (y_t - \tau_t)^2 + \lambda \sum_{t=2}^{T-1} \left[ (\tau_{t+1} - \tau_t) - (\tau_t - \tau_{t-1}) \right]^2 \tag{1} $$

with respect to $\tau_t$, in which $y_t$, $t = 1, \dots, T$, denotes the underlying signal; $L$ is the lag operator, e.g., $L\tau_t = \tau_{t-1}$, so that the second difference in (1) is $(1 - L)^2 \tau_{t+1}$; $T$ is the number of data samples; and $\lambda$ penalizes the variability of the trend component. Parameter $\lambda$ is the smoothing parameter; it controls the variation of the growth rate of the trend. The first term of (1) is the sum of squared deviations of the signal from the trend, a measure of the degree of fit. The second term is the sum of squares of the second differences of the trend component, a measure of the degree of smoothness. See [18] for details. The fourth section of this paper provides practical examples of the HP decomposition specifically for disturbance detection.
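The minimization above has a closed-form solution, which the following minimal sketch illustrates. It assumes the standard HP formulation (squared fit term plus $\lambda$ times squared second differences) and uses a dense linear solve, which is adequate for short windows; the function name `hp_filter` is hypothetical and not taken from the paper.

```python
import numpy as np

def hp_filter(y, lam):
    """Return (trend, cycle) such that y = trend + cycle (standard HP filter)."""
    y = np.asarray(y, dtype=float)
    T = y.size
    # Second-difference operator D, shape (T-2, T):
    # (D @ tau)[t] = tau[t] - 2*tau[t+1] + tau[t+2]
    D = np.zeros((T - 2, T))
    rows = np.arange(T - 2)
    D[rows, rows] = 1.0
    D[rows, rows + 1] = -2.0
    D[rows, rows + 2] = 1.0
    # The minimizer of ||y - tau||^2 + lam * ||D tau||^2
    # solves the linear system (I + lam * D^T D) tau = y.
    trend = np.linalg.solve(np.eye(T) + lam * D.T @ D, y)
    return trend, y - trend

# Usage: a straight line has zero second differences, so it is all trend.
line = 2.0 * np.arange(50) + 1.0
trend, cycle = hp_filter(line, lam=100.0)
```

A larger `lam` yields a smoother trend and pushes more of the signal into the cyclical component.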

II-B Discrete Fourier Transform

The Fourier Transform is one of the most used frequency-domain signal processing tools. The idea is that any periodic signal can be described by a sum of sines and cosines [19]. When measured data are available, a frequency spectrum is generated by discrete Fourier transformation from

$$ X_k = \sum_{n=0}^{N-1} x(n\Delta t)\, e^{-i 2\pi k n / N}, \quad k = 0, \dots, N-1, \tag{2} $$

in which $x(n\Delta t)$ is a discrete signal; $\Delta t$ corresponds to time intervals; and $N$ is the number of samples, with $N\Delta t$ being the total time interval. Additionally, $f_k = k/(N\Delta t)$, $k = 0, \dots, N-1$, are the frequency components.

The discrete Fourier transform is an invertible linear transformation. The inverse is given by

$$ x(n\Delta t) = \frac{1}{N} \sum_{k=0}^{N-1} X_k\, e^{i 2\pi k n / N}. \tag{3} $$

See [19] for details on discrete Fourier transforms.
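As an illustration of how a spectrum is used in practice, the sketch below estimates the amplitude of the 60 Hz component of a sampled voltage window with NumPy's real FFT. The helper name `fundamental_amplitude` is hypothetical, and the scaling `2|X_k|/N` assumes the window holds an integer number of fundamental cycles.

```python
import numpy as np

def fundamental_amplitude(v, fs, f0=60.0):
    """Amplitude of the spectral component closest to f0 (Hz)."""
    v = np.asarray(v, dtype=float)
    N = v.size
    spectrum = np.fft.rfft(v)                 # one-sided DFT of a real signal
    freqs = np.fft.rfftfreq(N, d=1.0 / fs)    # frequency of each bin, in Hz
    k = int(np.argmin(np.abs(freqs - f0)))    # bin closest to the fundamental
    # For a real sinusoid on an exact bin, |X_k| = A * N / 2.
    return 2.0 * np.abs(spectrum[k]) / N

# Usage: 4 cycles of a 0.8 pu, 60 Hz sinusoid sampled at 15,360 Hz.
fs = 15360.0
t = np.arange(1024) / fs
amp = fundamental_amplitude(0.8 * np.sin(2 * np.pi * 60.0 * t), fs)
```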

II-C Root Mean Square Voltage

The effective (RMS) value is a measure of the magnitude of a variable quantity. RMS values can be calculated for a sequence of discrete values. The effective voltage of an alternating current circuit may provide evidence of some types of disturbances, e.g., sag, swell, and interruptions.

The RMS value of a sinusoidal waveform x for a set of $N$ samples, $x_1, \dots, x_N$, is

$$ x_{rms} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} x_n^2}. \tag{4} $$

This is the equivalent direct (DC) value that delivers the same power as the original waveform.
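A short numerical check of the RMS definition, as a sketch: for a sinusoid of amplitude $A$ sampled over whole cycles, the result should equal $A/\sqrt{2}$. The helper name `rms` is illustrative only.

```python
import numpy as np

def rms(x):
    """Root mean square of a sequence of samples."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean(x ** 2)))

# Usage: one 60 Hz cycle with 256 samples and unit amplitude.
n = np.arange(256)
value = rms(np.sin(2 * np.pi * n / 256))
```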

III Evolving Gaussian Fuzzy Classifier from Never-Ending Semi-Supervised Data Streams

III-A Preliminaries

We present EGFC, a semi-supervised evolving classifier derived from the online granular-computing framework of Leite et al. [13, 17]. EGFC employs Gaussian membership functions to cover the data space with fuzzy granules (local models), and associate new numerical data to class labels. Granules are scattered in the data space wherever needed to represent local information. EGFC overall response comes from the fuzzy aggregation of local models. A recursive algorithm constructs its rule base, and updates local models to deal with novelties. EGFC addresses issues such as unlimited amounts of data and scalability [13].

Local EGFC models are created if the new data are sufficiently different from the current knowledge. The learning algorithm can expand, retract, delete, and merge granules on occasion. Rules are reviewed according to inter-granular relations. EGFC provides nonlinear, nonstationary, and smooth boundaries among classes. This paper particularly addresses a 5-class disturbance classification problem.

Formally, let an input-output pair $(\mathbf{x}, y)$ be related through $y = f(\mathbf{x})$. We seek an approximation to $f$ to estimate the value of $y$ given $\mathbf{x}$. In classification, $y$ is a class label, a value in a set $\{C_1, \dots, C_m\}$, and $f$ specifies class boundaries. In the more general, semi-supervised case, $y$ may or may not be known when $\mathbf{x}$ arrives. Classification of never-ending data streams involves pairs $(\mathbf{x}, y)^{[h]}$ of time-sequenced data, indexed by $h$. Nonstationarity requires evolving classifiers to identify time-varying relations $f^{[h]}$.

III-B Gaussian Functions and Rule Structure

Learning in EGFC does not require initial rules. Rules are created and dynamically updated depending on the behavior of a system over time. When a data sample is available, a decision procedure may add a rule to the model structure or update the parameters of a chosen rule.

In EGFC models, a rule is

$R^i$: IF ($x_1$ is $A_1^i$) AND … AND ($x_n$ is $A_n^i$) THEN ($\hat{y}$ is $C^i$)

in which $x_j$, $j = 1, \dots, n$, are input attributes, and $\hat{y}$ is the output (a class). The data stream is denoted $(\mathbf{x}, y)^{[h]}$, $h = 1, \dots$ Moreover, $A_j^i$, $j = 1, \dots, n$; $i = 1, \dots, c$, are Gaussian membership functions built from the available data; and $C^i$ is the class label of the $i$-th rule. Rules $R^i$, $i = 1, \dots, c$, form the rule base. The number of rules, $c$, is variable, which is a notable characteristic of the approach since no guess on how many data partitions exist is needed [4, 13].

A normal Gaussian function, $A_j^i = G(\mu_j^i, \sigma_j^i)$, has height 1 [16]. It is characterized by the modal value $\mu_j^i$ and dispersion $\sigma_j^i$. Characteristics that make Gaussians appropriate include: (i) ease of learning and changing, i.e., modal values and dispersions are updated straightforwardly from a data stream; (ii) infinite support, i.e., since the data are not known in advance, the support of Gaussians extends to the whole domain; and (iii) smooth surface of fuzzy granules, $\gamma^i = A_1^i \times \dots \times A_n^i$, in the $n$-dimensional Cartesian space, obtained by the cylindrical extension of one-dimensional Gaussians and the use of the minimum T-norm aggregation [16].
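The activation of a rule can be sketched as follows. The sketch assumes the common height-1 Gaussian form $\exp(-(x-\mu)^2 / (2\sigma^2))$ for each membership function, which is an assumption about the exact parameterization, and aggregates the per-attribute degrees with the minimum T-norm mentioned above. Function names are hypothetical.

```python
import numpy as np

def gaussian_membership(x, mu, sigma):
    """Height-1 Gaussian membership degree (assumed form)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def rule_activation(x, mu, sigma):
    """Minimum T-norm over the n one-dimensional membership degrees."""
    x, mu, sigma = (np.asarray(a, dtype=float) for a in (x, mu, sigma))
    return float(np.min(gaussian_membership(x, mu, sigma)))

# Usage: a sample at the modal values fully activates the rule;
# moving one attribute by one dispersion lowers the activation to exp(-0.5).
full = rule_activation([0.5, 0.5], [0.5, 0.5], [0.1, 0.2])
partial = rule_activation([0.6, 0.5], [0.5, 0.5], [0.1, 0.2])
```

With the minimum T-norm, the least compatible attribute alone determines how strongly a sample activates a granule.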

III-C Adding Rules to the Evolving Fuzzy Classifier

Rules may not exist a priori. They are created and evolved as data become available. A new granule, $\gamma^{c+1}$, and the rule $R^{c+1}$ that governs the granule are created if none of the existing rules are sufficiently activated by $\mathbf{x}^{[h]}$; i.e., $\mathbf{x}^{[h]}$ brings new information. Let $\rho^{[h]} \in [0, 1]$ be an adaptive threshold that determines if a new rule is needed. If

$$ T\left(A_1^i(x_1^{[h]}), \dots, A_n^i(x_n^{[h]})\right) \le \rho^{[h]}, \quad \forall i,\ i = 1, \dots, c, \tag{5} $$

in which $T$ is any triangular norm, then the EGFC structure is expanded. The minimum (Gödel) T-norm is used in this paper. If $\rho^{[h]}$ is equal to 0, then the model is structurally stable, and unable to follow concept shifts. In contrast, if $\rho^{[h]}$ is equal to 1, EGFC creates a rule for each new sample, which is not practical. Structural and parametric adaptability are balanced for intermediate values (stability-plasticity tradeoff) [21].

The value of $\rho^{[h]}$ is crucial to regulate how large granules can be. Different choices impact the accuracy and compactness of a model, resulting in different granular perspectives of the same problem. Section III-E gives a Gaussian-dispersion-based procedure to update $\rho^{[h]}$.

A new granule $\gamma^{c+1}$ is initially represented by membership functions, $A_j^{c+1}$, $j = 1, \dots, n$, with

$$ \mu_j^{c+1} = x_j^{[h]}, \tag{6} $$

$$ \sigma_j^{c+1} = \frac{1}{2\pi}. \tag{7} $$

We call (7) the Stigler approach to standard Gaussian functions, or maximum approach [22, 23]. The intuition is to start big, and let the dispersions gradually shrink when new samples activate the same granule. This strategy is appealing for a compact model structure.

In general, the class $C^{c+1}$ of the rule $R^{c+1}$ is initially undefined, i.e., the $(c+1)$-th rule remains unlabeled until a label is provided. If the corresponding output, $y^{[h]}$, associated to $\mathbf{x}^{[h]}$, becomes available, then

$$ C^{c+1} = y^{[h]}. \tag{8} $$

Otherwise, the first labeled sample that arrives after the $h$-th time step, and activates the rule $R^{c+1}$ according to (5), is used to define its class, $C^{c+1}$.

In case a labeled sample activates a rule that is already labeled, but their labels are different, then a new (partially overlapped) granule and a rule are created to represent new information. Partially overlapped Gaussian granules tagged with different labels tend to have their dispersions reduced over time by the parameter adaptation procedure (Sec. III-D). The modal values of the Gaussian granules may also drift, if convenient for a more suitable decision boundary.

With this initial parameterization, preference is given to granules balanced along their dimensions, rather than granules with unbalanced geometry. EGFC realizes the principle of the balanced granularity [24], but allows the Gaussians to find more appropriate places and dispersions.
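The rule-creation test described in this subsection can be sketched as below. Two points are assumptions rather than facts taken from the text: the Gaussian form $\exp(-(x-\mu)^2/(2\sigma^2))$, and the initial dispersion `STIGLER_SIGMA = 1/(2*pi)` used for every attribute of a new granule. The names `needs_new_rule` and `create_granule` are hypothetical.

```python
import numpy as np

STIGLER_SIGMA = 1.0 / (2.0 * np.pi)  # assumed initial ("start big") dispersion

def needs_new_rule(x, mus, sigmas, rho):
    """True if no existing rule is activated above the threshold rho.

    mus and sigmas have shape (c, n): one row per rule.
    """
    if len(mus) == 0:
        return True
    degrees = np.exp(-0.5 * ((np.asarray(x) - mus) / sigmas) ** 2)
    activations = degrees.min(axis=1)      # minimum T-norm per rule
    return bool(np.all(activations <= rho))

def create_granule(x, label=None):
    """New granule centered at x; label stays None until one is provided."""
    x = np.asarray(x, dtype=float)
    return {"mu": x.copy(), "sigma": np.full(x.shape, STIGLER_SIGMA), "label": label}

# Usage with one existing rule centered at (0.1, 0.1) and rho = 0.1.
mus = np.array([[0.1, 0.1]])
sigmas = np.full((1, 2), STIGLER_SIGMA)
far = needs_new_rule([0.9, 0.9], mus, sigmas, rho=0.1)    # novel region
near = needs_new_rule([0.12, 0.1], mus, sigmas, rho=0.1)  # already covered
```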

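III-D Incremental Parameter Updating

Updating the EGFC model consists in: (i) reducing or expanding Gaussians $A_j^{i^*}$, $j = 1, \dots, n$, of the most active granule, $\gamma^{i^*}$, considering labeled and unlabeled samples; (ii) moving granules toward regions of relatively dense population; and (iii) tagging rules when labeled data are available. Adaptation aims to develop more specific local models [25], and provide pavement (covering) to new data.

A rule $R^i$ is a candidate to be updated if it is sufficiently activated by an unlabeled sample, $\mathbf{x}^{[h]}$, according to

$$ T\left(A_1^i(x_1^{[h]}), \dots, A_n^i(x_n^{[h]})\right) > \rho^{[h]}. \tag{9} $$

Geometrically, $\mathbf{x}^{[h]}$ belongs to a region highly influenced by the granule $\gamma^i$. Only the most active rule, $R^{i^*}$, is chosen for adaptation in case two or more rules reach the $\rho^{[h]}$ level for the unlabeled $\mathbf{x}^{[h]}$. For a labeled sample, i.e., for pairs $(\mathbf{x}, y)^{[h]}$, the class of the most active rule $R^{i^*}$, if defined, must match $y^{[h]}$. Otherwise, the second most active rule among those that reached the $\rho^{[h]}$ level is chosen for adaptation, and so on. If none of the rules are apt, a new one is created (Sec. III-C).

To include $\mathbf{x}^{[h]}$ in $R^{i^*}$, EGFC’s learning algorithm updates the modal values and dispersions of the corresponding membership functions $A_j^{i^*}$, $j = 1, \dots, n$, from

$$ \mu_j^{i^*}(\text{new}) = \frac{(\varpi^{i^*} - 1)\, \mu_j^{i^*}(\text{old}) + x_j^{[h]}}{\varpi^{i^*}}, \tag{10} $$

$$ \sigma_j^{i^*}(\text{new}) = \left( \frac{\varpi^{i^*} - 1}{\varpi^{i^*}} \left(\sigma_j^{i^*}(\text{old})\right)^2 + \frac{1}{\varpi^{i^*}} \left(x_j^{[h]} - \mu_j^{i^*}(\text{old})\right)^2 \right)^{1/2}, \tag{11} $$

in which $\varpi^{i^*}$ is the number of times the $i^*$-th rule was chosen to be updated. Notice that (10)-(11) are recursive and, therefore, do not require data storage. As $\sigma_j^{i^*}$ defines a convex region of influence around $\mu_j^{i^*}$, very large and very small values may induce, respectively, a unique or too many granules per class. An approach is to keep $\sigma_j^{i^*}$ between a lower, $1/(4\pi)$, and the Stigler, $1/(2\pi)$, limits.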
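A recursive update of this kind can be sketched as follows. The mean update is the usual running average; the dispersion update shown is one common recursive form and is an assumption about the exact expression, as is the convention that the counter is incremented before the update. The optional `sigma_bounds` argument is a hypothetical way to keep dispersions between a lower and an upper limit.

```python
import numpy as np

def update_granule(mu, sigma, count, x, sigma_bounds=None):
    """One recursive update of modal values and dispersions; no data are stored."""
    mu, sigma, x = (np.asarray(a, dtype=float) for a in (mu, sigma, x))
    count += 1                                   # times the rule has been chosen
    new_mu = ((count - 1) * mu + x) / count      # running mean
    new_sigma = np.sqrt((count - 1) / count * sigma ** 2
                        + (x - mu) ** 2 / count) # assumed recursive dispersion
    if sigma_bounds is not None:
        new_sigma = np.clip(new_sigma, *sigma_bounds)
    return new_mu, new_sigma, count

# Usage: after processing all samples, mu equals their batch mean.
data = np.array([[0.2, 0.4], [0.4, 0.8], [0.6, 0.0], [0.0, 0.4]])
mu, sigma, count = data[0], np.full(2, 0.1), 1
for sample in data[1:]:
    mu, sigma, count = update_granule(mu, sigma, count, sample)
```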

III-E Dispersion-Based Time-Varying $\rho$-Level

Let the activation threshold, $\rho^{[h]}$, be time-varying. The threshold assumes values in the unit interval according to the overall average dispersion

$$ \sigma_{avg}^{[h]} = \frac{1}{cn} \sum_{i=1}^{c} \sum_{j=1}^{n} \sigma_j^{i[h]}, \tag{12} $$

in which $c$ and $n$ are the number of rules and attributes, so that

$$ \rho(\text{new}) = \frac{\sigma_{avg}^{[h]}}{\sigma_{avg}^{[h-1]}}\, \rho(\text{old}). \tag{13} $$

As mentioned, rules’ activation levels for an input $\mathbf{x}^{[h]}$ are compared to $\rho^{[h]}$ to decide between parametric or structural changes of an EGFC model. In general, EGFC starts learning from an empty rule base, and without knowledge about the properties of the data. Practice suggests $\rho^{[0]} = 0.1$ as starting value. The threshold tends to converge to a proper value if the classifier structure achieves a level of maturity and stability. Nonstationarities and new classes guide $\rho^{[h]}$ to values that better reflect the needs of the current environment.

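III-F Merging Similar Granules

Similarity between two granules with the same class label may be high enough to form a unique granule that inherits the essence of both. Analysis of inter-granular relations requires a distance measure between Gaussians. Let

$$ d(\gamma^{i_1}, \gamma^{i_2}) = \frac{1}{n} \sum_{j=1}^{n} \left( \left|\mu_j^{i_1} - \mu_j^{i_2}\right| + \sigma_j^{i_1} + \sigma_j^{i_2} - 2\sqrt{\sigma_j^{i_1} \sigma_j^{i_2}} \right) \tag{14} $$

be the distance between $\gamma^{i_1}$ and $\gamma^{i_2}$. This measure considers the information specificity, which is, in turn, inversely related to the Gaussians’ dispersion [23]. For example, if the dispersions $\sigma_j^{i_1}$ and $\sigma_j^{i_2}$ differ from one another, rather than being equal, the distance between the underlying Gaussians is larger.

EGFC may merge the pair of granules that presents the smallest value of $d(\cdot)$ for all pairs of granules. Both granules must be either unlabeled or tagged with the same class label. The merging decision is based on a threshold value, $\Delta$, or expert judgment regarding the suitability of combining such granules to have a more compact model. For data within the unit hypercube, we suggest $\Delta = 0.1$ as default, which means that the candidate granules should be quite similar.

A new granule, say $\gamma^{i}$, which results from $\gamma^{i_1}$ and $\gamma^{i_2}$, is built by Gaussians with modal values

$$ \mu_j^{i} = \frac{\dfrac{\sigma_j^{i_1}}{\sigma_j^{i_2}}\, \mu_j^{i_1} + \dfrac{\sigma_j^{i_2}}{\sigma_j^{i_1}}\, \mu_j^{i_2}}{\dfrac{\sigma_j^{i_1}}{\sigma_j^{i_2}} + \dfrac{\sigma_j^{i_2}}{\sigma_j^{i_1}}}, \quad j = 1, \dots, n, \tag{15} $$

and dispersion

$$ \sigma_j^{i} = \sigma_j^{i_1} + \sigma_j^{i_2}, \quad j = 1, \dots, n. \tag{16} $$

These relations take into account the granular uncertainty to find an appropriate location and size for the resulting granule. Merging minimizes redundancy [4, 13].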

III-G Deleting Rules

A rule is removed from the EGFC model if it is inconsistent with the current environment. In other words, if a rule is not activated for a number of iterations, say $h_r$, then it is deleted from the rule base. However, if a class is rare, e.g., a type of power quality disturbance is unusual, then it may be appropriate to set $h_r = \infty$ and keep the inactive rules. Removing rules periodically helps to keep the model updated.
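A minimal sketch of inactivity-based deletion follows, assuming each rule stores the time step at which it was last activated. The names `prune_inactive`, `last_active`, and `h_r` are illustrative; passing `float("inf")` as the inactivity limit keeps every rule, which suits rare classes.

```python
def prune_inactive(last_active, h, h_r):
    """Return the indices of rules to keep at time step h.

    last_active[i] is the last time step at which rule i was activated;
    a rule is dropped once it has been idle for h_r steps or more.
    """
    return [i for i, last in enumerate(last_active) if h - last < h_r]

# Usage: at step 500 with a limit of 200 steps, only recently used rules stay.
kept = prune_inactive([490, 120, 301], h=500, h_r=200)
kept_all = prune_inactive([490, 120, 301], h=500, h_r=float("inf"))
```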

III-H Semi-Supervised Learning from Data Streams

The semi-supervised learning procedure to construct and update EGFC models along their lifespan is given below.

  EGFC: Online Semi-Supervised Learning  

Initial number of rules, $c = 0$;
Initial meta-parameters, $\rho^{[0]} = \Delta = 0.1$, $h_r = 200$;
Read input data sample $\mathbf{x}^{[1]}$;
Create granule $\gamma^{c+1}$ (Eqs. (6)-(7)), unknown class $C^{c+1}$;
FOR $h$ = 2, … DO
   Read $\mathbf{x}^{[h]}$, calculate rules’ activation degree (Eq. (5));
   Determine the most active rule $R^{i^*}$;
   Provide estimated class $C^{i^*}$;
   // Model adaptation
   IF $T(A_1^i(x_1^{[h]}), \dots, A_n^i(x_n^{[h]})) \le \rho^{[h]}$ $\forall i$, $i = 1, \dots, c$
      IF actual label $y^{[h]}$ is available
         Create labeled granule $\gamma^{c+1}$ (Eqs. (6)-(8));
      ELSE
         Create unlabeled granule $\gamma^{c+1}$ (Eqs. (6)-(7));
      END
   ELSE
      IF actual label $y^{[h]}$ is available
         Update the most active granule $\gamma^{i^*}$ whose class $C^{i^*}$ is equal to $y^{[h]}$ (Eqs. (10)-(11));
         Tag unlabeled active granules;
      ELSE
         Update the most active $\gamma^{i^*}$ (Eqs. (10)-(11));
      END
   END
   Update the $\rho$-level (Eqs. (12)-(13));
   Delete inactive rules based on $h_r$;
   Merge granules based on $\Delta$ (Eqs. (14)-(16));
END
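The procedure above can be approximated by the simplified sketch below. It is not the authors' implementation: it keeps the threshold `rho` fixed, omits merging, and assumes the Gaussian form, the initial dispersion `sigma0`, the recursive dispersion update, and a hypothetical floor `sigma_min`. It only illustrates the predict-then-adapt loop with optional labels.

```python
import numpy as np

class EGFCSketch:
    """Simplified evolving Gaussian fuzzy classifier (fixed rho, no merging)."""

    def __init__(self, rho=0.1, sigma0=1.0 / (2.0 * np.pi), sigma_min=1e-3, h_r=200):
        self.rho, self.sigma0, self.sigma_min, self.h_r = rho, sigma0, sigma_min, h_r
        self.rules = []   # each rule: dict with mu, sigma, label, count, last
        self.h = 0        # time index of the stream

    def _activations(self, x):
        # Minimum T-norm over per-attribute Gaussian membership degrees
        return np.array([np.min(np.exp(-0.5 * ((x - r["mu"]) / r["sigma"]) ** 2))
                         for r in self.rules])

    def predict(self, x):
        """Class of the most active rule (None if no rule or rule unlabeled)."""
        if not self.rules:
            return None
        x = np.asarray(x, dtype=float)
        return self.rules[int(np.argmax(self._activations(x)))]["label"]

    def learn(self, x, y=None):
        """Estimate the class of x, then adapt the model; y=None means unlabeled."""
        x = np.asarray(x, dtype=float)
        self.h += 1
        y_hat = self.predict(x)
        acts = self._activations(x) if self.rules else np.array([])
        chosen = None
        for i in np.argsort(-acts):          # most active rule first
            if acts[i] <= self.rho:
                break                        # remaining rules are not active enough
            r = self.rules[i]
            if y is None or r["label"] is None or r["label"] == y:
                chosen = r
                break
        if chosen is None:                   # structural adaptation: new granule
            self.rules.append({"mu": x.copy(), "sigma": np.full(x.shape, self.sigma0),
                               "label": y, "count": 1, "last": self.h})
        else:                                # parametric adaptation
            chosen["count"] += 1
            w, old_mu = chosen["count"], chosen["mu"]
            sigma = np.sqrt((w - 1) / w * chosen["sigma"] ** 2 + (x - old_mu) ** 2 / w)
            chosen["sigma"] = np.maximum(sigma, self.sigma_min)
            chosen["mu"] = ((w - 1) * old_mu + x) / w
            chosen["last"] = self.h
            if chosen["label"] is None:      # tag an unlabeled granule
                chosen["label"] = y
        # Delete rules that have been inactive for h_r steps or more
        self.rules = [r for r in self.rules if self.h - r["last"] < self.h_r]
        return y_hat

# Usage: a partially labeled stream with two well-separated classes.
model = EGFCSketch()
stream = [((0.1, 0.1), 0), ((0.9, 0.9), 1), ((0.12, 0.08), 0),
          ((0.88, 0.92), None), ((0.09, 0.11), None), ((0.91, 0.89), 1)]
for x, y in stream:
    model.learn(x, y)
```

Unlabeled samples still refine the location and size of the granule they activate, which is how the sketch uses all the information in the stream.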


IV Methodology

We describe the methodology to generate power system disturbances. We give examples of disturbances, and a flowchart that connects the DFT-HP attribute extraction and EGFC.

IV-A Online Monitoring System

Voltage data from a 13.8 kV grid are produced according to the IEEE standard [26]. The fundamental and sampling frequencies are 60 Hz and 15,360 Hz. Thus, 256 samples per cycle are given. This sampling rate is sufficient to characterize most of the disturbances in power systems, including spikes, notching, harmonics, and oscillatory transient. Gaussian white noise is added to give different signal-to-noise ratios (SNR),

$$ SNR_{dB} = 20 \log_{10} \left( \frac{A}{\sigma} \right), \tag{17} $$

in which $A$ is the amplitude of the original voltage signal; $\sigma$ is the standard deviation of the Gaussian noise; and dB means decibel, one tenth of a bel.
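Adding noise at a target SNR can be sketched as below. The sketch assumes the amplitude-ratio definition $SNR_{dB} = 20\log_{10}(A/\sigma)$, with $A$ the signal amplitude and $\sigma$ the noise standard deviation; if a power-ratio definition is intended instead, the conversion changes accordingly. Function names are hypothetical.

```python
import numpy as np

def noise_std_for_snr(amplitude, snr_db):
    """Noise standard deviation giving the requested SNR (assumed definition)."""
    return amplitude / (10.0 ** (snr_db / 20.0))

def add_noise(v, amplitude, snr_db, rng=None):
    """Return v plus zero-mean Gaussian white noise at the requested SNR."""
    rng = np.random.default_rng() if rng is None else rng
    sigma = noise_std_for_snr(amplitude, snr_db)
    return np.asarray(v, dtype=float) + rng.normal(0.0, sigma, size=np.shape(v))

# Usage: one 60 Hz cycle (256 samples) of a 1 pu waveform at 20 dB.
n = np.arange(256)
noisy = add_noise(np.sin(2 * np.pi * n / 256), amplitude=1.0, snr_db=20.0,
                  rng=np.random.default_rng(0))
```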

Voltage data from power systems usually have an SNR from 40 to 70 dB. We evaluate the evolving classifier subject to 20, 40, and 60 dB, with 20 dB being the harshest stochastic scenario. Ten thousand voltage waveforms are generated:

Class 1: 2,000 waveforms without disturbances;

Class 2: 2,000 waveforms with spikes;

Class 3: 2,000 waveforms containing notching;

Class 4: 2,000 waveforms with harmonics;

Class 5: 2,000 waveforms with oscillatory transient.

Time windows based on constant time intervals between landmarks are considered. Windows of different lengths (1, 4 and 10 cycles of the fundamental) are assessed in different experiments. The greatest peaks and valleys of the fundamental voltage in a window are rescaled in the range . A phase angle within [, ] is randomly assigned to the starting point of a waveform. Waveforms are subject to noise (17).

Voltage data within a window are fed to HP-DFT attribute extraction. Then, a vector of input data is formed and provided to the EGFC model. EGFC estimates a class, and then uses the input vector, accompanied or not by a label, to update its parameters and structure. This procedure is repeated for each window. A general flowchart of the power quality monitoring and classification system is shown in Fig. 1.


Fig. 1: Evolving Disturbance Detection and Classification System

Four disturbance indicators compose an input vector, $\mathbf{x} = (x_1, x_2, x_3, x_4)$, of the EGFC model. They are

$x_1$: Amplitude of the fundamental (60 Hz) voltage component, obtained by DFT over the data in a time window;

$x_2$: Minimum value of the voltage cyclical component after HP decomposition over the data in a time window;

$x_3$: Maximum value of the voltage cyclical component after HP decomposition over the data in a time window;

$x_4$: Effective value of the voltage cyclical component after HP decomposition over the data in a time window.

The EGFC estimated output, $\hat{y}$, is a class $C \in \{1, \dots, 5\}$. Constructing and updating an EGFC model is a fully online process based on partially-labeled data.
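The four indicators can be computed from one window as in the sketch below. It assumes the standard HP formulation solved with a dense linear system, a window holding an integer number of fundamental cycles, and a smoothing parameter `lam=1000.0` chosen only for illustration (it is not the value used in the experiments). Function names are hypothetical.

```python
import numpy as np

def hp_cycle(v, lam):
    """Cyclical component of a standard HP decomposition (dense solve)."""
    T = v.size
    D = np.zeros((T - 2, T))                 # second-difference operator
    r = np.arange(T - 2)
    D[r, r], D[r, r + 1], D[r, r + 2] = 1.0, -2.0, 1.0
    trend = np.linalg.solve(np.eye(T) + lam * D.T @ D, v)
    return v - trend

def extract_attributes(v, fs=15360.0, f0=60.0, lam=1000.0):
    """Return (x1, x2, x3, x4) for one window of voltage samples."""
    v = np.asarray(v, dtype=float)
    N = v.size
    k = int(round(f0 * N / fs))              # DFT bin of the fundamental
    x1 = 2.0 * np.abs(np.fft.rfft(v)[k]) / N # amplitude of the fundamental
    c = hp_cycle(v, lam)
    x2, x3 = c.min(), c.max()                # extremes of the cyclical component
    x4 = np.sqrt(np.mean(c ** 2))            # its effective (RMS) value
    return np.array([x1, x2, x3, x4])

# Usage: a clean 4-cycle window (1,024 samples) of a 1 pu waveform.
t = np.arange(1024) / 15360.0
x = extract_attributes(np.sin(2 * np.pi * 60.0 * t))
```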

IV-B Generating Disturbances

Disturbances, viz., spike, notching, harmonics, and oscillatory transient (common power-system phenomena), are added to the fundamental voltage subject to a given SNR.

Spikes or surges are fast, short-duration voltage transients caused by lightning strikes, power outages, short circuits, and power transitions in large equipment, to mention some [27]. A spike usually lasts from 1 to 30 µs, and may reach over 1,000 V. For example, a motor can generate a spike of 1,000 V when switched off. Spikes can degrade wiring insulation and destroy electronic devices. Some common-mode voltage spikes may not be detected by surge protection equipment.

To generate spike we choose a random starting point during the first voltage cycle. The spike peaks after 10 samples, and extinguishes after 20 samples. Its maximum amplitude is a random number in pu or pu. The spike repeats in subsequent cycles for window lengths larger than one cycle, see example in Fig. 2.

Notching is a periodic disturbance, a switching lasting less than 0.5 cycles. It is caused by the normal operation of electronic devices and three-phase converters. Notches occur when the current commutates from one phase to another. The severity of a notch is given by the source and isolating inductances of a converter, the magnitude of the current, and the point being monitored. The frequency components associated with notching can be quite high [28]. To generate notching we choose a random starting sample in [10, 40]. The disturbance extinguishes after 9 samples; it repeats 8 times per cycle, 23 samples after the previous occurrence. The maximum amplitude is pu or pu, see Fig. 2.

Harmonics are sinusoidal voltages with frequencies that are integer multiples of the fundamental. Distortion arises as current sources that inject harmonic currents into the power system cause nonlinear voltage drops across the system impedance. Harmonic currents result from the normal operation of nonlinear electronic devices and loads on the system. Harmonic distortion is a growing concern for customers and for the overall power system due to an increasing number of power electronics equipment [26]. To generate harmonics, random values in pu, pu, pu, pu, pu, and pu are chosen, respectively, for the second to the seventh harmonic. The start point of each harmonic is independent one another and may have any phase angle in , see Fig. 2.

Fig. 2: Examples of 4-cycle voltage waveforms of each class, with . From top to bottom: no disturbance (class 1); spikes (class 2); notching (class 3); harmonics (class 4); oscillatory transient (class 5)

Oscillatory transient is a sudden change of the voltage steady-state condition that includes rapid changes of positive and negative polarity values. Transients are almost always due to some type of switching event. Power electronic devices can produce oscillatory transients as a result of commutation and RLC snubber circuits [26]. We choose a random sample in a time window as start point of a transient. The transient is an exponentially damped sinusoid whose start amplitude is in pu in relation to the fundamental component. Its frequency is a random value in Hz. The damping coefficient is a random value in , see Fig. 2.
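An exponentially damped sinusoid of this kind can be generated as in the following sketch. All parameter values in the usage lines (start sample, amplitude, frequency, damping) are hypothetical placeholders chosen for illustration, not the ranges used in the experiments.

```python
import numpy as np

def oscillatory_transient(n_samples, fs, start, amp, freq, damping):
    """Damped sinusoid that begins at sample index `start` and is zero before it."""
    out = np.zeros(n_samples)
    tt = np.arange(n_samples - start) / fs   # time elapsed since the transient began
    out[start:] = amp * np.exp(-damping * tt) * np.sin(2 * np.pi * freq * tt)
    return out

# Usage: superimpose a transient on a 4-cycle, 1 pu, 60 Hz fundamental.
fs, n = 15360.0, 1024
fundamental = np.sin(2 * np.pi * 60.0 * np.arange(n) / fs)
transient = oscillatory_transient(n, fs, start=300, amp=0.5, freq=1000.0, damping=400.0)
waveform = fundamental + transient
```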

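IV-C Classification Accuracy

Classification accuracy is computed recursively from

$$ Acc(\text{new}) = \frac{h - 1}{h}\, Acc(\text{old}) + \frac{1}{h}\, \tau, \tag{18} $$

in which $\tau \in \{0, 1\}$; $\tau = 1$ if $\hat{y}^{[h]} = y^{[h]}$ (right estimate). Otherwise, $\tau = 0$ (wrong class estimate).

The average number of granules or rules over time, $c_{avg}$, is a measure of model concision. Recursively,

$$ c_{avg}(\text{new}) = \frac{h - 1}{h}\, c_{avg}(\text{old}) + \frac{1}{h}\, c^{[h]}. \tag{19} $$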
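Both performance indices are running averages, so a single helper suffices. The sketch below assumes a hit indicator equal to 1 for a right estimate and 0 otherwise; the name `update_mean` is illustrative.

```python
def update_mean(old, value, h):
    """Running average after the h-th observation (h starts at 1)."""
    return (h - 1) / h * old + value / h

# Usage: accuracy over four predictions, three of them right.
acc = 0.0
for h, hit in enumerate([1, 0, 1, 1], start=1):
    acc = update_mean(acc, hit, h)

# The same helper tracks the average number of rules over time.
c_avg = 0.0
for h, c in enumerate([1, 2, 2, 3], start=1):
    c_avg = update_mean(c_avg, c, h)
```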
V Results and Discussions

We evaluate the EGFC approach. No prior knowledge about the data and power system is assumed. Classification models are developed from scratch, based on data streams.

V-A Preliminary Results on Feature Extraction

We use DFT and HP filtering to extract four disturbance indicators from the voltage waveform. The DFT provides $x_1$, the amplitude of the fundamental component. The HP filter decomposes the original waveform into trend and cyclical components. We focus on the cyclical component to obtain $x_2$, $x_3$, and $x_4$, i.e., the minimum, maximum, and effective values of the decomposed waveform. We achieved encouraging results using an HP smoothing coefficient, $\lambda$, that isolates a great portion of the noise from the fundamental. Figure 3 shows typical values of attributes, $\mathbf{x}$, extracted from waveforms of each disturbance class. The examples consider 4-cycle time windows and an SNR of 30 dB.

Notice in Fig. 3 that the value of changes slightly, up and down, respectively, in the spike and notching scenarios, which may help the evolving classifier to distinguish these classes. The HP cyclical component for the case without disturbance shows an initial transient such that the absolute values of and are significantly different one another, which helps the classifier to recognize this class. This phenomenon also happens for the spike scenario, with greater unbalance between the values of and . Attribute tends to zero rapidly for the notching case. The absolute values of and in the harmonic scenario are similar. The same happens in the oscillatory transient case, but with higher individual amplitudes. Therefore, x carries important subtleties.

Fig. 3: Typical examples of attributes extracted from 5 different voltage waveforms using DFT (left column) and HP filter (right column) considering 4-cycle time windows and an SNR of 30 dB. From top to bottom: row 1 - no disturbance (class 1); row 2 - spikes (class 2); row 3 - notching (class 3); row 4 - harmonics (class 4); row 5 - oscillatory transient (class 5)

V-B EGFC Results for Labeled Data Streams

We develop an evolving classifier from a labeled data stream. The default meta-parameters are used (Sec. III-H). Table I shows the results averaged over 5 runs for 9 datasets extracted from voltage waveforms based on window lengths of 1, 4, and 10 cycles, and an SNR of 20, 40, and 60 dB. Each dataset consists of 10,000 4-attribute samples related to a target class $C \in \{1, \dots, 5\}$. The classes denote ‘no disturbance’, ‘spikes’, ‘notching’, ‘harmonics’, and ‘oscillatory transient’.

Table I shows that the SNR has no significant effect on the classifier performance using the chosen set of attributes $\mathbf{x}$. The accuracy can even be slightly higher in noisier conditions, e.g., 20 dB. This is an interesting feature of the proposed monitoring system. In contrast, a very small window length can degrade system performance significantly. The 4-cycle scenario seems more attractive than the 10-cycle one, as the system is able to analyze a larger number of windows at the price of a small reduction of the classification accuracy. As at least two 4-cycle windows are processed during a 10-cycle period, if the system provides a wrong classification for the data extracted from the first window, it can still detect the disturbance class from the data of the other window. Therefore, in practice, the 92.79%-accuracy 4-cycle-based EGFC system can be more efficient than the 94.24%-accuracy 10-cycle-based one. The number of rules in the model structure over the learning steps, and the CPU time on a quad-core i7-8550U at 1.80 GHz with 8 GB of RAM, are similar in all scenarios.

Cycles (%) # Rules Time (s)
60dB 4
40dB 4
20dB 4
TABLE I: EGFC Performance in Multiclass Classification of Power System Disturbances (99% Confidence)

Figure 4 shows a typical example of the evolution of the granularity level, accuracy, and number of EGFC rules. The final granules, obtained at the end of the data stream, are also illustrated. Class-2 data (spike disturbance) spread over a larger area and require four granules to be represented, whereas each of the remaining classes requires a single granule. Figure 5 shows the confusion matrix obtained. Confusion between Class 1 (no disturbance) and Class 4 (harmonics), followed by confusion between Class 4 and Class 5 (oscillatory transient), is mainly responsible for the 7.7% overall error. Additional attributes should be considered in the future to resolve these particular ambiguities.
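The kind of reading made from the confusion matrix can be sketched as follows: the overall error is the off-diagonal mass, and the most confused class pairs are found by summing the two symmetric off-diagonal cells. The 3-class matrix is a made-up toy example, not the matrix of Figure 5, and both function names are hypothetical.

```python
def overall_error(cm):
    """Fraction of samples lying off the main diagonal of a confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return 1.0 - correct / total


def confused_pairs(cm):
    """Class pairs (1-based) sorted by the number of mutual misclassifications."""
    n = len(cm)
    pairs = []
    for i in range(n):
        for j in range(i + 1, n):
            pairs.append((cm[i][j] + cm[j][i], i + 1, j + 1))
    return sorted(pairs, reverse=True)


# Toy confusion matrix (rows: true class, columns: estimated class); made-up counts.
toy_cm = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]
print(f"overall error: {overall_error(toy_cm):.3f}")
for count, a, b in confused_pairs(toy_cm):
    print(f"classes {a} and {b}: {count} mutual misclassifications")
```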

Fig. 4: Evolution of the granularity level, EGFC accuracy, and number of rules. The bottom plots show the final shape of the 4-dimensional Gaussian granules
Fig. 5: Typical EGFC confusion matrix based on a 4-cycle time window, a fixed SNR (in dB), and a labeled data stream

V-C EGFC Result in Semi-Supervised Online Scenario

We changed the proportion of unlabeled data from 0% to 100% considering the harshest 20 dB problem. Figure 6 shows average EGFC results over 5 runs for each case. EGFC benefits from all the information in the data stream, including that from unlabeled samples. Conventional and evolving classifiers that operate on a supervised basis by simply discarding unlabeled data cannot deal with small fractions of labeled data with reasonable accuracy (as shown by the right-side points of the graph). The left and right extremes of the plot correspond to full supervision and no supervision, respectively. In all cases, the final result is a partition of the data into classes. EGFC is not significantly affected by the fraction of unlabeled data. For data extracted from 4-cycle windows, the performance of pure classification and pure clustering corresponds to the left and right extremes of Figure 6, respectively.
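One simple way to emulate a given proportion of unlabeled data is to hide the label of each incoming sample with a fixed probability. The sketch below assumes labels are hidden uniformly at random and independently per sample, which is an assumption of this example rather than a description of the protocol used here; `mask_labels` and the toy stream are hypothetical.

```python
import random


def mask_labels(stream, unlabeled_fraction, rng):
    """Yield (x, y) pairs, replacing y with None with probability unlabeled_fraction."""
    for x, y in stream:
        if rng.random() < unlabeled_fraction:
            yield x, None  # sample reaches the learner without a label
        else:
            yield x, y     # sample reaches the learner with its label


# Toy stream of 4-attribute samples with class labels 1..5 (placeholder values).
toy_stream = [([0.0, 0.0, 0.0, 0.0], 1 + (k % 5)) for k in range(10000)]

rng = random.Random(0)
for fraction in (0.0, 0.5, 1.0):
    masked = list(mask_labels(toy_stream, fraction, rng))
    hidden = sum(1 for _, y in masked if y is None)
    print(f"requested {fraction:.0%} unlabeled, obtained {hidden / len(masked):.1%}")
```

A fraction of 0 corresponds to pure classification and a fraction of 1 to pure clustering, matching the two extremes of Figure 6.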

Fig. 6: EGFC performance using different proportions of unlabeled data

VI Conclusion

We propose a hybrid attribute-extraction method combined with an evolving Gaussian fuzzy model for power quality disturbance detection and classification. A Hodrick-Prescott filter and the Discrete Fourier Transform applied over a window of voltage data provide attractive attributes for disturbance discrimination. Common types of disturbances, namely spikes, notching, harmonics, and oscillatory transients, were analyzed. Data generation agrees with the IEEE standard for power quality disturbances. Landmark windows containing one, four, and ten voltage cycles, as well as signal-to-noise ratios ranging from 20 to 60 dB, were evaluated. The evolving modeling approach, EGFC, has proven efficient for multi-class online classification. Its fuzzy rule-based structure, Gaussian membership functions, and granularity are updated over time, driven by the data stream. Online model adaptation has proven essential to deal with time-varying systems, such as power systems subject to disturbance patterns.

Harmonics, followed by oscillatory transients, proved more difficult to distinguish than spikes and notching. The signal-to-noise ratio does not affect EGFC performance significantly. The use of 4 voltage cycles for attribute extraction provided an average EGFC accuracy of 92.8% in the harshest 20 dB noise case, and was therefore considered the most suitable choice. Changing the proportion of unlabeled data from 0% to 100% did not affect EGFC performance significantly (Figure 6). Therefore, EGFC is applicable to both classification and clustering, and is robust to the lack of labels. New types of disturbances can be studied in the future. The EGFC semi-supervised learning framework shall also be analyzed considering synthetic examples of concept change and anomaly detection problems.


  • [1] P. D. Achlerkar, S. R. Samantaray, M. S. Manikandan. “Variational Mode Decomposition and Decision Tree Based Detection and Classification of Power Quality Disturbances in Grid-Connected Distributed Generation System.” IEEE T Smart Grid, 9(4), p. 3122-3132, 2018.
  • [2] J. Li, Z. Teng, Q. Tang, J. Song. “Detection and Classification of Power Quality Disturbances Using Double Resolution S-Transform and DAG-SVMs.” IEEE T Instrum Meas, 65(10), p. 2302-2312, 2016.
  • [3] A. S. Sobrinho, R. A. Flauzino, L. H. Liboni, E. C. Costa. “Proposal of a Fuzzy-based PMU for Detection and Classification of Disturbances in Power Distribution Networks.” Int J Elec Power, 94, p. 27-40, 2018.
  • [4] I. Skrjanc, J. A. Iglesias, A. Sanchis, D. Leite, E. Lughofer, F. Gomide. “Evolving Fuzzy and Neuro-Fuzzy Approaches in Clustering, Regression, Identification, and Classification: A Survey.” Information Sciences, 490, p. 344-368, 2019.
  • [5] D. Leite. “Comparison of Genetic and Incremental Learning Methods for Neural Network-Based Electrical Machine Fault Detection.” In: Lughofer E., Sayed-Mouchaweh M. (Eds), Predictive Maintenance in Dynamic Systems, p. 231-268, Springer: Cham, 2019.
  • [6] G. Andonovski, S. Blazic, I. Skrjanc. “Evolving Fuzzy Model for Fault Detection and Fault Identification of Dynamic Processes.” In: Predictive Maintenance in Dynamic Systems, p. 269-285, Springer: Cham, 2019.
  • [7] Z. Hu, Y. Bodyanskiy, O. Tyshchenko, O. Boiko. “A Neuro-Fuzzy Kohonen Network for Data Stream Possibilistic Clustering and its Online Self-Learning Procedure.” Appl Soft Comput, 68, p. 710-718, 2018.
  • [8] S. Silva, P. Costa, M. Santana, D. Leite. “Evolving Neuro-Fuzzy Network for Real-Time High Impedance Fault Detection and Classification.” Neural Comput & Applic, 14p. DOI: 10.1007/s00521-018-3789-2, 2018.
  • [9] P. Souza, et al. “Evolving Fuzzy Neural Networks to Aid in the Construction of Systems Specialists in Cyber Attacks.” J Intell Fuzzy Syst, 36(6), p. 6743-6763, 2019.
  • [10] J. Kirkpatrick, et al. “Overcoming Catastrophic Forgetting in Neural Networks.” arXiv:1612.00796 [cs.LG], 13p. 2017.
  • [11] P. Angelov, D. Filev, N. Kasabov. Evolving Intelligent Systems: Methodology and Applications. Wiley: New York, 2010.
  • [12] E. Lughofer, M. Sayed-Mouchaweh (Eds). Learning in Nonstationary Environments: Methods and Applications. Springer: New York, 2012.
  • [13] D. Leite. Evolving Granular Systems. PhD Thesis, State University of Campinas (UNICAMP), 2012.
  • [14] L. Cordovil, P. Coutinho, I. Bessa, M. D’Angelo, R. Palhares. “Uncertain Data Modeling Based on Evolving Ellipsoidal Fuzzy Information Granules.” IEEE T Fuzzy Syst, DOI:10.1109/TFUZZ.2019.2937052, 2019.
  • [15] P. C. L. Silva, H. J. Sadaei, R. Ballini, F. G. Guimarães. “Probabilistic Forecasting with Fuzzy Time Series.” IEEE T Fuzzy Syst, 14p. DOI: 10.1109/TFUZZ.2019.2922152, 2019.
  • [16] W. Pedrycz, F. Gomide. Fuzzy Systems Engineering: Toward Human-Centric Computing. Wiley: Hoboken - New Jersey, 2007.
  • [17] D. Leite, R. Ballini, P. Costa, F. Gomide. “Evolving Fuzzy Granular Modeling from Nonstationary Fuzzy Data Streams.” Evolving Systems, 3(2), p. 65-79, 2012.
  • [18] R. Hodrick, E. Prescott. “Postwar U.S. Business Cycles: An Empirical Investigation.” J of Money, Credit, and Banking, 29(1), p. 1-16, 1997.
  • [19] B. P. Lathi. Linear Systems and Signals. Oxford U. Press, 2nd ed. 2004.
  • [20] M. V. Reddy, R. Sodhi. “A Rule-based S-Transform and AdaBoost based Approach for Power Quality Assessment.” Electr Pow Syst Res, 134, p. 66-79, 2016.
  • [21] D. Leite, P. Costa, F. Gomide. “Evolving Granular Neural Networks from Fuzzy Data Streams.” Neural Netw, 38, p. 1-16, 2013.
  • [22] S. M. Stigler. “A Modest Proposal: A New Standard for the Normal.” The American Statistician, 36(2), 1982.
  • [23] D. Leite, G. Andonovski, I. Skrjanc, F. Gomide. “Optimal Rule-based Granular Systems from Data Streams.” IEEE T Fuzzy Syst, 28(3), p. 583-596, 2020.
  • [24] X. Wang, W. Pedrycz, A. Gacek, X. Liu. “From Numeric Data to Information Granules: A Design Through Clustering and the Principle of Justifiable Granularity.” Knowl-Based Syst, 101, p. 100-113, 2016.
  • [25] R. Yager. “Measures of Specificity over Continuous Spaces under Similarity Relations.” Fuzzy Set Syst, 159, p. 2193-2210, 2008.
  • [26] IEEE Recommended Practice for Monitoring Electric Power Quality. IEEE Power & Energy Society. IEEE Std 1159-2009, 2009.
  • [27] D. O. Johnson, K. A. Hassan. “Issues of Power Quality in Electrical Systems.” Int J Energy Power Eng, 5(4), p. 148-154, 2016.
  • [28] IEEE Recommended Practice and Requirements for Harmonic Control in Electric Power Systems. IEEE TDC, IEEE Std 519-2014, 2014.