Using an expert deviation carrying the knowledge of climate data in usual clustering algorithms

06/10/2020
by Emmanuel Biabiany, et al.

To help physicists expand their knowledge of the climate in the Lesser Antilles, we aim to identify spatio-temporal configurations by applying clustering analysis to wind speed and cumulative rainfall datasets. We show that using the L2 norm in conventional clustering methods such as K-Means (KMS) and Hierarchical Agglomerative Clustering (HAC) can induce undesirable effects. We therefore propose replacing the Euclidean distance (L2) with a dissimilarity measure named Expert Deviation (ED). Based on the symmetrized Kullback-Leibler divergence, the ED integrates the properties of the observed physical parameters and climate knowledge. This measure compares the histograms of four patches, corresponding to geographical zones, that are influenced by atmospheric structures. We evaluated both the internal homogeneity and the separation of the clusters obtained using ED and L2. The results, which are compared using the silhouette index, show five clusters with high index values. For the two available datasets, KMS-ED, unlike KMS-L2, discriminates the daily situations favorably, giving more physical meaning to the clusters discovered by the algorithm. The effect of the patches is observed in the spatial analysis of the representative elements for KMS-ED. The ED produces distinct configurations, which makes the usual atmospheric structures clearly identifiable. Atmospheric physicists can interpret where each cluster impacts a specific zone according to atmospheric structures. KMS-L2 does not offer such interpretability, because the situations it represents are spatially quite smooth. This climatological study illustrates the advantage of using ED as a new approach.
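The abstract does not give the exact definition of the ED, so the following is only a minimal sketch of its stated building block: a symmetrized Kullback-Leibler divergence between histograms, evaluated per patch. The function names, the smoothing constant `eps`, and the choice to sum the four per-patch values with equal weight are all assumptions made for illustration, not the authors' formulation.

```python
import math

def normalize(counts, eps=1e-12):
    """Turn raw bin counts into probabilities; eps (assumed) avoids log(0)."""
    smoothed = [c + eps for c in counts]
    total = sum(smoothed)
    return [s / total for s in smoothed]

def symmetrized_kl(p_counts, q_counts):
    """KL(p||q) + KL(q||p), which simplifies to sum((p - q) * log(p / q))."""
    p = normalize(p_counts)
    q = normalize(q_counts)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

def patch_dissimilarity(day_a, day_b):
    """Sum the symmetrized KL over patches (equal weights are an assumption).

    Each day is a list of histograms, one per geographical patch.
    """
    if len(day_a) != len(day_b):
        raise ValueError("both days need the same number of patches")
    return sum(symmetrized_kl(ha, hb) for ha, hb in zip(day_a, day_b))

# Hypothetical data: two days, four patches, three histogram bins each.
day_1 = [[5, 3, 2], [1, 8, 1], [4, 4, 2], [0, 6, 4]]
day_2 = [[2, 3, 5], [1, 8, 1], [6, 2, 2], [3, 3, 4]]
d_12 = patch_dissimilarity(day_1, day_2)
d_21 = patch_dissimilarity(day_2, day_1)
d_11 = patch_dissimilarity(day_1, day_1)
```

A matrix of such pairwise values could then stand in for Euclidean distances in HAC; using it in K-Means would additionally require a rule for choosing cluster representatives, which the abstract does not specify.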

research
12/20/2018

Cluster validity index based on Jeffrey divergence

Cluster validity indexes are very important tools designed for two purpo...
research
02/05/2020

Comparing clusterings and numbers of clusters by aggregation of calibrated clustering validity indexes

A key issue in cluster analysis is the choice of an appropriate clusteri...
research
05/11/2021

An internal validity index based on density-involved distance

It is crucial to evaluate the quality of clustering results in cluster a...
research
04/18/2019

Modelling antimicrobial prescriptions in Scotland: A spatio-temporal clustering approach

In 2016 the British government acknowledged the importance of reducing a...
research
07/02/2023

Spatiotemporal Cluster Analysis of Gridded Temperature Data – A Comparison Between K-means and MiSTIC

The Earth is a system of numerous interconnected spheres, such as the cl...
research
12/14/2020

Clustering high dimensional meteorological scenarios: results and performance index

The Reseau de Transport d'Electricité (RTE) is the French main electrici...
