Rank Energy Statistics in the Context of Change Point Detection
In this paper, I propose a general procedure for multivariate, distribution-free, nonparametric testing in the context of multiple change point analysis, derived from the concept of ranks based on measure transportation. I use this algorithm to estimate both the number of change points and their locations within an observed multivariate time series. The change point problem is considered in a general setting in which both the underlying distribution and the number of change points are unknown, rather than assuming, as many works in this area do, that the observed time series follows a specific distribution or contains only one change point. The intention is to develop a technique for accurately identifying changes in a distribution while making as few assumptions as possible. The rank energy statistic used here is based on energy statistics and has the potential to detect any change in a distribution. I present the properties of this new algorithm, which can be applied to a variety of problems, including hierarchical clustering, testing multivariate normality, gene selection, and microarray data analysis. The algorithm has also been implemented in the R package recp, which is available on CRAN.
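To make the idea of a rank-based energy statistic concrete, the following is a minimal Python sketch under strong simplifying assumptions. It is not the algorithm of the paper and not the interface of the recp package. It assumes a univariate series, where transport-based ranks reduce to ordinary ranks, and it scans for a single change point only. All function names (`ranks_1d`, `energy_stat`, `best_split`) and the `min_size` parameter are hypothetical and chosen for illustration.

```python
def ranks_1d(xs):
    """Ordinary ranks scaled to (0, 1]; ties are broken by position.
    Assumption: in one dimension this stands in for transport-based ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for pos, i in enumerate(order):
        r[i] = (pos + 1) / len(xs)
    return r


def mean_abs_diff(a, b):
    """Average of |x - y| over all pairs (x in a, y in b)."""
    return sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))


def energy_stat(a, b):
    """Sample energy distance 2E|A-B| - E|A-A'| - E|B-B'|,
    scaled by nm / (n + m) so that different split sizes are comparable."""
    n, m = len(a), len(b)
    e = 2 * mean_abs_diff(a, b) - mean_abs_diff(a, a) - mean_abs_diff(b, b)
    return n * m / (n + m) * e


def best_split(xs, min_size=5):
    """Scan every admissible split point tau and return the one that
    maximizes the energy statistic computed on the ranks."""
    r = ranks_1d(xs)
    best_tau, best_val = None, float("-inf")
    for tau in range(min_size, len(xs) - min_size + 1):
        val = energy_stat(r[:tau], r[tau:])
        if val > best_val:
            best_tau, best_val = tau, val
    return best_tau, best_val


# Toy series: 30 values in [0, 1) followed by 30 values in [10, 11).
series = [(i * 7 % 30) / 30 for i in range(30)] + \
         [10 + (i * 11 % 30) / 30 for i in range(30)]
tau, stat = best_split(series)
print("estimated change point index:", tau)
```

Because the statistic is computed on ranks rather than on the raw values, the scan is unaffected by any strictly increasing transformation of the data. A multivariate version would replace `ranks_1d` with an assignment of the observations to a fixed reference grid, and estimating several change points would require repeating the scan within the resulting segments together with a significance test; neither step is shown here.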