An Online Algorithm for Nonparametric Correlations
Nonparametric correlations such as Spearman's rank correlation and Kendall's tau correlation are widely applied in scientific and engineering fields. This paper investigates the problem of computing nonparametric correlations on the fly for streaming data. Standard batch algorithms are generally too slow to handle real-world big data applications. They also require too much memory because all the data need to be stored in the memory before processing. This paper proposes a novel online algorithm for computing nonparametric correlations. The algorithm has O(1) time complexity and O(1) memory cost and is quite suitable for edge devices, where only limited memory and processing power are available. You can seek a balance between speed and accuracy by changing the number of cutpoints specified in the algorithm. The online algorithm can compute the nonparametric correlations 10 to 1,000 times faster than the corresponding batch algorithm, and it can compute them based either on all past observations or on fixed-size sliding windows.
READ FULL TEXT