Fuzzy Jaccard Index: A robust comparison of ordered lists

08/05/2020
by Matej Petković, et al.

We propose the Fuzzy Jaccard Index (FUJI), a scale-invariant score for assessing the similarity between two ranked (ordered) lists. FUJI improves upon the Jaccard index by incorporating a membership function that takes the particular ranks into account, thus producing similarity estimates that are both more stable and more accurate. We provide theoretical insights into the properties of the FUJI score and propose an efficient algorithm for computing it. We also present empirical evidence of its performance in different synthetic scenarios. Finally, we demonstrate its utility in a typical machine learning setting: comparing feature rankings relevant to a given machine learning task. In real-life domains, and in high-dimensional ones in particular, where only a small percentage of the feature space might be relevant, a robust and confident feature ranking leads to interpretable findings as well as efficient computation and good predictive performance. In such cases, FUJI correctly distinguishes between existing feature ranking approaches, while being more robust and efficient than the benchmark similarity scores.
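
For orientation, the baseline that FUJI builds on can be sketched as follows. This is only the plain (crisp) Jaccard index applied to the top-k prefixes of two rankings, not the FUJI score itself, whose membership function is defined in the full text. The function name `jaccard_top_k` and the toy feature names are illustrative assumptions, not taken from the paper.

```python
def jaccard_top_k(ranking_a, ranking_b, k):
    """Plain Jaccard index of the top-k prefixes of two ranked lists.

    Illustrative baseline only: ranks inside the prefix are ignored,
    which is the limitation a rank-aware membership function targets.
    """
    top_a, top_b = set(ranking_a[:k]), set(ranking_b[:k])
    union = top_a | top_b
    if not union:
        # Two empty prefixes are treated as identical (a convention chosen here).
        return 1.0
    return len(top_a & top_b) / len(union)


# Two hypothetical feature rankings that differ only by adjacent swaps.
a = ["f1", "f2", "f3", "f4", "f5"]
b = ["f2", "f1", "f3", "f5", "f4"]

# Similarity as a function of the prefix length k.
curve = [jaccard_top_k(a, b, k) for k in range(1, len(a) + 1)]
print(curve)
```

In this toy case the two rankings are nearly identical, yet the crisp index jumps between 0 and 1 depending on where the prefix is cut. That sensitivity to k is the kind of instability the abstract refers to.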


