Interpreting Distortions in Dimensionality Reduction by Superimposing Neighbourhood Graphs
To perform visual data exploration, many dimensionality reduction methods have been developed. These tools allow data analysts to represent multidimensional data in a 2D or 3D space, while preserving as much relevant information as possible. Yet, they cannot preserve all structures simultaneously and they induce some unavoidable distortions. Hence, many criteria have been introduced to evaluate a map's overall quality, mostly based on the preservation of neighbourhoods. Such global indicators are currently used to compare several maps, which helps to choose the most appropriate mapping method and its hyperparameters. However, those aggregated indicators tend to hide the local repartition of distortions. Thereby, they need to be supplemented by local evaluation to ensure correct interpretation of maps. In this paper, we describe a new method, called MING, for `Map Interpretation using Neighbourhood Graphs'. It offers a graphical interpretation of pairs of map quality indicators, as well as local evaluation of the distortions. This is done by displaying on the map the nearest neighbours graphs computed in the data space and in the embedding. Shared and unshared edges exhibit reliable and unreliable neighbourhood information conveyed by the mapping. By this mean, analysts may determine whether proximity (or remoteness) of points on the map faithfully represents similarity (or dissimilarity) of original data, within the meaning of a chosen map quality criteria. We apply this approach to two pairs of widespread indicators: precision/recall and trustworthiness/continuity, chosen for their wide use in the community, which will allow an easy handling by users.
READ FULL TEXT 
  
  
     share
 share