Mapper Comparison with Wasserstein Metrics
Describing model drift remains an open problem in unsupervised learning. It can be difficult to determine when an unsupervised model has deviated beyond what would be expected from a different sample drawn from the same population, particularly for models without a probabilistic interpretation. One such family of techniques, Topological Data Analysis, and the Mapper algorithm in particular, has found use in a variety of fields. Describing model drift for Mapper graphs is nonetheless understudied: existing techniques for measuring distances between related constructs, such as graphs or simplicial complexes, fail to account for the fact that Mapper graphs combine topological, metric, and density information. In this paper, we develop an optimal-transport-based metric, which we call the Network Augmented Wasserstein Distance, for evaluating distances between Mapper graphs. We demonstrate its value for model drift analysis by using it to transform the model drift problem into an anomaly detection problem over dynamic graphs.