Shapley Homology: Topological Analysis of Sample Influence for Neural Networks

10/15/2019
by   Kaixuan Zhang, et al.

Data samples collected for training machine learning models are typically assumed to be independent and identically distributed (i.i.d.). Recent research has demonstrated that this assumption can be problematic because it oversimplifies the manifold of structured data. This has motivated research in areas such as data poisoning, model improvement, and the explanation of machine learning models. In this work, we study the influence of a sample in determining the intrinsic topological features of its underlying manifold. We propose the Shapley Homology framework, which provides a quantitative metric for the influence of a sample on the homology of a simplicial complex. By interpreting the influence as a probability measure, we further define an entropy that reflects the complexity of the data manifold. Our empirical studies use the 0-dimensional homology. On neighboring graphs, samples with higher influence scores have more impact on the accuracy of neural networks trained to determine graph connectivity. On several regular grammars, higher entropy values imply greater difficulty in being learned.
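To make the idea of an influence score and its entropy concrete, the following is a minimal Python sketch for the 0-dimensional case on a small graph. It is an illustration under stated assumptions, not the authors' implementation: the characteristic function is taken to be the number of connected components (the 0th Betti number) of the induced subgraph, the marginal change is taken in absolute value so that scores are non-negative, and the function names `betti0`, `shapley_influence`, and `influence_entropy` are hypothetical. The exact enumeration over subsets is exponential in the number of vertices, so it is only practical for tiny graphs.

```python
from itertools import combinations
from math import factorial, log


def betti0(vertices, edges):
    """Count connected components of the subgraph induced by `vertices`."""
    parent = {v: v for v in vertices}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = len(parent)
    for a, b in edges:
        if a in parent and b in parent:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[ra] = rb
                components -= 1
    return components


def shapley_influence(vertices, edges):
    """Shapley-style score per vertex, normalized to sum to 1.

    Assumption: the marginal contribution is |betti0(C + v) - betti0(C)|.
    """
    n = len(vertices)
    scores = {}
    for v in vertices:
        others = [u for u in vertices if u != v]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for a coalition of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                gain = betti0(subset + (v,), edges) - betti0(subset, edges)
                total += weight * abs(gain)
        scores[v] = total
    norm = sum(scores.values())
    return {v: s / norm for v, s in scores.items()}


def influence_entropy(probabilities):
    """Shannon entropy (natural log) of the normalized influence scores."""
    return -sum(p * log(p) for p in probabilities.values() if p > 0)


# A path a-b-c: the middle vertex joins the two ends.
path = shapley_influence(["a", "b", "c"], [("a", "b"), ("b", "c")])
# A triangle: all three vertices are interchangeable.
triangle = shapley_influence(
    ["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")]
)
print(path, influence_entropy(path))
print(triangle, influence_entropy(triangle))
```

In this sketch the triangle yields a uniform distribution over its vertices, and therefore the maximum entropy for three samples, while the path assigns a larger share to the middle vertex and has lower entropy.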


