A Topological Approach to Measuring Training Data Quality

06/04/2023
by Álvaro Torras-Casas, et al.

Data quality is crucial for the successful training, generalization, and performance of artificial intelligence models. Moreover, the leading approaches in artificial intelligence are notoriously data-hungry. In this paper, we propose using small training datasets to achieve faster training. Specifically, we provide a novel topological method, based on morphisms between persistence modules, that measures the quality of a training dataset with respect to the complete dataset. This lets us explain why a chosen training dataset will lead to poor performance.
