A Quantitative Perspective on Values of Domain Knowledge for Machine Learning

11/17/2020
by   Jianyi Yang, et al.
25

With the exploding popularity of machine learning, domain knowledge in various forms has been playing a crucial role in improving the learning performance, especially when training data is limited. Nonetheless, there is little understanding of to what extent domain knowledge can affect a machine learning task from a quantitative perspective. To increase the transparency and rigorously explain the role of domain knowledge in machine learning, we study the problem of quantifying the values of domain knowledge in terms of its contribution to the learning performance in the context of informed machine learning. We propose a quantification method based on Shapley value that fairly attributes the overall learning performance improvement to different domain knowledge. We also present Monte-Carlo sampling to approximate the fair value of domain knowledge with a polynomial time complexity. We run experiments of injecting symbolic domain knowledge into semi-supervised learning tasks on both MNIST and CIFAR10 datasets, providing quantitative values of different symbolic knowledge and rigorously explaining how it affects the machine learning performance in terms of test accuracy.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset