
Distinguishing rule- and exemplar-based generalization in learning systems

by Ishita Dasgupta et al.

Despite the increasing scale of datasets in machine learning, generalization to unseen regions of the data distribution remains crucial. Such extrapolation is by definition underdetermined and is dictated by a learner's inductive biases. Machine learning systems often do not share the same inductive biases as humans and, as a result, extrapolate in ways that are inconsistent with our expectations. We investigate two such distinct inductive biases: feature-level bias (differences in which features are more readily learned) and exemplar-vs-rule bias (differences in how these learned features are used for generalization). Exemplar- vs. rule-based generalization has been studied extensively in cognitive psychology, and, in this work, we present a protocol inspired by these experimental approaches for directly probing this trade-off in learning systems. The measures we propose characterize changes in extrapolation behavior when feature coverage is manipulated in a combinatorial setting. We present empirical results across a range of models and across both expository and real-world image and language domains. We demonstrate that measuring the exemplar-rule trade-off while controlling for feature-level bias provides a more complete picture of extrapolation behavior than existing formalisms. We find that most standard neural network models have a propensity towards exemplar-based extrapolation, and we discuss the implications of these findings for research on data augmentation, fairness, and systematic generalization.
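To make the exemplar-vs-rule distinction concrete, here is a minimal toy sketch (our own illustration, not the paper's actual protocol or datasets). Items are two-feature binary vectors and the training coverage omits one feature combination, so extrapolation to it is underdetermined: a rule-based learner extends a feature-level rule, while an exemplar-based learner matches the nearest stored training item, and the two disagree.

```python
# Toy combinatorial setting (hypothetical): the label depends only on
# feature 0, but the combination (1, 1) is never seen during training.
train_X = [(0, 0), (0, 1), (1, 0)]
train_y = [0, 0, 1]  # label = feature 0 on every training item

def rule_predict(x):
    # Rule-based extrapolation: apply the feature-level rule
    # "label = feature 0" inferred from the training data.
    return x[0]

def exemplar_predict(x, X=train_X, y=train_y):
    # Exemplar-based extrapolation: return the label of the nearest
    # stored training item under Hamming distance (ties go to the
    # first match, an arbitrary choice in this sketch).
    dist = lambda a, b: sum(ai != bi for ai, bi in zip(a, b))
    i = min(range(len(X)), key=lambda j: dist(X[j], x))
    return y[i]

novel = (1, 1)  # held-out feature combination
print(rule_predict(novel))      # 1: the rule generalizes on feature 0
print(exemplar_predict(novel))  # 0: nearest exemplar (0, 1) has label 0
```

The paper's measures can be thought of as quantifying, over many such held-out combinations and coverage manipulations, how far a model's predictions fall toward the rule-based or the exemplar-based answer.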




A review of possible effects of cognitive biases on interpretation of rule-based machine learning models

This paper investigates to what extent do cognitive biases affect human ...

Deconstructing the Inductive Biases of Hamiltonian Neural Networks

Physics-inspired neural networks (NNs), such as Hamiltonian or Lagrangia...

Transformers generalize differently from information stored in context vs in weights

Transformer models can use two fundamentally different kinds of informat...

InBiaseD: Inductive Bias Distillation to Improve Generalization and Robustness through Shape-awareness

Humans rely less on spurious correlations and trivial cues, such as text...

Learning Modular Structures That Generalize Out-of-Distribution

Out-of-distribution (O.O.D.) generalization remains a key challeng...

Mutual exclusivity as a challenge for neural networks

Strong inductive biases allow children to learn in fast and adaptable wa...

Target Languages (vs. Inductive Biases) for Learning to Act and Plan

Recent breakthroughs in AI have shown the remarkable power of deep learn...