Tackling Combinatorial Distribution Shift: A Matrix Completion Perspective
Obtaining rigorous statistical guarantees for generalization under distribution shift remains an open and active research area. We study a setting we call combinatorial distribution shift, where (a) under both the training and test distributions, the labels z are determined by pairs of features (x, y); (b) the training distribution covers certain marginal distributions over x and y separately; but (c) the test distribution involves examples from a product distribution over (x, y) that is not covered by the training distribution. Focusing on the special case where the labels are given by bilinear embeddings into a Hilbert space H, 𝔼[z | x, y] = ⟨f_⋆(x), g_⋆(y)⟩_H, we aim to extrapolate to a test distribution domain that is not covered in training, i.e., to achieve bilinear combinatorial extrapolation. Our setting generalizes a special case of matrix completion from missing-not-at-random data, for which all existing results require the ground-truth matrices to be either exactly low-rank or to exhibit very sharp spectral cutoffs. In this work, we develop a series of theoretical results that enable bilinear combinatorial extrapolation under gradual spectral decay, as observed in typical high-dimensional data, including novel algorithms, generalization guarantees, and linear-algebraic results. A key tool is a novel perturbation bound on the difference between the rank-k singular value decomposition approximations of two matrices that depends on the relative spectral gap rather than the absolute spectral gap, a result that may be of broader independent interest.
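To make the matrix completion view concrete, the following is a minimal sketch of the exactly low-rank special case only, not the algorithm developed in this work for gradual spectral decay. It assumes a finite matrix M = F Gᵀ split into four blocks, with three blocks observed in training and the fourth (the unseen product of row and column groups) held out. The function name `extrapolate_block`, the block sizes, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def extrapolate_block(m11, m12, m21, rank):
    """Estimate the unobserved block as M21 @ pinv_k(M11) @ M12.

    pinv_k is the pseudoinverse of the rank-`rank` truncated SVD of M11.
    This recovers the held-out block exactly only when M has exact rank
    `rank` and M11 shares that rank (an assumption of this sketch).
    """
    u, s, vt = np.linalg.svd(m11, full_matrices=False)
    # Keep the top-`rank` singular triplets of the fully observed block.
    u_k, s_k, vt_k = u[:, :rank], s[:rank], vt[:rank, :]
    m11_pinv_k = vt_k.T @ np.diag(1.0 / s_k) @ u_k.T
    return m21 @ m11_pinv_k @ m12


# Synthetic bilinear ground truth: M[i, j] = <f(x_i), g(y_j)>, exact rank r.
rng = np.random.default_rng(0)
n, m, r = 40, 30, 3
f = rng.standard_normal((n, r))
g = rng.standard_normal((m, r))
M = f @ g.T

# Training covers blocks (1,1), (1,2), (2,1); block (2,2) is never observed.
n1, m1 = 25, 20
m11, m12 = M[:n1, :m1], M[:n1, m1:]
m21, m22 = M[n1:, :m1], M[n1:, m1:]

m22_hat = extrapolate_block(m11, m12, m21, rank=r)
rel_err = np.linalg.norm(m22_hat - m22) / np.linalg.norm(m22)
print(f"relative error on the held-out block: {rel_err:.2e}")
```

In this noiseless, exactly rank-r setting the held-out block is recovered up to floating-point error. When the spectrum of M decays gradually instead, the truncation at `rank` interacts with small singular values of M11, which is the regime the abstract's relative-gap perturbation bound is meant to address.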