Representation Learning Beyond Linear Prediction Functions

05/31/2021
by Ziping Xu, et al.

Recent papers on the theory of representation learning have shown the importance of a quantity called diversity when generalizing from a set of source tasks to a target task. Most of these papers assume that the function mapping shared representations to predictions is linear for both source and target tasks. In practice, researchers in deep learning add different numbers of extra layers on top of the pretrained model depending on the difficulty of the new task. This motivates us to ask whether diversity can be achieved when source tasks and the target task use different prediction function spaces beyond linear functions. We show that diversity holds even if the target task uses a neural network with multiple layers, as long as the source tasks use linear functions. If source tasks use nonlinear prediction functions, we provide a negative result by showing that depth-1 neural networks with the ReLU activation function need exponentially many source tasks to achieve diversity. For a general function class, we find that the eluder dimension gives a lower bound on the number of tasks required for diversity. Our theoretical results imply that simpler tasks generalize better. Though our theoretical results are shown for the global minimizer of the empirical risk, their qualitative predictions still hold true for gradient-based optimization algorithms, as verified by our simulations on deep neural networks.
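To make the setting concrete, below is a minimal sketch (not the authors' code) of the transfer pipeline the abstract describes: a shared representation is pretrained jointly on several source tasks with linear task-specific heads, then frozen, and the target task attaches a deeper, beyond-linear head on top. It assumes PyTorch; all dimensions, data, and names (`shared`, `source_heads`, `target_head`) are hypothetical placeholders chosen only for illustration.

```python
import torch
import torch.nn as nn

d, k, T, n = 20, 5, 10, 100  # input dim, representation dim, number of source tasks, samples per task

# Shared representation and one linear prediction head per source task
shared = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, k))
source_heads = nn.ModuleList([nn.Linear(k, 1) for _ in range(T)])

opt = torch.optim.Adam(list(shared.parameters()) + list(source_heads.parameters()), lr=1e-2)
X = [torch.randn(n, d) for _ in range(T)]  # synthetic source inputs
Y = [torch.randn(n, 1) for _ in range(T)]  # synthetic source labels

# Joint pretraining: minimize the summed empirical risk over all source tasks
for _ in range(200):
    opt.zero_grad()
    loss = sum(nn.functional.mse_loss(head(shared(x)), y)
               for head, x, y in zip(source_heads, X, Y))
    loss.backward()
    opt.step()

# Transfer: freeze the learned representation and fit a multi-layer head on the target task
for p in shared.parameters():
    p.requires_grad_(False)
target_head = nn.Sequential(nn.Linear(k, 16), nn.ReLU(), nn.Linear(16, 1))  # beyond-linear head
opt_t = torch.optim.Adam(target_head.parameters(), lr=1e-2)
Xt, Yt = torch.randn(n, d), torch.randn(n, 1)  # synthetic target task
for _ in range(200):
    opt_t.zero_grad()
    nn.functional.mse_loss(target_head(shared(Xt)), Yt).backward()
    opt_t.step()
```

The question studied in the paper is when the source tasks are "diverse" enough that the frozen representation learned in the first loop transfers to the target task in the second, even though the target head is drawn from a richer function class than the linear source heads.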


