Side-Tuning: Network Adaptation via Additive Side Networks

12/31/2019 · by Jeffrey O Zhang, et al.

When training a neural network for a desired task, one may prefer to adapt a pre-trained network rather than start from a randomly initialized one, whether because there is too little training data, because the system must perform lifelong learning and pick up a new task after being trained on others, or because one wishes to encode priors in the network via preset weights. The most commonly employed approaches to network adaptation are fine-tuning and using the pre-trained network as a fixed feature extractor. In this paper, we propose a straightforward alternative: side-tuning. Side-tuning adapts a pre-trained network by training a lightweight "side" network whose output is fused with that of the (unchanged) pre-trained network by summation. This simple method works as well as or better than existing solutions while resolving some of the basic issues with fine-tuning, fixed features, and several other common baselines. In particular, side-tuning is less prone to overfitting when little training data is available, yields better results than a fixed feature extractor, and does not suffer from catastrophic forgetting in lifelong learning. We demonstrate the performance of side-tuning under a diverse set of scenarios, including lifelong learning (iCIFAR, Taskonomy), reinforcement learning, imitation learning (visual navigation in Habitat), NLP question answering (SQuAD v2), and single-task transfer learning (Taskonomy), with consistently promising results.
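To make the mechanism concrete, here is a minimal PyTorch-style sketch of additive side-tuning as described in the abstract: a frozen pre-trained base network plus a small trainable side network, with their outputs summed. This is an illustration under assumptions, not the authors' implementation; the names SideTuned, base, and side are illustrative, and the paper's actual fusion may differ (for example, it may weight the two streams rather than use a plain sum).

```python
# Minimal sketch of additive side-tuning (illustrative, not the paper's code).
import torch
import torch.nn as nn


class SideTuned(nn.Module):
    """Fuse a frozen pre-trained base network with a small trainable
    "side" network by summing their outputs."""

    def __init__(self, base: nn.Module, side: nn.Module):
        super().__init__()
        self.base = base
        self.side = side
        # The pre-trained network stays unchanged: freeze its weights.
        for p in self.base.parameters():
            p.requires_grad = False
        self.base.eval()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            base_out = self.base(x)
        # Additive fusion: only the side network receives gradients,
        # so the base network cannot be catastrophically forgotten.
        return base_out + self.side(x)


# Example usage: adapt a frozen 128-d feature extractor with a much
# smaller side network (all sizes here are arbitrary placeholders).
base = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 128))
side = nn.Linear(32, 128)  # lightweight relative to the base
model = SideTuned(base, side)
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
```

Because the base network's outputs are fixed, the side network effectively learns a residual correction on top of the pre-trained representation, which is why the method can match fixed features on small datasets without the overfitting risk of full fine-tuning.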

