Tangent Transformers for Composition, Privacy and Removal

07/16/2023
by Tian Yu Liu et al.

We introduce Tangent Attention Fine-Tuning (TAFT), a method for fine-tuning linearized transformers obtained by computing a first-order Taylor expansion around a pre-trained initialization. We show that the Jacobian-vector product resulting from linearization can be computed efficiently in a single forward pass, bringing training and inference cost to the same order of magnitude as the original non-linear counterpart, while using the same number of parameters. Furthermore, we show that, when applied to various downstream visual classification tasks, the resulting Tangent Transformer fine-tuned with TAFT performs comparably to fine-tuning the original non-linear network. Since Tangent Transformers are linear in the new set of weights, and the resulting fine-tuning loss is convex, TAFT enjoys several advantages over non-linear fine-tuning for model composition, parallel training, machine unlearning, and differential privacy.
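To make the linearization concrete, here is a minimal sketch in JAX of a tangent model and its convex fine-tuning loss. The network, parameter names, and training step are illustrative stand-ins (a toy MLP rather than a transformer, not the authors' released code); the point it shows is that jax.jvp evaluates the pre-trained output and the Jacobian-vector product together in a single forward pass, and that the resulting output is linear in the weight update `delta`.

```python
import jax
import jax.numpy as jnp

# Toy "pre-trained" network standing in for a transformer (hypothetical stand-in).
def apply(params, x):
    h = jnp.tanh(x @ params["w1"] + params["b1"])
    return h @ params["w2"] + params["b2"]

def tangent_apply(params0, delta, x):
    # First-order Taylor expansion around the pre-trained weights params0:
    #   f_lin(params0 + delta, x) = f(params0, x) + J_f(params0, x) . delta
    # jax.jvp returns the primal output and the Jacobian-vector product
    # together, in a single forward pass.
    out, jvp_out = jax.jvp(lambda p: apply(p, x), (params0,), (delta,))
    return out + jvp_out

def taft_loss(delta, params0, x, y):
    # Cross-entropy composed with a model that is linear in `delta`,
    # hence convex in `delta`.
    logits = tangent_apply(params0, delta, x)
    return -jnp.mean(jnp.sum(y * jax.nn.log_softmax(logits), axis=-1))

key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
params0 = {
    "w1": 0.1 * jax.random.normal(k1, (8, 16)), "b1": jnp.zeros(16),
    "w2": 0.1 * jax.random.normal(k2, (16, 3)), "b2": jnp.zeros(3),
}
# Fine-tuning starts at the pre-trained point, i.e. delta = 0,
# so the tangent model has the same number of parameters as the original.
delta = jax.tree_util.tree_map(jnp.zeros_like, params0)

x = jax.random.normal(k3, (4, 8))
y = jax.nn.one_hot(jnp.array([0, 1, 2, 0]), 3)

# One gradient step on the convex objective.
grads = jax.grad(taft_loss)(delta, params0, x, y)
delta = jax.tree_util.tree_map(lambda d, g: d - 0.1 * g, delta, grads)
print(taft_loss(delta, params0, x, y))
```

Because the tangent model is affine in `delta`, weight-space operations translate exactly into output-space ones: summing or averaging the `delta`s of independently trained component models sums or averages their output perturbations, which is what makes composition, parallel training, and removal of a component's contribution tractable.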


Related research

10/17/2022 · Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning
Existing fine-tuning methods either tune all parameters of the pre-train...

07/31/2023 · Does fine-tuning GPT-3 with the OpenAI API leak personally-identifiable information?
Machine learning practitioners often fine-tune generative pre-trained mo...

08/31/2021 · T3-Vis: a visual analytic framework for Training and fine-Tuning Transformers in NLP
Transformers are the dominant architecture in NLP, but their training an...

12/21/2020 · LQF: Linear Quadratic Fine-Tuning
Classifiers that are linear in their parameters, and trained by optimizi...

07/16/2023 · Tangent Model Composition for Ensembling and Continual Fine-tuning
Tangent Model Composition (TMC) is a method to combine component models ...

08/11/2023 · Experts Weights Averaging: A New General Training Scheme for Vision Transformers
Structural re-parameterization is a general training scheme for Convolut...

08/28/2023 · SAM-PARSER: Fine-tuning SAM Efficiently by Parameter Space Reconstruction
Segment Anything Model (SAM) has received remarkable attention as it off...
