Supervised Contrastive Learning as Multi-Objective Optimization for Fine-Tuning Large Pre-trained Language Models

09/28/2022
by Youness Moukafih, et al.

Recently, Supervised Contrastive Learning (SCL) has been shown to achieve excellent performance on most classification tasks. In SCL, a neural network is trained to optimize two objectives: pulling an anchor and its positive samples together in the embedding space, and pushing the anchor away from the negative samples. However, these two objectives may conflict, requiring trade-offs between them during optimization. In this work, we formulate the SCL problem as a Multi-Objective Optimization problem for the fine-tuning phase of the RoBERTa language model. Two methods are used to solve the optimization problem: (i) the Linear Scalarization (LS) method, which minimizes a weighted linear combination of the per-task losses; and (ii) the Exact Pareto Optimal (EPO) method, which finds the intersection of the Pareto front with a given preference vector. We evaluate our approach on several GLUE benchmark tasks, without using data augmentations, memory banks, or adversarial examples. The empirical results show that the proposed learning strategy significantly outperforms a strong competitive contrastive learning baseline.
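To make the decomposition concrete, below is a minimal PyTorch sketch (not the authors' code) of splitting the supervised contrastive loss into the two objectives the abstract describes and combining them with linear scalarization. The function names, the temperature, and the equal weights are illustrative assumptions; EPO would replace the fixed weighted sum with an adaptive weighting that tracks a given preference vector along the Pareto front.

```python
import torch

def scl_two_objectives(embeddings, labels, temperature=0.1):
    """Split the supervised contrastive loss into its two objectives:
    (i) an attraction term pulling each anchor toward its positives,
    (ii) a repulsion term pushing each anchor away from its negatives.
    Assumes `embeddings` is L2-normalized, shape (batch, dim), and that
    each anchor has at least one positive and one negative in the batch."""
    sim = embeddings @ embeddings.T / temperature
    batch = labels.size(0)
    eye = torch.eye(batch, dtype=torch.bool, device=labels.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    neg_mask = labels.unsqueeze(0) != labels.unsqueeze(1)

    # Objective 1: maximize mean similarity to positives (minimize its negation).
    pull = -(sim * pos_mask.float()).sum(dim=1) / pos_mask.sum(dim=1).clamp(min=1)

    # Objective 2: minimize the log-sum-exp of similarities to negatives.
    neg_sim = sim.masked_fill(~neg_mask, float('-inf'))
    push = torch.logsumexp(neg_sim, dim=1)

    return pull.mean(), push.mean()

def linear_scalarization(pull, push, weights=(0.5, 0.5)):
    """Linear scalarization (LS): minimize a fixed weighted combination of
    the per-task losses; the weights act as the preference vector."""
    return weights[0] * pull + weights[1] * push
```

In this sketch the LS weights play the role of the preference vector: sweeping them traces different trade-offs between the attraction and repulsion objectives, whereas EPO solves for the single Pareto-optimal point whose objective ratios exactly match the chosen preference.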


