Rethinking Learning Rate Tuning in the Era of Large Language Models

09/16/2023
by Hongpeng Jin, et al.

Large Language Models (LLMs) represent the recent success of deep learning in achieving remarkable human-like predictive performance. Because training an LLM from scratch is prohibitively expensive, fine-tuning has become the mainstream strategy for adapting LLMs to real-world applications. The learning rate is one of the most important hyperparameters in LLM fine-tuning, with a direct impact on both fine-tuning efficiency and the quality of the fine-tuned LLM. Existing learning rate policies are primarily designed for training traditional deep neural networks (DNNs) and may not work well for LLM fine-tuning. We reassess the research challenges and opportunities of learning rate tuning in the era of Large Language Models. This paper makes three original contributions. First, we revisit existing learning rate policies to analyze the critical challenges of learning rate tuning in the era of LLMs. Second, we present LRBench++ to benchmark learning rate policies and facilitate learning rate tuning for both traditional DNNs and LLMs. Third, our experimental analysis with LRBench++ demonstrates the key differences between LLM fine-tuning and traditional DNN training, validating our analysis.
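For context, a learning rate policy is a function that maps training progress (e.g., the current step) to a learning rate value. The sketch below is purely illustrative and is not code from LRBench++ or the paper; the function name and parameter values are our own assumptions. It shows linear warmup followed by cosine decay, a schedule commonly used in LLM fine-tuning:

    import math

    def warmup_cosine_lr(step, total_steps, warmup_steps, peak_lr, min_lr=0.0):
        """Linear warmup then cosine decay -- an illustrative learning rate
        policy sketch, not the paper's method."""
        if step < warmup_steps:
            # Ramp linearly from ~0 up to peak_lr over the warmup phase.
            return peak_lr * (step + 1) / warmup_steps
        # Cosine-decay from peak_lr down to min_lr over the remaining steps.
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

    # Example: print the schedule for a short fine-tuning run.
    for step in range(0, 1000, 100):
        print(step, warmup_cosine_lr(step, total_steps=1000, warmup_steps=100, peak_lr=2e-5))

Note that the peak learning rate here (2e-5) is typical of LLM fine-tuning, which generally uses much smaller values than training a DNN from scratch; this gap is one reason policies tuned for traditional DNN training may transfer poorly.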

