A Comprehensive Overview of Large Language Models

07/12/2023
by Humza Naveed, et al.

Large Language Models (LLMs) have shown excellent generalization capabilities, which has led to the development of numerous models. To outperform baselines, these models propose new architectures, tweak existing architectures with refined training strategies, increase context length, use high-quality training data, and increase training time. Analyzing new developments is crucial for identifying changes that enhance training stability and improve generalization in LLMs. This survey comprehensively analyzes LLM architectures and their categorization, training strategies, training datasets, and performance evaluations, and discusses future research directions. The paper also covers the basic building blocks and concepts behind LLMs, followed by a complete overview of LLMs, including their important features and functions. Finally, the paper summarizes significant findings from LLM research and consolidates essential architectural and training strategies for developing advanced LLMs. Given the continuous advancements in LLMs, we intend to update this paper regularly by incorporating new sections and featuring the latest LLMs.


