LLaMA: Open and Efficient Foundation Language Models

02/27/2023
by Hugo Touvron, et al.

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.
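
Since the abstract notes that the models are released to the research community, a minimal usage sketch may help illustrate what working with a released checkpoint looks like. This assumes the weights have been converted to the Hugging Face transformers format and stored at a hypothetical local path; neither the path nor the conversion step is described in the paper itself.

    # Minimal sketch: load a converted LLaMA checkpoint and generate text.
    # The local directory "./llama-7b-hf" is a hypothetical placeholder,
    # not an official distribution path.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    checkpoint = "./llama-7b-hf"  # hypothetical path to converted weights

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(
        checkpoint,
        torch_dtype=torch.float16,  # half precision to reduce memory use
    )

    prompt = "The capital of France is"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))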
