GPT-4 Technical Report

03/15/2023
by OpenAI

We report the development of GPT-4, a large-scale, multimodal model that can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4.
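The predictable-scaling claim in the abstract corresponds to fitting a scaling law with an irreducible-loss term, L(C) = aC^b + c, to the final loss of much smaller training runs and extrapolating to the full-scale run. Below is a minimal sketch of that procedure; the (compute, loss) data points and the initial guess are invented for illustration, since the report does not publish its fitted coefficients.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical (compute, loss) pairs standing in for the small training
# runs the report describes; the actual values are not published.
compute = np.array([1e18, 3e18, 1e19, 3e19, 1e20])  # training FLOPs
loss = np.array([2.94, 2.80, 2.67, 2.56, 2.45])     # final loss per token

# Scaling law with an irreducible-loss term: L(C) = a * C**b + c
def scaling_law(C, a, b, c):
    return a * np.power(C, b) + c

# Initial guess chosen near a plausible fit so the optimizer converges;
# these starting values are assumptions, not the report's numbers.
params, _ = curve_fit(scaling_law, compute, loss, p0=(60.0, -0.09, 1.5))
a, b, c = params

# Extrapolate roughly 1,000x beyond the largest run, mirroring how the
# report predicted aspects of GPT-4's performance before training finished.
target = 1e23
print(f"fit: a={a:.2f}, b={b:.3f}, c={c:.2f}")
print(f"predicted loss at {target:.0e} FLOPs: {scaling_law(target, *params):.3f}")
```

The irreducible constant c captures the loss floor that no amount of compute removes; without it, a pure power law systematically mispredicts large-scale runs.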

