SGPT: GPT Sentence Embeddings for Semantic Search

02/17/2022
by Niklas Muennighoff, et al.

GPT transformers are the largest language models available, yet semantic search is dominated by BERT transformers. We present SGPT-BE and SGPT-CE for applying GPT models as Bi-Encoders or Cross-Encoders to symmetric or asymmetric search. SGPT-BE produces semantically meaningful sentence embeddings by contrastive fine-tuning of only bias tensors together with a novel pooling method. A 5.8 billion parameter SGPT-BE outperforms the best available sentence embeddings by 6%, setting a new state-of-the-art on BEIR. It outperforms the concurrently proposed OpenAI Embeddings of the 175B Davinci endpoint, which fine-tunes 250,000 times more parameters. SGPT-CE uses log probabilities from GPT models without any fine-tuning. A 6.1 billion parameter SGPT-CE sets an unsupervised state-of-the-art on BEIR. It beats the supervised state-of-the-art on 7 datasets, but loses significantly on the others. We show how this can be alleviated by adapting the prompt. SGPT-BE and SGPT-CE performance scales with model size, though the increased latency, storage, and compute costs should be considered. Code, models and result files are freely available at https://github.com/Muennighoff/sgpt.
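The bi-encoder's pooling step can be sketched as follows. The snippet below is a minimal illustration of position-weighted mean pooling over GPT hidden states, assuming PyTorch tensors hidden_states of shape (batch, sequence, hidden) and attention_mask of shape (batch, sequence); the function name and shapes are illustrative, not the repository's exact API.

import torch

def weighted_mean_pooling(hidden_states, attention_mask):
    # Positions 1..S receive linearly increasing weights, so later tokens,
    # which have attended to the whole input under causal attention, count more.
    weights = torch.arange(1, hidden_states.shape[1] + 1, device=hidden_states.device).float()
    weights = weights.view(1, -1, 1)                       # (1, seq, 1)
    mask = attention_mask.unsqueeze(-1).float()            # (batch, seq, 1)
    summed = (hidden_states * weights * mask).sum(dim=1)   # weighted sum of hidden states
    norm = (weights * mask).sum(dim=1)                     # sum of the weights actually applied
    return summed / norm                                   # (batch, hidden) sentence embedding

The linear position weighting compensates for causal attention: only the last tokens have seen the full sentence, so a plain mean would over-weight representations built from partial context.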
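The cross-encoder idea of scoring with log probabilities can be sketched roughly as below, assuming a Hugging Face causal LM. The "gpt2" placeholder checkpoint, the exact prompt wording, and averaging the query-token log probabilities are illustrative assumptions rather than the paper's exact configuration.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")    # placeholder checkpoint, not the 6.1B model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def rerank_score(query, document):
    # Score a (query, document) pair by how likely the frozen GPT model finds the
    # query tokens when they follow a prompt containing the document.
    prompt = f'Documents are searched to find matches with the same content.\nThe document "{document}" is a good search result for "'
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    query_ids = tokenizer(query, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, query_ids], dim=1)
    with torch.no_grad():
        log_probs = torch.log_softmax(model(input_ids).logits, dim=-1)
    # The token at position t is predicted by the logits at position t - 1.
    start = prompt_ids.shape[1]
    query_log_probs = log_probs[0, start - 1 : input_ids.shape[1] - 1]
    picked = query_log_probs.gather(1, query_ids[0].unsqueeze(-1)).squeeze(-1)
    return picked.mean().item()    # higher means a better match; used to re-rank candidates

Because no parameters are updated, re-ranking quality depends heavily on the prompt, which is why adapting the prompt can recover performance on datasets where the default wording fails.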


