Extreme compression of sentence-transformer ranker models: faster inference, longer battery life, and less storage on edge devices

06/29/2022
by Amit Chaulwar et al.

Modern search systems use several large ranker models with transformer architectures. These models are computationally expensive and unsuitable for devices with limited resources. Knowledge distillation is a popular compression technique in which a large teacher model transfers knowledge to a small student model, reducing the resource requirements of such models. To drastically reduce memory requirements and energy consumption, we propose two extensions to a popular sentence-transformer distillation procedure: generation of an optimally sized vocabulary and dimensionality reduction of the teacher's embeddings prior to distillation. We evaluate these extensions on two different types of ranker models. The resulting extremely compressed student models, analyzed on a test dataset, demonstrate the significance and utility of the proposed extensions.
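
To make the dimensionality-reduction extension concrete, the following is a minimal sketch, assuming the sentence-transformers library: a teacher's embedding dimension is shrunk with a PCA-initialized projection, and a small student is then distilled against the reduced embeddings with an MSE objective. This is not the authors' exact procedure; the model names, the 128-dimensional target size, and the placeholder corpus are illustrative assumptions.

```python
# Sketch of embedding-dimension reduction before distillation.
# Model names, target dimension, and corpus are illustrative assumptions.
import torch
from sklearn.decomposition import PCA
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, models, losses

teacher = SentenceTransformer("all-MiniLM-L12-v2")        # assumed teacher encoder
student = SentenceTransformer("paraphrase-MiniLM-L3-v2")  # assumed small student
sentences = [f"placeholder training sentence {i}" for i in range(256)]  # stand-in corpus

# 1) Shrink the teacher's output dimension with a PCA-initialized linear projection.
new_dim = 128
teacher_emb = teacher.encode(sentences, convert_to_numpy=True)
pca = PCA(n_components=new_dim).fit(teacher_emb)
proj = models.Dense(
    in_features=teacher.get_sentence_embedding_dimension(),
    out_features=new_dim,
    bias=False,
    activation_function=torch.nn.Identity(),
)
proj.linear.weight = torch.nn.Parameter(torch.tensor(pca.components_, dtype=torch.float32))
teacher.add_module("pca_projection", proj)                # teacher now emits 128-d embeddings

# Give the student a trainable projection to the same reduced dimension.
student.add_module(
    "projection",
    models.Dense(
        in_features=student.get_sentence_embedding_dimension(),
        out_features=new_dim,
        bias=False,
        activation_function=torch.nn.Identity(),
    ),
)

# 2) Distill: train the student to reproduce the reduced teacher embeddings via MSE.
targets = teacher.encode(sentences)                       # 128-d targets after projection
train_examples = [InputExample(texts=[s], label=emb) for s, emb in zip(sentences, targets)]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=32)
train_loss = losses.MSELoss(model=student)
student.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10)
```

The other extension, generating an optimally sized vocabulary for the student before distillation, is not shown in this sketch.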

Related research

DisCo: Effective Knowledge Distillation For Contrastive Learning of Sentence Embeddings (12/10/2021)
Contrastive learning has been proven suitable for learning sentence embe...

Knowledge Distillation of Russian Language Models with Reduction of Vocabulary (05/04/2022)
Today, transformer language models serve as a core component for majorit...

Efficient Vision Transformers via Fine-Grained Manifold Distillation (07/03/2021)
This paper studies the model compression problem of vision transformers....

Two-Pass End-to-End ASR Model Compression (01/08/2022)
Speech recognition on smart devices is challenging owing to the small me...

Deep geometric knowledge distillation with graphs (11/08/2019)
In most cases deep learning architectures are trained disregarding the a...

BERT2DNN: BERT Distillation with Massive Unlabeled Data for Online E-Commerce Search (10/20/2020)
Relevance has significant impact on user experience and business profit ...

On-Device Next-Item Recommendation with Self-Supervised Knowledge Distillation (04/23/2022)
Modern recommender systems operate in a fully server-based fashion. To c...
