Distilled Neural Networks for Efficient Learning to Rank

02/22/2022
by F. M. Nardini, et al.

Recent studies in Learning to Rank have shown that it is possible to effectively distill a neural network from an ensemble of regression trees. This result makes neural networks a natural competitor of tree-based ensembles on the ranking task. Nevertheless, ensembles of regression trees still outperform neural models in both efficiency and effectiveness, particularly when scoring on CPU. In this paper, we propose an approach for speeding up neural scoring time that combines distillation, pruning, and fast matrix multiplication. We employ knowledge distillation to learn shallow neural networks from an ensemble of regression trees. Then, we apply an efficiency-oriented pruning technique that sparsifies the most computationally intensive layers of the neural network, which are then scored with optimized sparse matrix multiplication. Moreover, by studying both dense and sparse high-performance matrix multiplication, we develop a scoring-time prediction model that helps devise neural network architectures matching the desired efficiency requirements. Comprehensive experiments on two public learning-to-rank datasets show that neural networks produced with our novel approach are competitive at any point of the effectiveness-efficiency trade-off when compared with tree-based ensembles, providing up to 4x scoring-time speed-up without affecting ranking quality.
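
To make the distill-then-prune-then-sparse-score pipeline concrete, here is a minimal sketch, assuming scikit-learn's GradientBoostingRegressor as a stand-in for the tree-ensemble teacher, MLPRegressor as the shallow student, simple global magnitude pruning, and SciPy CSR matrices for the sparse forward pass; the models, pruning criterion, and optimized kernels used in the paper may differ, and the data below is a random placeholder.

```python
# Minimal distill -> prune -> sparse-score sketch (all names and settings are
# illustrative assumptions, not the authors' implementation).
import numpy as np
from scipy import sparse
from sklearn.ensemble import GradientBoostingRegressor  # stand-in tree-ensemble teacher
from sklearn.neural_network import MLPRegressor         # stand-in shallow student

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 136))   # 136 features, as in MSLR-style LTR datasets
y = rng.normal(size=5000)          # placeholder relevance labels

# 1) Teacher: an ensemble of regression trees.
teacher = GradientBoostingRegressor(n_estimators=100, random_state=0).fit(X, y)

# 2) Distillation: the student regresses on the teacher's scores, not the labels.
soft_targets = teacher.predict(X)
student = MLPRegressor(hidden_layer_sizes=(256, 128), max_iter=300,
                       random_state=0).fit(X, soft_targets)

# 3) Efficiency-oriented pruning, approximated here by global magnitude pruning:
#    zero out the smallest 80% of weights in each layer, store the rest as CSR.
sparsity = 0.8
pruned = [sparse.csr_matrix(np.where(np.abs(W) >= np.quantile(np.abs(W), sparsity), W, 0.0))
          for W in student.coefs_]

# 4) Sparse scoring: a forward pass driven by sparse matrix multiplication.
def score(batch):
    h = batch
    for i, (W, b) in enumerate(zip(pruned, student.intercepts_)):
        h = np.asarray(W.T @ h.T).T + b   # sparse-dense matmul, densified result
        if i < len(pruned) - 1:
            h = np.maximum(h, 0.0)        # ReLU on hidden layers (MLPRegressor default)
    return h.ravel()

print("max |dense - sparse| score gap:", np.abs(score(X[:10]) - student.predict(X[:10])).max())
```

In this sketch, pruning is applied post hoc without retraining, so the gap between dense and sparse scores grows with the sparsity level; a realistic pipeline would fine-tune the surviving weights after sparsification.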
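
The scoring-time prediction model can be sketched in the same spirit: benchmark sparse layer multiplications across candidate shapes and densities, then regress latency on shape features. This is an illustrative assumption (a plain linear model over input size, output size, and non-zero count), not the predictor developed in the paper.

```python
# Hedged sketch of a scoring-time predictor: benchmark sparse layer products,
# then fit a linear model mapping (n_in, n_out, nnz) to latency (assumed features).
import time
import numpy as np
from scipy import sparse
from sklearn.linear_model import LinearRegression

def time_sparse_layer(batch, n_in, n_out, density, reps=20):
    """Median latency of one layer: dense (batch, n_in) times sparse (n_in, n_out)."""
    W = sparse.random(n_in, n_out, density=density, format="csr", random_state=0)
    h = np.random.default_rng(0).normal(size=(batch, n_in))
    samples = []
    for _ in range(reps):
        t0 = time.perf_counter()
        _ = np.asarray(W.T @ h.T).T
        samples.append(time.perf_counter() - t0)
    return float(np.median(samples))

# Collect (shape features, latency) pairs over a grid of candidate layer designs.
feats, lats = [], []
for n_in in (128, 256, 512):
    for n_out in (64, 128, 256):
        for density in (0.05, 0.1, 0.2, 0.5):
            feats.append([n_in, n_out, n_in * n_out * density])
            lats.append(time_sparse_layer(256, n_in, n_out, density))

model = LinearRegression().fit(np.array(feats), np.array(lats))

# Estimate the cost of a candidate layer before committing to an architecture.
print("predicted seconds:", model.predict([[384, 96, 384 * 96 * 0.1]])[0])
```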


Related research

06/15/2023 · Neural Network Compression using Binarization and Few Full-Precision Weights
Quantization and pruning are known to be two effective Deep Neural Netwo...

10/05/2017 · Tuning Technique for Multiple Precision Dense Matrix Multiplication using Prediction of Computational Time
Although reliable long precision floating-point arithmetic libraries suc...

02/16/2021 · Speeding Up Private Distributed Matrix Multiplication via Bivariate Polynomial Codes
We consider the problem of private distributed matrix multiplication und...

06/25/2020 · Constant-Depth and Subcubic-Size Threshold Circuits for Matrix Multiplication
Boolean circuits of McCulloch-Pitts threshold gates are a classic model ...

09/15/2023 · Unveiling Invariances via Neural Network Pruning
Invariance describes transformations that do not alter data's underlying...

05/06/2021 · Learning Early Exit Strategies for Additive Ranking Ensembles
Modern search engine ranking pipelines are commonly based on large machi...

09/12/2023 · Ensemble Mask Networks
Can an ℝ^n→ℝ^n feedforward network learn matrix-vector multiplication? T...
