Building Accurate Simple Models with Multihop

09/14/2021
by Amit Dhurandhar, et al.

Knowledge transfer from a complex, high-performing model to a simpler, potentially lower-performing one in order to enhance the latter's performance has attracted great interest over the last few years, as it finds applications in important problems such as explainable artificial intelligence, model compression, robust model building, and learning from small data. Known approaches to this problem (viz. Knowledge Distillation, Model Compression, ProfWeight, etc.) typically transfer information directly (i.e., in a single hop) from the complex model to the chosen simple model, through schemes that modify the targets or reweight the training examples on which the simple model is trained. In this paper, we propose a meta-approach that transfers information from the complex model to the simple model by dynamically selecting and/or constructing a sequence of intermediate models of decreasing complexity, each less intricate than the original complex model. Our approach can transfer information between consecutive models in the sequence using any of the previously mentioned approaches, and it recovers the 1-hop setting as a special case, thus generalizing these approaches. In experiments on real data, we observe consistent gains over 1-hop transfer across different choices of models, averaging more than 2% and reaching up to 8% in one case. We also empirically analyze the conditions under which the multi-hop approach is likely to outperform the traditional 1-hop approach, and report other interesting insights. To the best of our knowledge, this is the first work that proposes such a multi-hop approach to knowledge transfer given a single high-performing complex model, which makes it, in our opinion, an important methodological contribution.
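
To make the chained-transfer idea concrete, below is a minimal sketch (not the authors' implementation) in which each hop uses standard soft-label knowledge distillation between consecutive models in a chain of MLPs of decreasing width. The network widths, synthetic data, temperature T, and mixing weight alpha are all illustrative assumptions, and the distillation step at each hop could in principle be swapped for any of the transfer schemes cited above (e.g., ProfWeight-style example reweighting).

    # Minimal multi-hop knowledge-transfer sketch: complex -> intermediates -> simple.
    # Each hop applies soft-label distillation; sizes and data are illustrative only.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def make_mlp(in_dim, hidden, n_classes):
        return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                             nn.Linear(hidden, n_classes))

    def distill(teacher, student, X, y, T=2.0, alpha=0.5, epochs=200, lr=1e-2):
        """One hop: train `student` to match `teacher`'s softened outputs,
        mixed with the ordinary cross-entropy loss on the true labels."""
        opt = torch.optim.Adam(student.parameters(), lr=lr)
        teacher.eval()
        with torch.no_grad():
            soft_targets = F.softmax(teacher(X) / T, dim=1)
        for _ in range(epochs):
            opt.zero_grad()
            logits = student(X)
            kd = F.kl_div(F.log_softmax(logits / T, dim=1),
                          soft_targets, reduction="batchmean") * T * T
            ce = F.cross_entropy(logits, y)
            (alpha * kd + (1 - alpha) * ce).backward()
            opt.step()
        return student

    # Synthetic stand-in for a real training set.
    torch.manual_seed(0)
    X = torch.randn(512, 20)
    y = (X[:, 0] + X[:, 1] * X[:, 2] > 0).long()

    # A sequence of models of decreasing complexity (hidden widths are assumptions).
    widths = [256, 64, 16, 4]
    models = [make_mlp(20, w, 2) for w in widths]

    # Train the complex model on the true labels first.
    opt = torch.optim.Adam(models[0].parameters(), lr=1e-2)
    for _ in range(200):
        opt.zero_grad()
        F.cross_entropy(models[0](X), y).backward()
        opt.step()

    # Multi-hop transfer: each model distills from its immediate predecessor.
    # Using only the first and last models would reduce this to 1-hop transfer.
    for teacher, student in zip(models, models[1:]):
        distill(teacher, student, X, y)

    with torch.no_grad():
        acc = (models[-1](X).argmax(1) == y).float().mean()
    print(f"simple-model training accuracy: {acc:.3f}")

The design choice the sketch highlights is that the hop mechanism is pluggable: the `distill` function is just one instantiation, and dropping the intermediate models recovers the usual 1-hop baseline the paper compares against.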
