
Learn to Talk via Proactive Knowledge Transfer

by Qing Sun, et al.

Knowledge transfer has been applied to a wide variety of problems. For example, knowledge can be transferred between tasks (e.g., learning to handle novel situations by leveraging prior knowledge) or between agents (e.g., learning from others without direct experience). Without loss of generality, we relate knowledge transfer to KL-divergence minimization, i.e., matching the (belief) distributions of learners and teachers. This equivalence offers a new perspective on variants of the KL-divergence, based on how learners structure their interaction with teachers in order to acquire knowledge. In this paper, we provide an in-depth analysis of KL-divergence minimization in the Forward and Backward orders. The analysis shows that learners are reinforced via on-policy learning in the Backward order, whereas they are supervised in the Forward order. Moreover, because our analysis is gradient-based, it generalizes to arbitrary tasks and helps decide which order to minimize given the properties of the task. By replacing Forward with Backward in Knowledge Distillation, we observed gains of +0.7 to +1.1 BLEU on the WMT'17 De-En and IWSLT'15 Th-En machine translation tasks.
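To make the two orders concrete, the following is a minimal sketch on a toy categorical distribution. It assumes the common convention that Forward means KL(teacher || learner), an expectation under the teacher, and Backward means KL(learner || teacher), an expectation under the learner's own distribution. The distributions and the names `kl`, `teacher`, and `learner` are illustrative and do not come from the paper.

```python
import math


def kl(p, q):
    """KL(p || q) for two categorical distributions given as probability lists.

    Terms with p_i == 0 contribute nothing, by the usual 0 * log(0) = 0 convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Hypothetical next-token distributions over a three-word vocabulary.
teacher = [0.7, 0.2, 0.1]
learner = [0.4, 0.4, 0.2]

# Forward (assumed convention): weighted by the teacher's probabilities,
# so the learner is scored on outcomes the teacher produces (supervised).
forward_kl = kl(teacher, learner)

# Backward (assumed convention): weighted by the learner's probabilities,
# so the learner is scored on outcomes it produces itself (on-policy).
backward_kl = kl(learner, teacher)

print(f"forward  KL(teacher || learner) = {forward_kl:.4f}")
print(f"backward KL(learner || teacher) = {backward_kl:.4f}")
```

The two values differ because the KL-divergence is asymmetric, which is why the choice of order changes what the learner is penalized for.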

