Reward Learning for Efficient Reinforcement Learning in Extractive Document Summarisation

07/30/2019
by Yang Gao, et al.

Document summarisation can be formulated as a sequential decision-making problem, which can be solved by Reinforcement Learning (RL) algorithms. The predominant RL paradigm for summarisation learns a cross-input policy, which requires considerable time, data and parameter tuning because of the huge search spaces and the delayed rewards. Learning input-specific RL policies is a more efficient alternative, but so far it has depended on handcrafted rewards, which are difficult to design and yield poor performance. We propose RELIS, a novel RL paradigm that learns a reward function with Learning-to-Rank (L2R) algorithms at training time and uses this reward function to train an input-specific RL policy at test time. We prove that RELIS is guaranteed to generate near-optimal summaries with appropriate L2R and RL algorithms. Empirically, we evaluate our approach on extractive multi-document summarisation. We show that RELIS reduces the training time by two orders of magnitude compared to the state-of-the-art models while performing on par with them.
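To make the two-stage idea concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the pairwise logistic ranking loss, the Bernoulli REINFORCE policy, the two features (word coverage and length fraction), and all data and names (`learn_reward`, `featurise`, `train_input_specific_policy`, `PAIRS`, `DOC`) are assumptions chosen only to illustrate "learn a reward by ranking at training time, then train a policy for one input at test time".

```python
import math
import random


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def learn_reward(pairs, dim, epochs=200, lr=0.5):
    """Training time: pairwise logistic ranking (a stand-in L2R method).

    Fits weights w so that dot(w, better) > dot(w, worse) for each pair.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            # Gradient ascent on log(sigmoid(w . diff)).
            g = 1.0 - sigmoid(dot(w, diff))
            w = [wi + lr * g * di for wi, di in zip(w, diff)]
    return w


def featurise(selected, sentences):
    """Hypothetical summary features: word coverage and length fraction."""
    doc_words = {t for s in sentences for t in s.split()}
    sum_words = {t for i in selected for t in sentences[i].split()}
    return [len(sum_words) / len(doc_words), len(selected) / len(sentences)]


def train_input_specific_policy(sentences, w, episodes=2000, lr=0.5, seed=0):
    """Test time: REINFORCE on one input, using the learned reward only."""
    rng = random.Random(seed)
    theta = [0.0] * len(sentences)  # one inclusion logit per sentence
    baseline = 0.0
    for _ in range(episodes):
        probs = [sigmoid(t) for t in theta]
        action = [1 if rng.random() < p else 0 for p in probs]
        selected = [i for i, a in enumerate(action) if a]
        reward = dot(w, featurise(selected, sentences))
        advantage = reward - baseline
        baseline += 0.05 * advantage  # running-average baseline
        # For a Bernoulli policy, d log pi / d theta_i = a_i - p_i.
        theta = [t + lr * advantage * (a - p)
                 for t, a, p in zip(theta, action, probs)]
    return [i for i, t in enumerate(theta) if t > 0.0]


# Made-up preference pairs over (coverage, length) features: (better, worse).
PAIRS = [([0.8, 0.4], [0.5, 0.4]),
         ([0.6, 0.3], [0.6, 0.7]),
         ([0.9, 0.5], [0.4, 0.5])]

# Made-up input document, one string per sentence.
DOC = ["the cat sat on the mat",
       "the cat sat",
       "dogs bark loudly at night",
       "the mat"]

w = learn_reward(PAIRS, dim=2)
summary = train_input_specific_policy(DOC, w)
print("reward weights:", w)
print("selected sentence indices:", summary)
```

In this sketch the reward model is fitted once and then reused unchanged for each new input, while the policy parameters `theta` are created from scratch per document; that separation is the part that mirrors the paradigm described above, and everything else is illustrative.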

research
09/03/2019

Better Rewards Yield Better Summaries: Learning to Summarise Without References

Reinforcement Learning (RL) based document summarisation systems yield s...
research
03/16/2021

Learning to Shape Rewards using a Game of Switching Controls

Reward shaping (RS) is a powerful method in reinforcement learning (RL) ...
research
02/22/2021

Exploring Supervised and Unsupervised Rewards in Machine Translation

Reinforcement Learning (RL) is a powerful framework to address the discr...
research
06/10/2019

Neural Keyphrase Generation via Reinforcement Learning with Adaptive Rewards

Generating keyphrases that summarize the main points of a document is a ...
research
10/04/2019

Manufacturing Dispatching using Reinforcement and Transfer Learning

Efficient dispatching rule in manufacturing industry is key to ensure pr...
research
02/21/2020

Accelerating Reinforcement Learning with a Directional-Gaussian-Smoothing Evolution Strategy

Evolution strategy (ES) has been shown great promise in many challenging...
research
02/24/2016

Learning values across many orders of magnitude

Most learning algorithms are not invariant to the scale of the function ...
