Deep Reinforcement Learning for Dynamic Recommendation with Model-agnostic Counterfactual Policy Synthesis

08/10/2022
by Siyu Wang, et al.

Recent advances in recommender systems have demonstrated the potential of Reinforcement Learning (RL) to handle the dynamic evolution of interactions between users and recommender systems. However, training an optimal RL agent is generally impractical given the sparse user feedback data that is common in recommender systems. To circumvent the lack of interaction data in current RL-based recommender systems, we propose to learn a general Model-agnostic Counterfactual Synthesis Policy for counterfactual user interaction data augmentation. The counterfactual synthesis policy aims to synthesise counterfactual states while preserving the information in the original state that is relevant to the user's interests, building upon two training approaches we designed: learning with expert demonstrations and joint training. As a result, each counterfactual data point is synthesised from the current recommendation agent's interaction with the environment, so that it adapts to users' dynamic interests. We integrate the proposed policy with Deep Deterministic Policy Gradient (DDPG), Soft Actor Critic (SAC) and Twin Delayed DDPG (TD3) in an adaptive pipeline with a recommendation agent, which can generate counterfactual data to improve recommendation performance. Empirical results on both online simulation and offline datasets demonstrate the effectiveness and generalisation of our counterfactual synthesis policy and verify that it improves the performance of RL recommendation agents.
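
The augmentation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the fixed `keep_mask` (standing in for the learned synthesis policy's decision about which state dimensions carry the user's interests), the uniform resampling, and the toy `agent` and `env_step` callables are all assumptions made for illustration. The sketch only shows the data flow: one real interaction and one counterfactual interaction, both produced by the current agent, are stored for training.

```python
import random
from collections import deque
from typing import Callable, Deque, List, Tuple

State = List[float]
# (state, action, reward, next_state)
Transition = Tuple[State, int, float, State]


def synthesise_counterfactual(state: State, keep_mask: List[bool],
                              rng: random.Random) -> State:
    """Hypothetical stand-in for the learned synthesis policy.

    Dimensions flagged True in keep_mask are copied unchanged (assumed to
    be the interest-relevant part); the others are resampled uniformly.
    """
    return [x if keep else rng.uniform(-1.0, 1.0)
            for x, keep in zip(state, keep_mask)]


def collect_with_counterfactuals(
    buffer: Deque[Transition],
    state: State,
    agent: Callable[[State], int],
    env_step: Callable[[State, int], Tuple[float, State]],
    keep_mask: List[bool],
    rng: random.Random,
) -> State:
    """Store one real and one counterfactual transition; return the next state."""
    # Real interaction between the current agent and the environment.
    action = agent(state)
    reward, next_state = env_step(state, action)
    buffer.append((state, action, reward, next_state))

    # Counterfactual interaction: the same agent acts on the synthesised state.
    cf_state = synthesise_counterfactual(state, keep_mask, rng)
    cf_action = agent(cf_state)
    cf_reward, cf_next = env_step(cf_state, cf_action)
    buffer.append((cf_state, cf_action, cf_reward, cf_next))
    return next_state


if __name__ == "__main__":
    rng = random.Random(0)
    replay: Deque[Transition] = deque(maxlen=1000)

    # Toy agent and environment, purely illustrative.
    def toy_agent(s: State) -> int:
        return 0 if s[0] >= 0 else 1

    def toy_env_step(s: State, a: int) -> Tuple[float, State]:
        return (1.0 if a == 0 else 0.0), list(s)

    collect_with_counterfactuals(
        replay, [0.5, 0.2, -0.3], toy_agent, toy_env_step,
        keep_mask=[True, False, False], rng=rng,
    )
    print(len(replay))
```

Any off-policy learner that samples from a replay buffer (DDPG, SAC or TD3, as named in the abstract) could in principle consume such a buffer unchanged, which is one way to read the "model-agnostic" claim.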

research
04/01/2022

Model-agnostic Counterfactual Synthesis Policy for Interactive Recommendation

Interactive recommendation is able to learn from the interactive process...
research
09/17/2022

Intrinsically Motivated Reinforcement Learning based Recommendation with Counterfactual Data Augmentation

Deep reinforcement learning (DRL) has proven its efficiency in capt...
research
12/04/2020

Offline Meta-level Model-based Reinforcement Learning Approach for Cold-Start Recommendation

Reinforcement learning (RL) has shown great promise in optimizing long-t...
research
11/11/2020

Adaptive Neural Architectures for Recommender Systems

Deep learning has proved an effective means to capture the non-linear as...
research
09/06/2020

Information Theoretic Counterfactual Learning from Missing-Not-At-Random Feedback

Counterfactual learning for dealing with missing-not-at-random data (MNA...
research
01/30/2021

Deep Reinforcement Learning-Based Product Recommender for Online Advertising

In online advertising, recommender systems try to propose items from a l...
research
11/14/2018

Large-scale Interactive Recommendation with Tree-structured Policy Gradient

Reinforcement learning (RL) has recently been introduced to interactive ...
