A General Offline Reinforcement Learning Framework for Interactive Recommendation

02/02/2021

Teng Xiao, Donglin Wang

Abstract: This paper studies the problem of learning interactive recommender systems from logged feedback, without any exploration in online environments. We address the problem by proposing a general offline reinforcement learning framework for recommendation, which enables maximizing cumulative user rewards without online exploration. Specifically, we first introduce a probabilistic generative model for interactive recommendation, and then propose an effective inference algorithm for discrete and stochastic policy learning based on logged feedback. To perform offline learning more effectively, we propose five approaches to minimize the distribution mismatch between the logging policy and the recommendation policy: support constraints, supervised regularization, policy constraints, dual constraints and reward extrapolation. We conduct extensive experiments on two public real-world datasets, demonstrating that the proposed methods achieve superior performance over existing supervised learning and reinforcement learning methods for recommendation.
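
To make the idea of limiting distribution mismatch concrete, here is a minimal sketch of a generic KL-penalized objective for a single user state with three candidate items. It is not the algorithm from the paper: the function names, the toy probabilities, the rewards and the `beta` weight are all assumptions chosen for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same items."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def constrained_objective(policy, logging_policy, rewards, beta=1.0):
    """Expected reward minus a KL penalty toward the logging policy.

    Hypothetical illustration: beta trades off reward against staying
    close to the policy that produced the logged data.
    """
    expected_reward = sum(p * r for p, r in zip(policy, rewards))
    return expected_reward - beta * kl_divergence(policy, logging_policy)


# Toy numbers (assumed): the logging policy mostly showed item 0,
# while item 2 has the highest estimated reward.
logging_policy = [0.7, 0.2, 0.1]
rewards = [0.1, 0.5, 1.0]

greedy = [0.0, 0.0, 1.0]    # puts all mass on the rarely logged item
cautious = [0.4, 0.3, 0.3]  # stays closer to the logged behaviour

for name, policy in [("greedy", greedy), ("cautious", cautious)]:
    score = constrained_objective(policy, logging_policy, rewards, beta=1.0)
    print(name, round(score, 4))
```

With the penalty switched on, a policy that strays far from the logged behaviour is scored lower even when its estimated reward is higher. With `beta=0` the objective reduces to plain expected reward.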

The video of this talk cannot be embedded. You can watch it here:

https://slideslive.com/38949144

The talk and the corresponding paper were published at the AAAI 2021 virtual conference.

Similar Papers

18/07/2021

PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning

Angelos Filos, Clare Lyle, Yarin Gal, Sergey Levine, Natasha Jaques, Gregory Farquhar

Keywords: Reinforcement Learning and Planning, Deep RL

15:18

26/08/2020

Truly Batch Model-Free Inverse Reinforcement Learning about Multiple Intentions

Giorgia Ramponi, Amarildo Likmeta, Alberto Maria Metelli, Andrea Tirinzoni, Marcello Restelli

9:41

25/07/2020

A deep recurrent survival model for unbiased ranking

Jiarui Jin, Yuchen Fang, Weinan Zhang, Kan Ren, Guorui Zhou, Jian Xu, Yong Yu, Jun Wang, Xiaoqiang Zhu, Kun Gai

Keywords: cascade model, unbiased learning-to-rank, position bias

11:32

25/07/2020

Learning to transfer graph embeddings for inductive graph based recommendation

Le Wu, Yonghui Yang, Lei Chen, Defu Lian, Richang Hong, Meng Wang

Keywords: graph neural network, content based recommendation, inductive graph learning

15:15

06/12/2021

Online Selective Classification with Limited Feedback

Aditya Gangrade, Anil Kag, Ashok Cutkosky, Venkatesh Saligrama

Keywords: machine learning, online learning

15:14

02/02/2021

Graph Heterogeneous Multi-Relational Recommendation

Chong Chen, Weizhi Ma, Min Zhang, Zhaowei Wang, Xiuqiang He, Chenyang Wang, Yiqun Liu, Shaoping Ma

13:49

14/09/2020

An algorithmic framework for decentralised matrix factorisation

Erika Duriakova, Weipeng Huang, Elias Tragos, Aonghus Lawlor, Barry Smyth, James Geraci, Neil Hurley

Keywords: recommender systems, distributed learning, decentralised matrix factorisation, latent factor models, matrix factorisation, communication efficiency, convergence proof

13:30

23/08/2020

Joint policy-value learning for recommendation

Olivier Jeunen, David Rohde, Flavian Vasile, Martin Bompaire

Keywords: bandit feedback, counterfactual learning, policy learning

12:15

22/09/2020

Exploring clustering of bandits for online recommendation system

Liu Yang, Bo Liu, Leyu Lin, Feng Xia, Kai Chen, Qiang Yang

Keywords: online learning, cluster-of-bandit, recommendation system

2:57

02/02/2021

Learning from eXtreme Bandit Feedback

Romain Lopez, Inderjit S. Dhillon, Michael I. Jordan

19:29

22/09/2020

Long-tail session-based recommendation

Siyi Liu, Yujia Zheng

Keywords: Neural network, Session-based recommendation, Long-tail recommendation

2:19

18/07/2021

Message Passing Adaptive Resonance Theory for Online Active Semi-supervised Learning

Taehyeong Kim, Injune Hwang, Hyundo Lee, Hyunseo Kim, Won-Seok Choi, Joseph Lim, Byoung-Tak Zhang

Keywords: Algorithms, Active Learning

4:53

23/08/2020

Dual channel hypergraph collaborative filtering

Shuyi Ji, Yifan Feng, Rongrong Ji, Xibin Zhao, Wanwan Tang, Yue Gao

Keywords: dual channel, hypergraph, collaborative filtering

13:18

18/07/2021

Learning Online Algorithms with Distributional Advice

Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Ali Vakilian, Nikos Zarifis

Keywords: Algorithms

5:45

18/07/2021

Towards Open-World Recommendation: An Inductive Model-based Collaborative Filtering Approach

Qitian Wu, Hengrui Zhang, Xiaofeng Gao, Junchi Yan, Hongyuan Zha

Keywords: Applications, Recommender Systems

5:08

06/12/2020

Adversarial Counterfactual Learning and Evaluation for Recommender System

Da Xu, Chuanwei Ruan, Evren Korpeoglu, Sushant Kumar, Kannan Achan

3:10

02/06/2020

Entity Summarization with User Feedback

Qingxia Liu, Yue Chen, Gong Cheng, Evgeny Kharlamov, Junyou Li, Yuzhong Qu

21:30

06/12/2021

Active Offline Policy Selection

Ksenia Konyushova, Yutian Chen, Thomas Paine, Caglar Gulcehre, Cosmin Paduraru, Daniel J Mankowitz, Misha Denil, Nando de Freitas

Keywords: optimization, reinforcement learning and planning, active learning

12:46

25/07/2020

GAG: Global attributed graph neural network for streaming session-based recommendation

Ruihong Qiu, Hongzhi Yin, Zi Huang, Tong Chen

Keywords: session-based recommendation, streaming recommendation, graph neural networks

14:13

22/09/2020

Deep bayesian bandits: Exploring in online personalized recommendations

Dalin Guo, Sofia Ira Ktena, Pranay Kumar Myana, Ferenc Huszar, Wenzhe Shi, Alykhan Tejani, Michael Kneier, Sourav Das

Keywords: Contextual bandit, Recommender Systems, Algorithmic bias

2:59

25/07/2020

Policy-aware unbiased learning to rank for top-k rankings

Harrie Oosterhuis, Maarten Rijke

Keywords: recommendation, selection bias, counterfactual learning to rank, learning to rank, counterfactual learning, top-k ranking

20:00

02/02/2021

A User-Adaptive Layer Selection Framework for Very Deep Sequential Recommender Models

Lei Chen, Fajie Yuan, Jiaxi Yang, Xiang Ao, Chengming Li, Min Yang

18:18

13/04/2021

Experimental design for regret minimization in linear bandits

Andrew Wagenmaker, Julian Katz-Samuels, Kevin Jamieson

3:05

19/08/2021

User Retention: A Causal Approach with Triple Task Modeling

Yang Zhang, Dong Wang, Qiang Li, Yue Shen, Ziqi Liu, Xiaodong Zeng, Zhiqiang Zhang, Jinjie Gu, Derek F. Wong

Keywords: Machine Learning, Deep Learning, Applications of Supervised Learning, Recommender Systems

13:53

13/04/2021

Budgeted and non-budgeted causal bandits

Vineet Nair, Vishakha Patil, Gaurav Sinha

3:02

02/02/2021

Dual Sparse Attention Network For Session-based Recommendation

Jiahao Yuan, Zihan Song, Mingyou Sun, Xiaoling Wang, Wayne Xin Zhao

14:13

06/12/2020

Self-Supervised Relational Reasoning for Representation Learning

Massimiliano Patacchiola, Amos Storkey

2:55

03/05/2021

DeepAveragers: Offline Reinforcement Learning By Solving Derived Non-Parametric MDPs

Aayam Shrestha, Stefan Lee, Prasad Tadepalli, Alan Fern

Keywords: Planning, Offline Reinforcement Learning

10:17

02/02/2021

Learning to Recommend from Sparse Data via Generative User Feedback

Wenlin Wang

14:11

22/09/2020

Improving one-class recommendation with multi-tasking on various preference intensities

Chu-Jen Shao, Hao-Ming Fu, Pu-Jen Cheng

Keywords: implicit feedback, graph convolutional network, one-class recommendation, collaborative filtering

2:38

13/04/2021

Free-rider attacks on model aggregation in federated learning

Yann Fraboni, Richard Vidal, Marco Lorenzi

3:02

25/07/2020

Accelerated convergence for counterfactual learning to rank

Rolf Jagerman, Maarten Rijke

Keywords: unbiased learning, counterfactual learning, learning to rank

14:21

06/12/2020

A Variational Approach for Learning from Positive and Unlabeled Data

Hui Chen, Fangqing Liu, Yin Wang, Liyue Zhao, Hao Wu

3:13

25/07/2020

Disentangled graph collaborative filtering

Xiang Wang, Hongye Jin, An Zhang, Xiangnan He, Tong Xu, Tat-Seng Chua

Keywords: explainable recommendation, disentangled representation learning, collaborative filtering, graph neural networks

15:17

18/07/2021

Exponential Lower Bounds for Batch Reinforcement Learning: Batch RL can be Exponentially Harder than Online RL

Andrea Zanette

Keywords: Theory, RL, Decisions and Control Theory

16:57

25/07/2020

A general knowledge distillation framework for counterfactual recommendation via uniform data

Dugang Liu, Pengxiang Cheng, Zhenhua Dong, Xiuqiang He, Weike Pan, Zhong Ming

Keywords: counterfactual learning, uniform data, recommender systems, knowledge distillation

14:06

18/07/2021

Data-Free Knowledge Distillation for Heterogeneous Federated Learning

Zhuangdi Zhu, Junyuan Hong, Jiayu Zhou

Keywords: Algorithms

5:15

22/09/2020

Cascading hybrid bandits: Online learning to rank for relevance and diversity

Chang Li, Haoyun Feng, Maarten Rijke

Keywords: recommender system, contextual bandits, Online learning to rank, result diversification

2:51

12/07/2020

Striving for simplicity and performance in off-policy DRL: Output Normalization and Non-Uniform Sampling

Che Wang, Yanqiu Wu, Quan Vuong, Keith Ross

Keywords: Reinforcement Learning - Deep RL

14:47

06/12/2021

Conservative Data Sharing for Multi-Task Offline Reinforcement Learning

Tianhe Yu, Aviral Kumar, Yevgen Chebotar, Karol Hausman, Sergey Levine, Chelsea Finn

Keywords: reinforcement learning and planning

14:27