22/09/2020

Contextual meta-bandit for recommender systems selection

Marlesson R. O. Santana, Luckeciano C. Melo, Fernando H. F. Camargo, Bruno Brandão, Anderson Soares, Renan M. Oliveira, Sandor Caetano

Keywords: contextual bandits, hierarchical recommender systems, options framework, reinforcement learning

Abstract: Recommendation systems operate in a highly stochastic and non-stationary environment. As the amount of user-specific information varies, the users’ interests themselves also change. This combination creates a dynamic setting where a single solution will rarely be optimal unless it can keep up with these transformations. One system may perform better than others depending on the situation at hand, which makes the choice of which system to deploy even more difficult. We address these problems by using the Hierarchical Reinforcement Learning framework. Our proposed meta-bandit acts as a policy over options, where each option maps to a pre-trained, independent recommender system. This meta-bandit learns online and selects a recommender according to the context, adapting to the situation. We conducted experiments on real data and found that our approach manages to address the dynamics of the user’s changing interests. We also show that it outperforms each of the recommenders individually, as well as an ensemble of them.
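
The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an epsilon-greedy policy with one linear reward model per recommender, and the class name, the `simulate` helper, the two-feature context, and the binary reward are all hypothetical choices made for illustration. Each arm stands in for a pre-trained recommender, and the reward stands in for user feedback such as a click.

```python
import random


class EpsilonGreedyMetaBandit:
    """Hypothetical meta-bandit: one linear reward model per recommender (arm)."""

    def __init__(self, n_arms, n_features, epsilon=0.1, lr=0.1, seed=0):
        self.epsilon = epsilon  # probability of exploring a random arm
        self.lr = lr            # step size of the online update
        self.rng = random.Random(seed)
        self.weights = [[0.0] * n_features for _ in range(n_arms)]

    def predict(self, arm, context):
        # Estimated reward of delegating this context to the given recommender.
        return sum(w * x for w, x in zip(self.weights[arm], context))

    def greedy(self, context):
        # Arm with the highest estimated reward (ties go to the lowest index).
        scores = [self.predict(a, context) for a in range(len(self.weights))]
        return max(range(len(scores)), key=scores.__getitem__)

    def select(self, context):
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.weights))
        return self.greedy(context)

    def update(self, arm, context, reward):
        # One stochastic-gradient step on the squared prediction error.
        error = reward - self.predict(arm, context)
        self.weights[arm] = [
            w + self.lr * error * x for w, x in zip(self.weights[arm], context)
        ]


def simulate(steps=2000, seed=0):
    """Toy environment (an assumption): arm 0 is best in one context, arm 1 in the other."""
    rng = random.Random(seed)
    bandit = EpsilonGreedyMetaBandit(n_arms=2, n_features=2, seed=seed + 1)
    for _ in range(steps):
        context = [1.0, 0.0] if rng.random() < 0.5 else [0.0, 1.0]
        arm = bandit.select(context)
        best = 0 if context[0] == 1.0 else 1
        reward = 1.0 if arm == best else 0.0  # stand-in for a click
        bandit.update(arm, context, reward)
    return bandit
```

Calling `simulate()` returns a bandit whose `greedy` method can be queried with either toy context to see which recommender it would delegate to. A real system would replace the toy reward with logged or live user feedback and the two-feature context with actual user and session features.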

The talk and the respective paper are published at the RecSys 2020 virtual conference.

