Stochastic Bandits with Graph Feedback in Non-Stationary Environments

02/02/2021

Shiyin Lu, Yao Hu, Lijun Zhang

Abstract: We study a variant of stochastic bandits in which the feedback model is specified by a graph. In this setting, after playing an arm, one observes the rewards of not only the played arm but also the arms adjacent to it in the graph. Most existing work assumes that the reward distributions are stationary over time, an assumption that is often violated in common scenarios such as recommendation systems and online advertising. To address this limitation, we study stochastic bandits with graph feedback in non-stationary environments and propose algorithms with graph-dependent dynamic regret bounds. When the number of reward distribution changes L is known in advance, one of our algorithms achieves an Õ(√(αLT)) dynamic regret bound. We also develop an adaptive algorithm that copes with unknown L and attains an Õ(√(θLT)) dynamic regret bound. Here, α and θ are graph-dependent quantities and T is the time horizon.
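
To make the setting in the abstract concrete, here is a minimal Python sketch of the feedback model only, not of the paper's algorithms. It assumes Bernoulli rewards and a piecewise-stationary environment; the names `graph`, `segments`, `observe`, `current_means` and `dynamic_regret`, as well as the 4-arm example, are hypothetical and chosen purely for illustration.

```python
import random


def observe(graph, means, arm, rng):
    """Play `arm` and reveal rewards of the arm and its graph neighbors.

    Assumption: rewards are Bernoulli with the given means.
    """
    visible = {arm} | set(graph[arm])
    return {i: 1 if rng.random() < means[i] else 0 for i in sorted(visible)}


def current_means(segments, t):
    """Return the mean vector active at round t.

    `segments` is a list of (start_round, means) sorted by start_round;
    each new segment corresponds to one change of the reward distributions.
    """
    active = segments[0][1]
    for start, means in segments:
        if start <= t:
            active = means
    return active


def dynamic_regret(segments, plays):
    """Expected dynamic regret: sum of per-round gaps to that round's best arm."""
    total = 0.0
    for t, arm in enumerate(plays):
        means = current_means(segments, t)
        total += max(means) - means[arm]
    return total


# Hypothetical example: arm 0 is adjacent to arms 1 and 2; arm 3 is isolated.
graph = {0: [1, 2], 1: [0], 2: [0], 3: []}
# The reward means change once, at round 50 (so L = 1 in this toy case).
segments = [(0, [0.9, 0.1, 0.1, 0.5]), (50, [0.1, 0.1, 0.9, 0.5])]

rng = random.Random(0)
feedback = observe(graph, current_means(segments, 10), arm=0, rng=rng)
print(sorted(feedback))  # indices of the arms whose rewards were revealed

# A player that never leaves arm 0 pays a gap of 0.8 per round after the change.
print(dynamic_regret(segments, [0] * 100))
```

Playing arm 0 reveals the rewards of arms 0, 1 and 2 in a single round, whereas playing the isolated arm 3 reveals only its own reward. The regret helper compares each play against the best arm of that round rather than a single fixed arm, which is what distinguishes dynamic regret from the stationary notion.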

The video of this talk cannot be embedded. You can watch it here:

https://slideslive.com/38949063

The talk and the corresponding paper were published at the AAAI 2021 virtual conference.

Similar Papers

02/02/2021

Stochastic Graphical Bandits with Adversarial Corruptions

Shiyin Lu, Guanghui Wang, Lijun Zhang

17:05

26/04/2020

Causal Discovery with Reinforcement Learning

Shengyu Zhu, Ignavier Ng, Zhitang Chen

Keywords: causal discovery, structure learning, reinforcement learning, directed acyclic graph

12:51

02/02/2021

Adaptive Algorithms for Multi-armed Bandit with Composite and Anonymous Feedback

Siwei Wang, Haoyun Wang, Longbo Huang

19:29

06/12/2021

Recurrent Submodular Welfare and Matroid Blocking Semi-Bandits

Orestis Papadigenopoulos, Constantine Caramanis

Keywords: bandits

12:28

09/07/2020

A Closer Look at Small-loss Bounds for Bandits with Graph Feedback

Chung-Wei Lee, Haipeng Luo, Mengxiao Zhang

Keywords: Bandit problems, Online learning

14:52

13/04/2021

Contextual blocking bandits

Soumya Basu, Orestis Papadigenopoulos, Constantine Caramanis, Sanjay Shakkottai

2:47

06/12/2021

Beyond Bandit Feedback in Online Multiclass Classification

Dirk van der Hoeven, Federico Fusco, Nicolò Cesa-Bianchi

Keywords: reinforcement learning and planning, machine learning, graph learning, bandits, online learning

13:14

02/02/2021

Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games

Gabriele Farina, Tuomas Sandholm

17:09

02/02/2021

Adversarial Linear Contextual Bandits with Graph-Structured Side Observations

Lingda Wang, Bingcong Li, Huozhi Zhou, Georgios B. Giannakis, Lav R. Varshney, Zhizhen Zhao

14:14

02/02/2021

Reinforcement Learning with Trajectory Feedback

Yonathan Efroni, Nadav Merlis, Shie Mannor

14:17

02/02/2021

Projection-free Online Learning in Dynamic Environments

Yuanyu Wan, Bo Xue, Lijun Zhang

15:41

06/12/2020

Stage-wise Conservative Linear Bandits

Ahmadreza Moradipari, Christos Thrampoulidis, Mahnoosh Alizadeh

3:18

03/05/2021

Blending MPC & Value Function Approximation for Efficient Reinforcement Learning

Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots

Keywords: reinforcement learning, model-predictive control

5:09

12/07/2020

Exploration Through Bias: Revisiting Biased Maximum Likelihood Estimation in Stochastic Multi-Armed Bandits

Xi Liu, Ping-Chun Hsieh, Yu Heng Hung, Anirban Bhattacharya, P. Kumar

Keywords: Online Learning, Active Learning, and Bandits

14:46

06/12/2020

Stateful Posted Pricing with Vanishing Regret via Dynamic Deterministic Markov Decision Processes

Yuval Emek, Ron Lavi, Rad Niazadeh, Yangguang Shi

3:10

06/12/2020

Delay and Cooperation in Nonstochastic Linear Bandits

Shinji Ito, Daisuke Hatano, Hanna Sumita, Kei Takemura, Takuro Fukunaga, Naonori Kakimura, Ken-Ichi Kawarabayashi

3:19

06/12/2020

Reinforcement Learning with Feedback Graphs

Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

3:22

04/08/2021

Online Markov Decision Processes with Aggregate Bandit Feedback

Alon Cohen, Haim Kaplan, Tomer Koren, Yishay Mansour

13:07

19/08/2021

Epsilon Best Arm Identification in Spectral Bandits

Tomáš Kocák, Aurélien Garivier

Keywords: Machine Learning, Learning Theory, Online Learning

9:20

13/04/2021

Tracking regret bounds for online submodular optimization

Tatsuya Matsuoka, Shinji Ito, Naoto Ohsaka

2:10

02/02/2021

Policy Optimization as Online Learning with Mediator Feedback

Alberto Maria Metelli, Matteo Papini, Pierluca D'Oro, Marcello Restelli

16:44

13/04/2021

Budgeted and non-budgeted causal bandits

Vineet Nair, Vishakha Patil, Gaurav Sinha

3:02

06/12/2021

Cooperative Stochastic Bandits with Asynchronous Agents and Constrained Feedback

Lin Yang, Yu-Zhen Janice Chen, Stephen Pasteris, Mohammad Hajiesmaili, John C. S. Lui, Don Towsley

Keywords: bandits

12:07

12/07/2020

Learning with Good Feature Representations in Bandits and in RL with a Generative Model

Gellért Weisz, Tor Lattimore, Csaba Szepesvari

Keywords: Reinforcement Learning - Theory

15:20

03/08/2020

Regret Analysis of Bandit Problems with Causal Background Knowledge

Yangyi Lu, Amirhossein Meisami, Ambuj Tewari, William Yan

7:32

18/07/2021

Dynamic Planning and Learning under Recovering Rewards

David Simchi-Levi, Zeyu Zheng, Feng Zhu

Keywords: Reinforcement Learning and Planning, Bandits

4:53

09/07/2020

Tight Lower Bounds for Combinatorial Multi-Armed Bandits

Nadav Merlis, Shie Mannor

Keywords: Bandit problems, Learning with algebraic or combinatorial structure

14:00

06/12/2020

A Bandit Learning Algorithm and Applications to Auction Design

Kim Thang Nguyen

2:43

04/08/2021

Efficient Bandit Convex Optimization: Beyond Linear Losses

Arun Sai Suggala, Pradeep Ravikumar, Praneeth Netrapalli

20:29

06/12/2021

Stochastic bandits with groups of similar arms.

Fabien Pesquerel, Hassan SABER, Odalric-Ambrym Maillard

Keywords: optimization, generative model, bandits

13:22

18/07/2021

Joint Online Learning and Decision-making via Dual Mirror Descent

Alfonso Lobos Ruiz, Paul Grigas, Zheng Wen

Keywords: Deep Learning, Generative Models, Applications, Computer Vision; Applications, Visual Scene Analysis and Interpretation; Deep Learning, Adversarial Network, Algorithms, Online Learning Algorithms

5:15

02/02/2021

A Primal-Dual Online Algorithm for Online Matching Problem in Dynamic Environments

Yu-Hang Zhou, Peng Hu, Chen Liang, Huan Xu, Guangda Huzhang, Yinfu Feng, Qing Da, Xinshang Wang, An-Xiang Zeng

18:32

26/08/2020

Thompson Sampling for Linearly Constrained Bandits

Vidit Saxena, Joakim Jalden, Joseph Gonzalez

13:06

06/12/2021

Rebounding Bandits for Modeling Satiation Effects

Liu Leqi, Fatma Kilinc Karzan, Zachary Lipton, Alan Montgomery

Keywords: bandits

13:49

18/07/2021

Combinatorial Blocking Bandits with Stochastic Delays

Alexia Atsidakou, Orestis Papadigenopoulos, Soumya Basu, Constantine Caramanis, Sanjay Shakkottai

Keywords: Reinforcement Learning and Planning, Bandits

5:12

03/05/2021

Learning to Make Decisions via Submodular Regularization

Ayya Alieva, Aiden Aceves, Jialin Song, Stephen Mayo, Yisong Yue, Yuxin Chen

5:53

26/08/2020

Thresholding Graph Bandits with GrAPL

Daniel LeJeune, Gautam Dasarathy, Richard Baraniuk

6:05

18/07/2021

Model-based Reinforcement Learning for Continuous Control with Posterior Sampling

Ying Fan, Yifei Ming

Keywords: Reinforcement Learning and Planning

18:34

02/02/2021

Lenient Regret for Multi-Armed Bandits

Nadav Merlis, Shie Mannor

20:14

18/07/2021

Dynamic Balancing for Model Selection in Bandits and RL

Ashok Cutkosky, Christoph Dann, Abhimanyu Das, Claudio Gentile, Aldo Pacchiano, Manish Purohit

Keywords: Reinforcement Learning and Planning, Bandits

5:18