Regret Bounds for Batched Bandits

Abstract: We present simple algorithms for batched stochastic multi-armed bandit and batched stochastic linear bandit problems. We prove bounds for their expected regrets that improve and extend the best known regret bounds of Gao, Han, Ren, and Zhou (NeurIPS 2019), for any number of batches. In particular, our algorithms in both settings achieve the optimal expected regrets by using only a logarithmic number of batches. We also study the batched adversarial multi-armed bandit problem for the first time and provide the optimal regret, up to logarithmic factors, of any algorithm with predetermined batch sizes.

09/07/2020

Hossein Esfandiari, Amin Karbasi, Abbas Mehrabian, Vahab Mirrokni

Similar Papers

Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes

Yichun Hu, Nathan Kallus, Xiaojie Mao

Bandit problems

Neural Regret-Matching for Distributed Constraint Optimization Problems

Yanchen Deng, Runsheng Yu, Xinrun Wang, Bo An

Agent-based and Multi-agent Systems, Coordination and Cooperation, Constraint Optimization, Distributed Constraints

Thompson Sampling for Linearly Constrained Bandits

Vidit Saxena, Joakim Jalden, Joseph Gonzalez

Near-Optimal Representation Learning for Linear Bandits and Linear RL

Jiachen Hu, Xiaoyu Chen, Chi Jin, Lihong Li, Liwei Wang

Theory, Online Learning Theory

Oracle-Efficient Regret Minimization in Factored MDPs with Unknown Structure

Aviv Rosenberg, Yishay Mansour

theory, reinforcement learning and planning, online learning

Efficient Bandit Convex Optimization: Beyond Linear Losses

Arun Sai Suggala, Pradeep Ravikumar, Praneeth Netrapalli

Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously

Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang, Xiaojin Zhang

Reinforcement Learning and Planning, Bandits

Asymptotically Optimal Information-Directed Sampling

Johannes Kirschner, Tor Lattimore, Claire Vernade, Csaba Szepesvari

Bandit algorithms: Letting go of logarithmic regret for statistical robustness

Kumar Ashutosh, Jayakrishnan Nair, Anmol Kagrecha, Krishna Jagannathan

Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection

Matteo Papini, Andrea Tirinzoni, Aldo Pacchiano, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta

reinforcement learning and planning

Experimental design for regret minimization in linear bandits

Andrew Wagenmaker, Julian Katz-Samuels, Kevin Jamieson

Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits

Shinji Ito, Shuichi Hirahara, Tasuku Soma, Yuichi Yoshida

Surrogate Regret Bounds for Polyhedral Losses

Rafael Frongillo, Bo Waggoner

machine learning

Beyond $\log^2(T)$ regret for decentralized bandits in matching markets

Soumya Basu, Karthik Abinav Sankararaman, Abishek Sankararaman

Reinforcement Learning and Planning, Bandits

Robust Algorithms for TSP and Steiner Tree

Arun Ganesh, Bruce M. Maggs, Debmalya Panigrahi

Robust optimization, Steiner tree, traveling salesman problem

Delay and Cooperation in Nonstochastic Linear Bandits

Shinji Ito, Daisuke Hatano, Hanna Sumita, Kei Takemura, Takuro Fukunaga, Naonori Kakimura, Ken-Ichi Kawarabayashi

Adaptive Discretization for Adversarial Lipschitz Bandits

Chara Podimata, Alex Slivkins

Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition

Tiancheng Jin, Haipeng Luo

Adaptive Sampling for Stochastic Risk-Averse Learning

Sebastian Curi, Kfir Y. Levy, Stefanie Jegelka, Andreas Krause

OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits

Niladri Chatterji, Vidya Muthukumar, Peter Bartlett

Smooth bandit optimization: Generalization to Hölder space

Yusha Liu, Yining Wang, Aarti Singh

Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards

Aadirupa Saha, Pierre Gaillard, Michal Valko

Online Learning, Active Learning, and Bandits

Aldo Pacchiano, My Phan, Yasin Abbasi Yadkori, Anup Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvari