Budget-Constrained Bandits over General Cost and Reward Distributions

Abstract: We consider a budget-constrained bandit problem where each arm pull incurs a random cost, and yields a random reward in return. The objective is to maximize the total expected reward under a budget constraint on the total cost. The model is general in the sense that it allows correlated and potentially heavy-tailed cost-reward pairs that can take on negative values as required by many applications. We show that if moments of order $(2+\gamma)$ for some $\gamma > 0$ exist for all cost-reward pairs, $O(\log B)$ regret is achievable for a budget $B>0$. In order to achieve tight regret bounds, we propose algorithms that exploit the correlation between the cost and reward of each arm by extracting the common information via linear minimum mean-square error estimation. We prove a regret lower bound for this problem, and show that the proposed algorithms achieve tight problem-dependent regret bounds, which are optimal up to a universal constant factor in the case of jointly Gaussian cost and reward pairs.
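The variance-reduction idea behind the linear minimum mean-square error (LMMSE) step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the arm model in `sample_arm`, the function names, and all numeric parameters are hypothetical, chosen only to show that removing the part of the reward that is linearly predictable from the cost leaves a residual whose variance is no larger than that of the raw reward.

```python
import random
from statistics import fmean


def lmmse_fit(costs, rewards):
    """Fit the linear MMSE estimator of reward R from cost X.

    Returns (omega, residual_var, reward_var), where
    omega = Cov(X, R) / Var(X) and residual_var = Var(R - omega * X).
    All moments are sample moments normalized by n.
    """
    n = len(costs)
    mean_x, mean_r = fmean(costs), fmean(rewards)
    var_x = sum((x - mean_x) ** 2 for x in costs) / n
    var_r = sum((r - mean_r) ** 2 for r in rewards) / n
    cov_xr = sum((x - mean_x) * (r - mean_r)
                 for x, r in zip(costs, rewards)) / n
    omega = cov_xr / var_x
    # Variance left in the reward after subtracting the LMMSE prediction.
    residual_var = var_r - cov_xr ** 2 / var_x
    return omega, residual_var, var_r


def sample_arm(rng, n):
    """Hypothetical arm (an assumption, not from the paper):
    Gaussian cost that may be negative, reward correlated with cost."""
    costs = [rng.gauss(1.0, 0.5) for _ in range(n)]
    rewards = [0.8 * x + rng.gauss(0.0, 0.2) for x in costs]
    return costs, rewards


rng = random.Random(0)
costs, rewards = sample_arm(rng, 5000)
omega, residual_var, reward_var = lmmse_fit(costs, rewards)

# Empirical reward rate: expected reward per unit of expected cost.
reward_rate = fmean(rewards) / fmean(costs)

print(f"omega={omega:.3f}, Var(R)={reward_var:.3f}, "
      f"Var(R - omega*X)={residual_var:.3f}, rate={reward_rate:.3f}")
```

In this toy setting the residual variance is much smaller than the reward variance because the cost and reward are strongly correlated; with uncorrelated pairs the two would roughly coincide and the LMMSE step would bring no gain.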

12/07/2020

Algorithms -> Density Estimation; Algorithms -> Unsupervised Learning; Applications -> Computer Vision, Deep Learning -> Generative Models

06/12/2021

Semih Cayci, Atilla Eryilmaz, R Srikant

Similar Papers

The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation

Zhe Feng, David Parkes, Haifeng Xu

Cooperative Stochastic Bandits with Asynchronous Agents and Constrained Feedback

Lin Yang, Yu-Zhen Janice Chen, Stephen Pasteris, Mohammad Hajiesmaili, John C. S. Lui, Don Towsley

Stochastic bandits with groups of similar arms.

Fabien Pesquerel, Hassan Saber, Odalric-Ambrym Maillard

optimization, generative model, bandits

Stochastic bandits with linear constraints

Aldo Pacchiano, Mohammad Ghavamzadeh, Peter Bartlett, Heinrich Jiang

Corralling stochastic bandit algorithms

Raman Arora, Teodor Vanislavov Marinov, Mehryar Mohri

Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms

Mohsen Bayati, Nima Hamidi, Ramesh Johari, Khashayar Khosravi

Fair Algorithms for Multi-Agent Multi-Armed Bandits

Safwan Hossain, Evi Micha, Nisarg Shah

bandits, fairness

Smooth bandit optimization: Generalization to Hölder space

Yusha Liu, Yining Wang, Aarti Singh

Thompson Sampling for Linearly Constrained Bandits

Vidit Saxena, Joakim Jalden, Joseph Gonzalez

Optimal Algorithms for Stochastic Multi-Armed Bandits with Heavy Tailed Rewards

Kyungjae Lee, Hongjun Yang, Sungbin Lim, Songhwai Oh

Adaptive Algorithms for Multi-armed Bandit with Composite and Anonymous Feedback

Siwei Wang, Haoyun Wang, Longbo Huang

Lenient Regret for Multi-Armed Bandits

Nadav Merlis, Shie Mannor

Adversarial Blocking Bandits

Nicholas Bishop, Hau Chan, Debmalya Mandal, Long Tran-Thanh

Tight Lower Bounds for Combinatorial Multi-Armed Bandits

Nadav Merlis, Shie Mannor

Bandit problems, Learning with algebraic or combinatorial structure

The Symmetry between Arms and Knapsacks: A Primal-Dual Approach for Bandits with Knapsacks

Xiaocheng Li, Chunlin Sun, Yinyu Ye

Algorithms, Online Learning, Algorithms, Bandit Algorithms, Reinforcement Learning and Planning, Bandits

Finite Continuum-Armed Bandits

Algorithms -> Density Estimation; Algorithms -> Unsupervised Learning; Applications -> Computer Vision, Deep Learning -> Generative Models

Rebounding Bandits for Modeling Satiation Effects

Liu Leqi, Fatma Kilinc Karzan, Zachary Lipton, Alan Montgomery

Contextual Combinatorial Volatile Multi-armed Bandit with Adaptive Discretization

Andi Nika, Sepehr Elahi, Cem Tekin

Finite-Time Analysis of Round-Robin Kullback-Leibler Upper Confidence Bounds for Optimal Adaptive Allocation with Multiple Plays and Markovian Rewards

Bandits with many optimal arms

Rianne de Heide, James Cheshire, Pierre Ménard, Alexandra Carpentier

Dynamic Planning and Learning under Recovering Rewards

David Simchi-Levi, Zeyu Zheng, Feng Zhu

Reinforcement Learning and Planning, Bandits

Incentivized Bandit Learning with Self-Reinforcing User Preferences

Tianchen Zhou, Jia Liu, Chaosheng Dong, Jingyuan Deng

Reinforcement Learning and Planning, Bandits

A Bandit Learning Algorithm and Applications to Auction Design

Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs

Jiafan He, Dongruo Zhou, Quanquan Gu
