DART: Adaptive Accept Reject Algorithm for Non-Linear Combinatorial Bandits

Abstract: We consider the bandit problem of selecting K out of N arms at each time step. The joint reward can be a non-linear function of the rewards of the selected individual arms. Applying a multi-armed bandit algorithm directly requires choosing among all possible combinations of K arms, which makes the action space large. To simplify the problem, existing works on combinatorial bandits typically assume that the feedback is a linear function of the individual rewards. In this paper, we prove a lower bound for top-K subset selection with bandit feedback and possibly correlated rewards. We present a novel algorithm for the combinatorial setting that neither uses individual arm feedback nor requires linearity of the reward function. The algorithm also works when the rewards of individual arms are correlated. Our algorithm, aDaptive Accept RejecT (DART), sequentially finds good arms and eliminates bad arms based on confidence bounds. DART is computationally efficient and uses storage linear in N. Further, DART achieves a regret bound of Õ(K√(KNT)) for a time horizon T, which matches the lower bound for bandit feedback up to a factor of √(log 2NT). When applied to the problem of cross-selling optimization and to maximizing the mean of individual rewards, the proposed algorithm outperforms state-of-the-art algorithms. We also show that DART significantly outperforms existing methods in both linear and non-linear joint reward environments.
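The accept/reject idea in the abstract can be illustrated with a small sketch. This is not the authors' DART as specified in the paper: the random grouping, the way the joint reward is credited to each arm, the confidence radius constant, and the names `accept_reject_top_k` and `mean_reward_env` are all assumptions made for illustration. It only shows how arms might be accepted or rejected from joint feedback alone, using per-arm storage that is linear in N.

```python
import math
import random


def mean_reward_env(means, seed=0):
    """Hypothetical environment: joint reward = mean of Bernoulli arm draws."""
    rng = random.Random(seed)

    def pull(action):
        return sum(rng.random() < means[a] for a in action) / len(action)

    return pull


def accept_reject_top_k(pull, n_arms, k, horizon, seed=0):
    """Pick k of n_arms using only the joint reward returned by pull(action)."""
    rng = random.Random(seed)
    accepted, undecided = [], list(range(n_arms))
    sums = [0.0] * n_arms    # joint reward credited to each arm
    counts = [0] * n_arms    # number of plays credited to each arm
    t = 0
    # A pass may overshoot the horizon by less than one pass (simplification).
    while t < horizon and len(accepted) < k:
        slots = k - len(accepted)
        if len(undecided) <= slots:
            break  # every remaining arm is needed
        rng.shuffle(undecided)
        for start in range(0, len(undecided), slots):
            group = undecided[start:start + slots]
            # Pad a short final group with other undecided arms.
            filler = [a for a in undecided if a not in group]
            group = group + filler[:slots - len(group)]
            reward = pull(accepted + group)  # accepted arms are always played
            t += 1
            for a in group:
                sums[a] += reward
                counts[a] += 1
        mean = {a: sums[a] / counts[a] for a in undecided}
        # Assumed Hoeffding-style radius; the constant is illustrative.
        rad = {a: math.sqrt(2 * math.log(2 * n_arms * horizon) / counts[a])
               for a in undecided}
        ranked = sorted(undecided, key=mean.get, reverse=True)
        last_in, first_out = ranked[slots - 1], ranked[slots]
        for a in ranked:
            if mean[a] - rad[a] > mean[first_out] + rad[first_out]:
                accepted.append(a)      # confidently among the best
                undecided.remove(a)
            elif mean[a] + rad[a] < mean[last_in] - rad[last_in]:
                undecided.remove(a)     # confidently not among the best
    # Fill any open slots with the best empirical means among undecided arms.
    rest = sorted(undecided,
                  key=lambda a: sums[a] / max(counts[a], 1), reverse=True)
    return accepted + rest[:k - len(accepted)]


# Example with made-up arm means; arms 0 and 1 are the best pair.
pull = mean_reward_env([0.9, 0.8, 0.2, 0.15, 0.1, 0.05], seed=1)
print(sorted(accept_reject_top_k(pull, n_arms=6, k=2, horizon=20000)))
```

Because each undecided arm is grouped with uniformly random partners, its credited average still increases with its own mean, which is why ranking the credited averages is a reasonable proxy here. The sketch does not reset statistics when the accepted set changes, which a careful analysis would need to handle.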

06/12/2021

Mridul Agarwal, Vaneet Aggarwal, Abhishek Kumar Umrawal, Chris Quinn

Similar Papers

Cooperative Stochastic Bandits with Asynchronous Agents and Constrained Feedback

Lin Yang, Yu-Zhen Janice Chen, Stephen Pasteris, Mohammad Hajiesmaili, John C. S. Lui, Don Towsley

bandits

Efficient and robust algorithms for adversarial linear contextual bandits

Gergely Neu, Julia Olkhovskaya

Bandit problems, Online learning

On Regret with Multiple Best Arms

Yinglun Zhu, Robert Nowak

Contextual blocking bandits

Soumya Basu, Orestis Papadigenopoulos, Constantine Caramanis, Sanjay Shakkottai

Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms

Mohsen Bayati, Nima Hamidi, Ramesh Johari, Khashayar Khosravi

Probabilistic Sequential Shrinking: A Best Arm Identification Algorithm for Stochastic Bandits with Corruptions

Zixin Zhong, Wang Chi Cheung, Vincent Tan

Reinforcement Learning and Planning, Bandits

Adaptive Algorithms for Multi-armed Bandit with Composite and Anonymous Feedback

Siwei Wang, Haoyun Wang, Longbo Huang

Choice Bandits

Arpit Agarwal, Nicholas Johnson, Shivani Agarwal

From Finite to Countable-Armed Bandits

Anand Kalvit, Assaf Zeevi

Theory -> Control Theory

Identification of the Generalized Condorcet Winner in Multi-dueling Bandits

Björn Haddenhorst, Viktor Bengs, Eyke Hüllermeier

theory, bandits

Tight Lower Bounds for Combinatorial Multi-Armed Bandits

Nadav Merlis, Shie Mannor

Bandit problems, Learning with algebraic or combinatorial structure

Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards

Aadirupa Saha, Pierre Gaillard, Michal Valko

Online Learning, Active Learning, and Bandits

Structure Adaptive Algorithms for Stochastic Bandits

Rémy Degenne, Han Shao, Wouter Koolen

Online Learning, Active Learning, and Bandits

Disposable Linear Bandits for Online Recommendations

Melda Korkut, Andrew Li

Stochastic bandits with groups of similar arms.

Fabien Pesquerel, Hassan SABER, Odalric-Ambrym Maillard

optimization, generative model, bandits

Near-Optimal Regret Bounds for Contextual Combinatorial Semi-Bandits with Linear Payoff Functions

Kei Takemura, Shinji Ito, Daisuke Hatano, Hanna Sumita, Takuro Fukunaga, Naonori Kakimura, Ken-ichi Kawarabayashi

Adversarial Combinatorial Bandits with General Non-linear Reward Functions

Yanjun Han, Yining Wang, Xi Chen

Applications, Computer Vision, Applications, Computational Photography, Theory, Online Learning Theory

Adversarial Blocking Bandits

Nicholas Bishop, Hau Chan, Debmalya Mandal, Long Tran-Thanh

Tsallis-INF for Decoupled Exploration and Exploitation in Multi-armed Bandits

Chloé Rouyer, Yevgeny Seldin

Bandit problems, Online learning

Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium

Qiaomin Xie, Yudong Chen, Zhaoran Wang, Zhuoran Yang

Reinforcement learning, Planning and control

Lenient Regret for Multi-Armed Bandits

Nadav Merlis, Shie Mannor

On Reward-Free Reinforcement Learning with Linear Function Approximation
