(Almost) Free Incentivized Exploration from Decentralized Learning Agents

Abstract: Incentivized exploration in multi-armed bandits (MAB) has witnessed increasing interests and many progresses in recent years, where a principal offers bonuses to agents to do explorations on her behalf. However, almost all existing studies are confined to temporary myopic agents. In this work, we break this barrier and study incentivized exploration with multiple and long-term strategic agents, who have more complicated behaviors that often appear in real-world applications. An important observation of this work is that strategic agents' intrinsic needs of learning benefit (instead of harming) the principal's explorations by providing "free pulls". Moreover, it turns out that increasing the population of agents significantly lowers the principal's burden of incentivizing. The key and somewhat surprising insight revealed from our results is that when there are sufficiently many learning agents involved, the exploration process of the principal can be (almost) free. Our main results are built upon three novel components which may be of independent interest: (1) a simple yet provably effective incentive-provision strategy; (2) a carefully crafted best arm identification algorithm for rewards aggregated under unequal confidences; (3) a high-probability finite-time lower bound of UCB algorithms. Experimental results are provided to complement the theoretical analysis.

26/04/2020

(Almost) Free Incentivized Exploration from Decentralized Learning Agents

Chengshuai Shi, Haifeng Xu, Wei Xiong, Cong Shen

Comments

Similar Papers

RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments

Roberta Raileanu, Tim Rocktäschel

Keywords Abstract Paper

reinforcement learning, exploration, curiosity

Don’t Do What Doesn’t Matter: Intrinsic Motivation with Action Usefulness

Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin

Keywords Abstract Paper

Machine Learning, Reinforcement Learning, Deep Reinforcement Learning

Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments

Daochen Zha, Wenye Ma, Lei Yuan and Xia Hu, Ji Liu

Keywords Abstract Paper

Exploration, Reinforcement Learning, Self-Imitation, Generalization of Reinforcement Learning

Episodic Multi-agent Reinforcement Learning with Curiosity-driven Exploration

Lulu Zheng, Jiarui Chen, Jianhao Wang and Jiamin He, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao, Chongjie Zhang

Keywords Abstract Paper

reinforcement learning and planning

Adversarial Feature Hallucination Networks for Few-Shot Learning

Kai Li, Yulun Zhang, Kunpeng Li, Yun Fu

Keywords Abstract Paper

few-shot learning, data augmentation, feature hallucination, generative adversarial networks

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

Siddharth Reddy, Anca D. Dragan, Sergey Levine

Keywords Abstract Paper

Imitation Learning, Reinforcement Learning

MADE: Exploration via Maximizing Deviation from Explored Regions

Tianjun Zhang, Paria Rashidinejad, Jiantao Jiao and Yuandong Tian, Joseph Gonzalez, Stuart Russell

Keywords Abstract Paper

reinforcement learning and planning

MOPO: Model-based Offline Policy Optimization

Tianhe (Kevin) Yu, Garrett Thomas, Lantao Yu and Stefano Ermon, James Zou, Sergey Levine, Chelsea Finn, Tengyu Ma

Keywords Abstract Paper

MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration

Jin Zhang, Jianhao Wang, Hao Hu and Tong Chen, Yingfeng Chen, Changjie Fan, Chongjie Zhang

Keywords Abstract Paper

Algorithms, Multitask, Transfer, and Meta Learning

Tactical Optimism and Pessimism for Deep Reinforcement Learning

Ted Moskovitz, Jack Parker-Holder, Aldo Pacchiano and Michael Arbel, Michael Jordan

Keywords Abstract Paper

reinforcement learning and planning, bandits

Fast active learning for pure exploration in reinforcement learning

Pierre MENARD, Omar Darwiche Domingues, Anders Jonsson and Emilie Kaufmann, Edouard Leurent, Michal Valko

Keywords Abstract Paper

Theory, RL, Decisions and Control Theory

Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices

Evan Liu, Aditi Raghunathan, Percy Liang, Chelsea Finn

Keywords Abstract Paper

Reinforcement Learning and Planning, Deep RL

DORB: Dynamically Optimizing Multiple Rewards with Bandits

Ramakanth Pasunuru, Han Guo, Mohit Bansal

Keywords Abstract Paper

language tasks, optimization rewards, nlg tasks, question generation

Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning

Sebastian Curi, Felix Berkenkamp, Andreas Krause

Keywords Abstract Paper

Latent Bandits Revisited

Joey Hong, Branislav Kveton, Manzil Zaheer and Yinlam Chow, Amr Ahmed, Craig Boutilier

Keywords Abstract Paper

Meta-RetinaNet for Few-shot Object Detection

Shaoqi Li, Wenfeng Song, Shuai Li and Aimin Hao, Hong Qin

Keywords Abstract Paper

Few shot, object detection, meta-learning, Meta-RetinaNet, Balanced Loss, coefficient vector

IQ-Learn: Inverse soft-Q Learning for Imitation

Divyansh Garg, Shuvam Chakraborty, Chris Cundy and Jiaming Song, Stefano Ermon

Keywords Abstract Paper

optimization, reinforcement learning and planning, adversarial robustness and security

Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration

Andrea Zanette, Alessandro Lazaric, Mykel J Kochenderfer, Emma Brunskill

Keywords Abstract Paper

Effective Diversity in Population Based Reinforcement Learning

Jack Parker-Holder, Aldo Pacchiano, Krzysztof M Choromanski, Stephen J Roberts

Keywords Abstract Paper

Corruption-robust exploration in episodic reinforcement learning

Thodoris Lykouris, Max Simchowitz, Alex Slivkins, Wen Sun

Keywords Abstract Paper

Data-Efficient Reinforcement Learning with Self-Predictive Representations

Max Schwarzer, Ankesh Anand, Rishab Goel and R Devon Hjelm, Aaron Courville, Philip Bachman

Keywords Paper

Keywords Paper

Daochen Zha, Wenye Ma, Lei Yuan and
Xia Hu, Ji Liu

Keywords Paper

Lulu Zheng, Jiarui Chen, Jianhao Wang and
Jiamin He, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao, Chongjie Zhang

Keywords Paper

Keywords Paper

Keywords Paper

Tianjun Zhang, Paria Rashidinejad, Jiantao Jiao and
Yuandong Tian, Joseph Gonzalez, Stuart Russell

Keywords Paper

Tianhe (Kevin) Yu, Garrett Thomas, Lantao Yu and
Stefano Ermon, James Zou, Sergey Levine, Chelsea Finn, Tengyu Ma

Keywords Paper

Jin Zhang, Jianhao Wang, Hao Hu and
Tong Chen, Yingfeng Chen, Changjie Fan, Chongjie Zhang

Keywords Paper

Ted Moskovitz, Jack Parker-Holder, Aldo Pacchiano and
Michael Arbel, Michael Jordan

Keywords Paper

Pierre MENARD, Omar Darwiche Domingues, Anders Jonsson and
Emilie Kaufmann, Edouard Leurent, Michal Valko

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Joey Hong, Branislav Kveton, Manzil Zaheer and
Yinlam Chow, Amr Ahmed, Craig Boutilier

Keywords Paper

Shaoqi Li, Wenfeng Song, Shuai Li and
Aimin Hao, Hong Qin

Keywords Paper

Divyansh Garg, Shuvam Chakraborty, Chris Cundy and
Jiaming Song, Stefano Ermon

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Max Schwarzer, Ankesh Anand, Rishab Goel and
R Devon Hjelm, Aaron Courville, Philip Bachman

Keywords Paper

Zhaorong Wang, Meng Wang, Jingqi Zhang and
Yingfeng Chen, Chongjie Zhang

Keywords Paper

Keywords Paper

Keywords Paper

Syed Zawad, Ahsan Ali, Pin-Yu Chen and
Ali Anwar, Yi Zhou, Nathalie Baracaldo, Yuan Tian, Feng Yan

Keywords Paper

Nicolas Papernot, Abhradeep Thakurta, Shuang Song and
Steve Chien, Úlfar Erlingsson

Keywords Paper

Keywords Paper

Xiaosen Wang, Jiadong Lin, Han Hu and
Jingdong Wang, Kun He

Keywords Paper

Yijie Guo, Jongwook Choi, Marcin Moczulski and
Shengyu Feng, Samy Bengio, Mohammad Norouzi, Honglak Lee

Keywords Paper

Keywords Paper

Andrew Jaegle, Yury Sulsky, Arun Ahuja and
Jake Bruce, Rob Fergus, Greg Wayne

Keywords Paper

Ling Pan, Tabish Rashid, Bei Peng and
Longbo Huang, Shimon Whiteson

Keywords Paper

Keywords Paper

Hanbo Zhang, Site Bai, Xuguang Lan and
David Hsu, Nanning Zheng

Keywords Paper

Keywords Paper

Keywords Paper