Bounded Rationality in Las Vegas: Probabilistic Finite Automata Play Multi-Armed Bandits

Abstract: While traditional economics assumes that humans are fully rational agents who always maximize their expected utility, in practice, we constantly observe apparently irrational behavior. One explanation is that people have limited computational power, so that they are, quite rationally, making the best decisions they can, given their computational limitations. To test this hypothesis, we consider the multi-armed bandit (MAB) problem. We examine a simple strategy for playing an MAB that can be implemented easily by a probabilistic finite automaton (PFA). Roughly speaking, the PFA sets certain expectations, and plays an arm as long as it meets them. If the PFA has sufficiently many states, it performs near-optimally. Its performance degrades gracefully as the number of states decreases. Moreover, the PFA acts in a "human-like" way, exhibiting a number of standard human biases, like an optimism bias and a negativity bias.
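The strategy the abstract outlines, playing an arm for as long as it meets expectations, can be illustrated with a small finite-memory simulation. The sketch below is not the authors' PFA: it swaps in a Tsetlin-style counter automaton as a stand-in. The function name `play_bandit`, the parameters `num_states`, `horizon`, and `seed`, and the Bernoulli reward model are all assumptions made for illustration.

```python
import random


def play_bandit(arm_probs, num_states, horizon, seed=0):
    """Simulate a counter-based finite automaton on Bernoulli arms.

    Illustrative stand-in only (hypothetical names, not the paper's PFA).
    Returns (total_reward, pulls_per_arm).
    """
    k = len(arm_probs)
    if k < 2:
        raise ValueError("need at least two arms to switch between")
    if num_states < 1:
        raise ValueError("num_states must be positive")

    rng = random.Random(seed)
    arm = rng.randrange(k)   # start on a uniformly random arm
    level = 0                # confidence counter, kept in 0..num_states-1
    total = 0
    pulls = [0] * k

    for _ in range(horizon):
        # Assumed reward model: arm i pays 1 with probability arm_probs[i].
        reward = 1 if rng.random() < arm_probs[arm] else 0
        total += reward
        pulls[arm] += 1

        if reward:
            # The arm met expectations: raise confidence, capped by memory.
            level = min(level + 1, num_states - 1)
        elif level > 0:
            # A disappointment uses up one unit of accumulated confidence.
            level -= 1
        else:
            # Confidence exhausted: switch to a different arm at random.
            arm = rng.choice([a for a in range(k) if a != arm])
            level = 0

    return total, pulls
```

For example, `play_bandit([0.9, 0.1], num_states=8, horizon=2000)` returns the total reward and the per-arm pull counts. Varying `num_states` gives a rough feel for how memory size affects play in this toy model, which may differ from the behavior of the automaton analyzed in the paper.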

06/12/2021

Xinming Liu, Joseph Halpern

Similar Papers

Equilibrium Refinement for the Age of Machines: The One-Sided Quasi-Perfect Equilibrium

Gabriele Farina, Tuomas Sandholm

Obviously Strategyproof Single-Minded Combinatorial Auctions

Bart de Keijzer, Maria Kyropoulou, Carmine Ventre

Keywords: OSP Mechanisms, Extensive-form Mechanisms, Single-minded Combinatorial Auctions, Greedy algorithms

Adaptive Exploration in Linear Contextual Bandit

Botao Hao, Tor Lattimore, Csaba Szepesvari

Computing Quantal Stackelberg Equilibrium in Extensive-Form Games

Jakub Černý, Viliam Lisý, Branislav Bošanský, Bo An

Alternative Microfoundations for Strategic Classification

Meena Jagadeesan, Celestine Mendler-Dünner, Moritz Hardt

Keywords: Theory, Game Theory and Computational Economics

Contextual Reserve Price Optimization in Auctions via Mixed Integer Programming

Joey Huchette, Haihao Lu, Hossein Esfandiari, Vahab Mirrokni

Complexity and Algorithms for Exploiting Quantal Opponents in Large Two-Player Games

David Milec, Jakub Černý, Viliam Lisý, Bo An

Thompson Sampling Algorithms for Mean-Variance Bandits

Qiuyu Zhu, Vincent Tan

Keywords: Online Learning, Active Learning, and Bandits

Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously

Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang, Xiaojin Zhang

Keywords: Reinforcement Learning and Planning, Bandits

Diverse rule sets

Guangyi Zhang, Aristides Gionis

Keywords: sampling, classifier, pattern mining, rule learning, diversification, rule sets

Lenient Regret for Multi-Armed Bandits

Nadav Merlis, Shie Mannor

The route to chaos in routing games: When is price of anarchy too optimistic?

Thiparat Chotibut, Fryderyk Falniowski, Michał Misiurewicz, Georgios Piliouras

Rebounding Bandits for Modeling Satiation Effects

Liu Leqi, Fatma Kilinc Karzan, Zachary Lipton, Alan Montgomery

Keywords: bandits

Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards

Aadirupa Saha, Pierre Gaillard, Michal Valko

Keywords: Online Learning, Active Learning, and Bandits

Pessimism About Unknown Unknowns Inspires Conservatism

Michael K Cohen, Marcus Hutter

Keywords: Reinforcement learning, Bayesian methods

Linear models are robust optimal under strategic behavior

Wei Tang, Chien-Ju Ho, Yang Liu

Stable Adversarial Learning under Distributional Shifts

Jiashuo Liu, Zheyan Shen, Peng Cui, Linjun Zhou, Kun Kuang, Bo Li, Yishi Lin

Adaptive Algorithms for Multi-armed Bandit with Composite and Anonymous Feedback

Siwei Wang, Haoyun Wang, Longbo Huang

Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning

Sebastian Curi, Felix Berkenkamp, Andreas Krause

Robustness Guarantees for Mode Estimation with an Application to Bandits

Aldo Pacchiano, Heinrich Jiang, Michael I. Jordan

Decisions, Counterfactual Explanations and Strategic Behavior

Stratis Tsirtsis, Manuel Gomez Rodriguez

Cooperative Stochastic Bandits with Asynchronous Agents and Constrained Feedback

Lin Yang, Yu-Zhen Janice Chen, Stephen Pasteris, Mohammad Hajiesmaili, John C. S. Lui, Don Towsley

Keywords: bandits

Optimizing Long-term Social Welfare in Recommender Systems: A Constrained Matching Approach

Martin Mladenov, Elliot Creager, Omer Ben-Porat, Kevin Swersky, Richard Zemel, Craig Boutilier
