Bandit Phase Retrieval

Abstract: We study a bandit version of phase retrieval where the learner chooses actions $(A_t)_{t=1}^n$ in the $d$-dimensional unit ball and the expected reward is $\langle A_t, \theta_\star \rangle^2$, with $\theta_\star \in \mathbb R^d$ an unknown parameter vector. We prove that the minimax cumulative regret in this problem is $\smash{\tilde \Theta(d \sqrt{n})}$: our upper bound matches known lower bounds up to logarithmic factors and improves on the best known upper bound by a factor of $\smash{\sqrt{d}}$. We also show that the minimax simple regret is $\smash{\tilde \Theta(d / \sqrt{n})}$ and that this rate is only achievable by an adaptive algorithm. Our analysis shows that an apparently convincing heuristic for guessing lower bounds can be misleading, and that uniform bounds on the information ratio for information-directed sampling (Russo and Van Roy, 2014) are not sufficient for optimal regret.
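
To make the problem setup concrete, here is a minimal Python sketch of the environment described in the abstract, together with the two regret notions it mentions. It is not the authors' algorithm. The class and function names, the Gaussian noise model, and the uniformly random baseline policy are assumptions made for illustration only. The only fact used is that the best mean reward over the unit ball is $\|\theta_\star\|^2$, by the Cauchy-Schwarz inequality.

```python
import math
import random


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def random_unit_vector(d, rng):
    """Draw a direction uniformly from the unit sphere in R^d."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(dot(g, g))
    return [x / norm for x in g]


class PhaseRetrievalBandit:
    """Hypothetical environment: mean reward <a, theta>^2 plus noise.

    The Gaussian noise model is an assumption of this sketch.
    """

    def __init__(self, theta, noise_std=1.0, seed=0):
        self.theta = list(theta)
        self.noise_std = noise_std
        self.rng = random.Random(seed)

    def mean_reward(self, action):
        return dot(action, self.theta) ** 2

    def optimal_mean(self):
        # max over the unit ball of <a, theta>^2 equals ||theta||^2.
        return dot(self.theta, self.theta)

    def pull(self, action):
        if dot(action, action) > 1.0 + 1e-9:
            raise ValueError("action must lie in the unit ball")
        return self.mean_reward(action) + self.rng.gauss(0.0, self.noise_std)


def cumulative_regret(env, actions):
    """Sum over rounds of (best mean reward - mean reward of the action)."""
    return sum(env.optimal_mean() - env.mean_reward(a) for a in actions)


def simple_regret(env, recommendation):
    """Gap between the best mean reward and that of one recommended action."""
    return env.optimal_mean() - env.mean_reward(recommendation)


def random_policy_regret(d, n, seed=0):
    """Cumulative regret of a non-adaptive, uniformly random baseline."""
    rng = random.Random(seed)
    env = PhaseRetrievalBandit(random_unit_vector(d, rng), seed=seed)
    actions = [random_unit_vector(d, rng) for _ in range(n)]
    for a in actions:
        env.pull(a)  # rewards are observed but ignored by this baseline
    return cumulative_regret(env, actions)
```

Because the reward depends on the action only through the squared inner product, $\theta_\star$ and $-\theta_\star$ are indistinguishable, so an action and its negation incur the same regret. Calling `random_policy_regret(d, n)` gives a baseline that grows linearly in `n`, which is the behaviour the sublinear rates in the abstract improve on.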

18/07/2021

Tor Lattimore, Botao Hao

Similar Papers

Optimal regret algorithm for Pseudo-1d Bandit Convex Optimization

Aadirupa Saha, Nagarajan Natarajan, Praneeth Netrapalli, Prateek Jain

Optimization, Convex Optimization

Adversarial Dueling Bandits

Aadirupa Saha, Tomer Koren, Yishay Mansour

Algorithms, Ranking and Preference Learning

Doubly Robust Thompson Sampling with Linear Payoffs

Wonyoung Kim, Gi-Soo Kim, Myunghee Cho Paik

Efficient Bandit Convex Optimization: Beyond Linear Losses

Arun Sai Suggala, Pradeep Ravikumar, Praneeth Netrapalli

UCB-based Algorithms for Multinomial Logistic Regression Bandits

Sanae Amani, Christos Thrampoulidis

Low-rank generalized linear bandit problems

Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

Multinomial Logit Contextual Bandits: Provable Optimality and Practicality

Min-hwan Oh, Garud Iyengar

Combinatorial Gaussian process bandits with probabilistically triggered arms

Ilker Demirel, Cem Tekin

Optimal Dynamic Regret in Exp-Concave Online Learning

Dheeraj Baby, Yu-Xiang Wang

Taking a hint: How to leverage loss predictors in contextual bandits?

Chen-Yu Wei, Haipeng Luo, Alekh Agarwal

Bandit problems, Online learning

Optimal Algorithms for Stochastic Contextual Preference Bandits

Stage-wise Conservative Linear Bandits

Ahmadreza Moradipari, Christos Thrampoulidis, Mahnoosh Alizadeh

Instance-wise minimax-optimal algorithms for logistic bandits

Marc Abeille, Louis Faury, Clement Calauzenes

Near-Optimal Representation Learning for Linear Bandits and Linear RL

Jiachen Hu, Xiaoyu Chen, Chi Jin, Lihong Li, Liwei Wang

Theory, Online Learning Theory

Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP

Zihan Zhang, Jiaqi Yang, Xiangyang Ji, Simon Du

theory, reinforcement learning and planning, bandits

What You See May Not Be What You Get: UCB Bandit Algorithms Robust to $\varepsilon$-Contamination

Laura Niss, Ambuj Tewari

Adversarial Combinatorial Bandits with General Non-linear Reward Functions

Yanjun Han, Yining Wang, Xi Chen

Applications, Computer Vision, Applications, Computational Photography, Theory, Online Learning Theory

Cooperative Stochastic Bandits with Asynchronous Agents and Constrained Feedback

Lin Yang, Yu-Zhen Janice Chen, Stephen Pasteris, Mohammad Hajiesmaili, John C. S. Lui, Don Towsley

Combinatorial Pure Exploration with Full-Bandit or Partial Linear Feedback

Yihan Du, Yuko Kuroki, Wei Chen

Accommodating Picky Customers: Regret Bound and Exploration Complexity for Multi-Objective Reinforcement Learning

Jingfeng Wu, Vladimir Braverman, Lin Yang

theory, reinforcement learning and planning

Naive Exploration is Optimal for Online LQR

Max Simchowitz, Dylan Foster

Impact of Representation Learning in Linear Bandits

Jiaqi Yang, Wei Hu, Jason Lee, Simon Du

multi-task learning, representation learning, linear bandits

Smooth bandit optimization: Generalization to Hölder space

Yusha Liu, Yining Wang, Aarti Singh
