Making Non-Stochastic Control (Almost) as Easy as Stochastic

Abstract: Recent literature has made much progress in understanding \emph{online LQR}: a modern learning-theoretic take on the classical control problem where a learner attempts to optimally control an unknown linear dynamical system with fully observed state, perturbed by i.i.d. Gaussian noise. \iftoggle{nips}{The}{It is now understood that the} optimal regret over time horizon $T$ against the optimal control law scales as $\widetilde{\Theta}(\sqrt{T})$. In this paper, we show that the same regret rate (against a suitable benchmark) is attainable even in the considerably more general non-stochastic control model, where the system is driven by \emph{arbitrary adversarial} noise \citep{agarwal2019online}. We attain the optimal $\widetilde{\mathcal{O}}(\sqrt{T})$ regret when the dynamics are unknown to the learner, and $\mathrm{poly}(\log T)$ regret when known, provided that the cost functions are strongly convex (as in LQR). Our algorithm is based on a novel variant of online Newton step \citep{hazan2007logarithmic}, which adapts to the geometry induced by adversarial disturbances, and our analysis hinges on generic regret bounds for certain structured losses in the OCO-with-memory framework \citep{anava2015online}.

12/07/2020

Making Non-Stochastic Control (Almost) as Easy as Stochastic

Comments

Similar Papers

Naive Exploration is Optimal for Online LQR

Max Simchowitz, Dylan Foster

Keywords Abstract Paper

Improper Learning for Non-Stochastic Control

Max Simchowitz, Karan Singh, Elad Hazan

Keywords Abstract Paper

Reinforcement learning, Online learning, Planning and control

Adaptive Online Estimation of Piecewise Polynomial Trends

Dheeraj Baby, Yu-Xiang Wang

Keywords Abstract Paper

Efficient Bandit Convex Optimization: Beyond Linear Losses

Arun Sai Suggala, Pradeep Ravikumar, Praneeth Netrapalli

Keywords Abstract Paper

Dynamic Regret of Convex and Smooth Functions

Peng Zhao, Yu-Jie Zhang, Lijun Zhang, Zhi-Hua Zhou

Keywords Abstract Paper

Optimal Dynamic Regret in Exp-Concave Online Learning

Dheeraj Baby, Yu-Xiang Wang

Keywords Abstract Paper

Taking a hint: How to leverage loss predictors in contextual bandits?

Chen-Yu Wei, Haipeng Luo, Alekh Agarwal

Keywords Abstract Paper

Bandit problems, Online learning

Optimal regret algorithm for Pseudo-1d Bandit Convex Optimization

Aadirupa Saha, Nagarajan Natarajan, Praneeth Netrapalli, Prateek Jain

Keywords Abstract Paper

Optimization, Convex Optimization

Learning piecewise Lipschitz functions in changing environments

Dravyansh Sharma, Maria-Florina Balcan, Travis Dick

Keywords Abstract Paper

Regret Bounds for Online Kernel Selection in Continuous Kernel Space

Xiao Zhang, Shizhong Liao, Jun Xu, Ji-Rong Wen

Keywords Abstract Paper

Projection-free Online Learning in Dynamic Environments

Yuanyu Wan, Bo Xue, Lijun Zhang

Keywords Abstract Paper

Black-Box Control for Linear Dynamical Systems

Xinyi Chen, Elad Hazan

Keywords Abstract Paper

Logarithmic Regret for Online Control with Adversarial Noise

Dylan Foster, Max Simchowitz

Keywords Abstract Paper

Near-Optimal Representation Learning for Linear Bandits and Linear RL

Jiachen Hu, Xiaoyu Chen, Chi Jin and Lihong Li, Liwei Wang

Keywords Abstract Paper

Theory, Online Learning Theory

Near-optimal Regret Bounds for Stochastic Shortest Path

Aviv Rosenberg, Alon Cohen, Yishay Mansour, Haim Kaplan

Keywords Abstract Paper

Online Learning, Active Learning, and Bandits

Logistic Regression Regret: What’s the Catch?

Keywords Abstract Paper

Online learning, Convex optimization, Information theory, Regression

Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games

Gabriele Farina, Tuomas Sandholm

Keywords Abstract Paper

No-Regret Prediction in Marginally Stable Systems

Udaya Ghai, Holden Lee, Karan Singh and Cyril Zhang, Yi Zhang

Keywords Abstract Paper

Online learning, Planning and control

Littlestone Classes are Privately Online Learnable

Noah Golowich, Roi Livni

Keywords Abstract Paper

machine learning, online learning, privacy

Adaptive Sampling for Stochastic Risk-Averse Learning

Sebastian Curi, Kfir Y. Levy, Stefanie Jegelka, Andreas Krause

Keywords Abstract Paper

Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition

Liyu Chen, Haipeng Luo, Chen-Yu Wei

Keywords Abstract Paper

Surrogate Regret Bounds for Polyhedral Losses

Rafael Frongillo, Bo Waggoner

Keywords Abstract Paper

Online Markov Decision Processes with Aggregate Bandit Feedback

Alon Cohen, Haim Kaplan, Tomer Koren, Yishay Mansour

Keywords Abstract Paper

Non-stationary Reinforcement Learning without Prior Knowledge: an Optimal Black-box Approach

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Jiachen Hu, Xiaoyu Chen, Chi Jin and
Lihong Li, Liwei Wang

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Udaya Ghai, Holden Lee, Karan Singh and
Cyril Zhang, Yi Zhang

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Jeffrey Negrea, Blair Bilodeau, Nicolò Campolongo and
Francesco Orabona, Dan Roy

Keywords Paper

Keywords Paper

Keywords Paper

Zeyu Jia, Lin Yang, Csaba Szepesvari and
Mengdi Wang, Alex Ayoub

Keywords Paper