First-Order Methods for Wasserstein Distributionally Robust MDP

18/07/2021

First-Order Methods for Wasserstein Distributionally Robust MDP

Julien Grand-Clement, Christian Kroer

Keywords: Theory, RL, Decisions and Control Theory

Abstract Paper Similar Papers

Abstract: Markov decision processes (MDPs) are known to be sensitive to parameter specification. Distributionally robust MDPs alleviate this issue by allowing for \textit{ambiguity sets} which give a set of possible distributions over parameter sets. The goal is to find an optimal policy with respect to the worst-case parameter distribution. We propose a framework for solving Distributionally robust MDPs via first-order methods, and instantiate it for several types of Wasserstein ambiguity sets. By developing efficient proximal updates, our algorithms achieve a convergence rate of $O\left(NA^{2.5}S^{3.5}\log(S)\log(\epsilon^{-1})\epsilon^{-1.5} \right)$ for the number of kernels $N$ in the support of the nominal distribution, states $S$, and actions $A$; this rate varies slightly based on the Wasserstein setup. Our dependence on $N,A$ and $S$ is significantly better than existing methods, which have a complexity of $O\left(N^{3.5}A^{3.5}S^{4.5}\log^{2}(\epsilon^{-1}) \right)$. Numerical experiments show that our algorithm is significantly more scalable than state-of-the-art approaches across several domains.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at ICML 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

06/12/2021

Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses

Haipeng Luo, Chen-Yu Wei, Chung-Wei Lee

Keywords Paper

optimization, reinforcement learning and planning, bandits

0

0

0

0

15:17

06/12/2021

Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning

Gen Li, Laixi Shi, Yuxin Chen and
Yuantao Gu, Yuejie Chi

Keywords Paper

theory, reinforcement learning and planning

0

0

0

0

15:32

18/07/2021

Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity

Zhang Zihan, Yuan Zhou, Xiangyang Ji

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

5:03

18/07/2021

Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality

Tengyu Xu, Zhuoran Yang, Zhaoran Wang, Yingbin LIANG

Keywords Paper

Theory, RL, Decisions and Control Theory

0

0

0

0

4:23

09/07/2020

A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates

Zhixian Lei, Kyle Luh, Prayaag Venkat, Fred Zhang

Keywords Paper

High-dimensional statistics, Adversarial learning and robustness

0

0

0

0

15:00

06/12/2020

Provably Efficient Reinforcement Learning with Kernel and Neural Function Approximations

Zhuoran Yang, Chi Jin, Zhaoran Wang and
Mengdi Wang, Michael Jordan

Keywords Paper

0

0

0

0

3:42

06/12/2020

Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms

Dheeraj Nagaraj, Xian Wu, Guy Bresler and
Prateek Jain, Praneeth Netrapalli

Keywords Paper

0

0

0

0

3:34

18/07/2021

Streaming and Distributed Algorithms for Robust Column Subset Selection

Shuli Jiang, Dongyu Li, Irene Mengze Li and
Arvind Mahankali, David Woodruff

Keywords Paper

Algorithms, Deep Learning, Generative Models, Deep Learning, Predictive Models; Deep Learning, Recurrent Networks

0

0

0

0

7:26

09/07/2020

Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations

Yossi Arjevani, Yair Carmon, John Duchi and
Dylan Foster, Ayush Sekhari, Karthik Sridharan

Keywords Paper

Non-convex optimization, Stochastic optimization

0

0

0

0

11:57

09/07/2020

Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes

Alekh Agarwal, Sham Kakade, Jason Lee, Gaurav Mahajan

Keywords Paper

Reinforcement learning, Non-convex optimization

0

0

0

0

11:00

18/07/2021

Regularized Submodular Maximization at Scale

Ehsan Kazemi, shervin minaee, Moran Feldman, Amin Karbasi

Keywords Paper

Optimization, Combinatorial Optimization

0

0

0

0

5:17

06/12/2021

List-Decodable Mean Estimation in Nearly-PCA Time

Ilias Diakonikolas, Daniel Kane, Daniel Kongsgaard and
Jerry Li, Kevin Tian

Keywords Paper

theory, clustering

0

0

0

0

14:21

06/12/2021

Bayesian decision-making under misspecified priors with applications to meta-learning

Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy and
Daniel Hsu, Thodoris Lykouris, Miro Dudik, Robert E Schapire

Keywords Paper

meta learning, bandits

0

0

0

0

14:58

06/12/2021

Non-convex Distributionally Robust Optimization: Non-asymptotic Analysis

Jikai Jin, Bohang Zhang, Haiyang Wang, Liwei Wang

Keywords Paper

optimization

0

0

0

0

14:05

06/12/2020

Escaping Saddle-Point Faster under Interpolation-like Conditions

Abhishek Roy, Krishnakumar Balasubramanian, Saeed Ghadimi, Prasant Mohapatra

Keywords Paper

0

0

0

0

3:19

06/12/2021

Fast Minimum-norm Adversarial Attacks through Adaptive Norm Constraints

Maura Pintor, Fabio Roli, Wieland Brendel, Battista Biggio

Keywords Paper

optimization, machine learning, robustness, adversarial robustness and security, vision

0

0

0

0

11:35

06/12/2021

Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers

Tommaso d'Orsi, Chih-Hung Liu, Rajai Nasser and
Gleb Novikov, David Steurer, Stefan Tiegel

Keywords Paper

optimization

0

0

0

0

10:44

06/12/2021

Twice regularized MDPs and the equivalence between robustness and regularization

Esther Derman, Matthieu Geist, Shie Mannor

Keywords Paper

optimization, reinforcement learning and planning, robustness

0

0

0

0

14:19

06/12/2021

A Comprehensively Tight Analysis of Gradient Descent for PCA

Zhiqiang Xu, Ping Li

Keywords Paper

optimization

0

0

0

0

4:37

06/12/2020

Truncated Linear Regression in High Dimensions

Constantinos Daskalakis, Dhruv Rohatgi, Emmanouil Zampetakis

Keywords Paper

0

0

0

0

3:17

06/12/2020

An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits

Julian Katz-Samuels, Lalit Jain, zohar karnin, Kevin Jamieson

Keywords Paper

0

0

0

0

3:20

04/08/2021

Functions with average smoothness: structure, algorithms, and learning

Yair Ashlagi, Lee-Ad Gottlieb, Aryeh Kontorovich

Keywords Paper

0

0

0

0

17:22

18/07/2021

Towards Tight Bounds on the Sample Complexity of Average-reward MDPs

Yujia Jin, Aaron Sidford

Keywords Paper

Theory, RL, Decisions and Control Theory

0

0

0

0

5:05

09/07/2020

Logsmooth Gradient Concentration and Tighter Runtimes for Metropolized Hamiltonian Monte Carlo

Yin Tat Lee, Ruoqi Shen, Kevin Tian

Keywords Paper

Sampling algorithms, Bayesian methods

0

0

0

0

14:57

06/12/2021

Self-Adaptable Point Processes with Nonparametric Time Decays

Zhimeng Pan, Zheng Wang, Jeff M Phillips, Shandian Zhe

Keywords Paper

deep learning, kernel methods

0

0

0

0

10:01

06/12/2021

Robust Regression Revisited: Acceleration and Improved Estimation Rates

Arun Jambulapati, Jerry Li, Tselil Schramm, Kevin Tian

Keywords Paper

theory, optimization

0

0

0

0

14:22

12/07/2020

Sparse Convex Optimization via Adaptively Regularized Hard Thresholding

Kyriakos Axiotis, Maxim Sviridenko

Keywords Paper

Optimization - General

0

0

0

0

13:44

06/12/2021

Towards Sample-Optimal Compressive Phase Retrieval with Sparse and Generative Priors

Zhaoqiang Liu, Subhroshekhar Ghosh, Jonathan Scarlett

Keywords Paper

theory, optimization, generative model

0

0

0

0

10:41

06/12/2021

STORM+: Fully Adaptive SGD with Recursive Momentum for Nonconvex Optimization

Kfir Levy, Ali Kavis, Volkan Cevher

Keywords Paper

optimization

0

0

0

0

12:23

09/07/2020

Approximation Schemes for ReLU Regression

Ilias Diakonikolas, Surbhi Goel, Sushrut Karmalkar and
Adam Klivans, Mahdi Soltanolkotabi

Keywords Paper

PAC learning, Approximation algorithms, Convex optimization, Neural networks/deep learning

0

0

0

0

15:20

18/07/2021

Randomized Algorithms for Submodular Function Maximization with a $k$-System Constraint

Shuang Cui, Kai Han, Tianshuai Zhu and
Jing Tang, Benwei Wu, He Huang

Keywords Paper

Optimization

0

0

0

0

4:48

26/08/2020

Stochastic Recursive Variance-Reduced Cubic Regularization Methods

Dongruo Zhou, Quanquan Gu

Keywords Paper

0

0

0

0

15:42

06/12/2021

Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators

Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, Karthikeyan Shanmugam

Keywords Paper

0

0

0

0

14:56

06/12/2021

Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings

Ming Yin, Yu-Xiang Wang

Keywords Paper

theory, reinforcement learning and planning

0

0

0

0

8:46

06/12/2020

Variance Reduction via Accelerated Dual Averaging for Finite-Sum Optimization

Chaobing Song, Yong Jiang, Yi Ma

Keywords Paper

Algorithms -> Boosting and Ensemble Methods; Applications -> Hardware and Systems; Applications -> Natural Language Processing;, Applications -> Privacy, Anonymity, and Security

0

0

0

0

3:23

09/07/2020

Locally Private Hypothesis Selection

Sivakanth Gopi, Gautam Kamath, Janardhan D Kulkarni and
Aleksandar Nikolov, Steven Wu, Huanyu Zhang

Keywords Paper

Privacy, fairness, Distribution learning/testing

0

0

0

0

14:58

06/12/2021

Nearly Horizon-Free Offline Reinforcement Learning

Tongzheng Ren, Jialian Li, Bo Dai and
Simon Du, Sujay Sanghavi

Keywords Paper

theory, optimization, reinforcement learning and planning

0

0

0

0

8:44

13/04/2021

Reinforcement learning in parametric MDPs with exponential families

Sayak Ray Chowdhury, Aditya Gopalan, Odalric-Ambrym Maillard

Keywords Paper

0

0

0

0

3:22

12/07/2020

Optimal Statistical Guaratees for Adversarially Robust Gaussian Classification

Chen Dan, Yuting Wei, Pradeep Ravikumar

Keywords Paper

Learning Theory

0

0

0

0

14:36

12/07/2020

Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness

Aounon Kumar, Alexander Levine, Tom Goldstein, Soheil Feizi

Keywords Paper

Adversarial Examples

0

0

0

0

14:48