Efficient Performance Bounds for Primal-Dual Reinforcement Learning from Demonstrations

18/07/2021

Angeliki Kamoutsi, Goran Banjac, John Lygeros

Keywords: Theory, RL, Decisions and Control Theory

Abstract: We consider large-scale Markov decision processes with an unknown cost function and address the problem of learning a policy from a finite set of expert demonstrations. We assume that the learner is not allowed to interact with the expert and has no access to a reinforcement signal of any kind. Existing inverse reinforcement learning methods come with strong theoretical guarantees but are computationally expensive, while state-of-the-art policy optimization algorithms achieve significant empirical success but are hampered by limited theoretical understanding. To bridge the gap between theory and practice, we introduce a novel bilinear saddle-point framework using Lagrangian duality. The proposed primal-dual viewpoint allows us to develop a model-free, provably efficient algorithm through the lens of stochastic convex optimization. The method enjoys the advantages of simplicity of implementation, low memory requirements, and computational and sample complexities independent of the number of states. We further present an equivalent no-regret online-learning interpretation.

The talk and the respective paper are published at the ICML 2021 virtual conference.

Similar Papers

18/07/2021

Learning Online Algorithms with Distributional Advice

Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Ali Vakilian, Nikos Zarifis

Keywords: Algorithms

5:45

18/07/2021

PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration

Yuda Song, Wen Sun

Keywords: Reinforcement Learning and Planning

5:13

26/08/2020

Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles

Aditya Modi, Nan Jiang, Ambuj Tewari, Satinder Singh

14:54

06/12/2020

Follow the Perturbed Leader: Optimism and Fast Parallel Algorithms for Smooth Minimax Games

Arun Suggala, Praneeth Netrapalli

3:29

06/12/2020

Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning

Fei Feng, Ruosong Wang, Wotao Yin, Simon Du, Lin Yang

Keywords: Reinforcement Learning and Planning -> Decision and Control, Probabilistic Methods -> Gaussian Processes

3:11

12/07/2020

Convergence of a Stochastic Gradient Method with Momentum for Non-Smooth Non-Convex Optimization

Vien Mai, Mikael Johansson

Keywords: Optimization - Non-convex

15:49

13/04/2021

Experimental design for regret minimization in linear bandits

Andrew Wagenmaker, Julian Katz-Samuels, Kevin Jamieson

3:05

06/12/2021

Neural Active Learning with Performance Guarantees

Zhilei Wang, Pranjal Awasthi, Christoph Dann, Ayush Sekhari, Claudio Gentile

Keywords: deep learning, active learning

10:43

06/12/2020

An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits

Andrea Tirinzoni, Matteo Pirotta, Marcello Restelli, Alessandro Lazaric

3:13

04/08/2021

Adversarially Robust Low Dimensional Representations

Pranjal Awasthi, Vaggos Chatziafratis, Xue Chen, Aravindan Vijayaraghavan

20:19

09/07/2020

Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium

Qiaomin Xie, Yudong Chen, Zhaoran Wang, Zhuoran Yang

Keywords: Reinforcement learning, Planning and control

15:16

06/12/2021

Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning

Yiqin Yang, Xiaoteng Ma, Li Chenghao, Zewu Zheng, Qiyuan Zhang, Gao Huang, Jun Yang, Qianchuan Zhao

Keywords: reinforcement learning and planning

13:36

18/07/2021

A Value-Function-based Interior-point Method for Non-convex Bi-level Optimization

Risheng Liu, Xuan Liu, Xiaoming Yuan, Shangzhi Zeng, Jin Zhang

Keywords: Optimization, Non-Convex Optimization

5:12

06/12/2020

An efficient nonconvex reformulation of stagewise convex optimization problems

Rudy Bunel, Oliver Hinder, Srinadh Bhojanapalli, Krishnamurthy Dvijotham

3:01

06/12/2020

Optimal and Practical Algorithms for Smooth and Strongly Convex Decentralized Optimization

Dmitry Kovalev, Adil Salim, Peter Richtarik

3:27

13/04/2021

On the generalization properties of adversarial training

Yue Xing, Qifan Song, Guang Cheng

3:05

04/07/2020

Active Imitation Learning with Noisy Guidance

Kianté Brantley, Hal Daumé III, Amr Sharaf

Keywords: Active Learning, structured tasks, sequence tasks, Imitation algorithms

7:59

26/04/2020

On the Convergence of FedAvg on Non-IID Data

Xiang Li, Kaixuan Huang, Wenhao Yang, Shusen Wang, Zhihua Zhang

Keywords: Federated Learning, stochastic optimization, Federated Averaging

13:58

03/05/2021

Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation

Justin Fu, Sergey Levine

Keywords: model-based optimization, normalized maximum likelihood

7:37

02/02/2021

Stable Adversarial Learning under Distributional Shifts

Jiashuo Liu, Zheyan Shen, Peng Cui, Linjun Zhou, Kun Kuang, Bo Li, Yishi Lin

14:30

19/08/2021

Fast Multi-label Learning

Xiuwen Gong, Dong Yuan, Wei Bao

Keywords: Machine Learning, Multi-instance; Multi-label; Multi-view learning

15:18

06/12/2020

Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization

Paul Barde, Julien Roy, Wonseok Jeon, Joelle Pineau, Chris Pal, Derek Nowrouzezahrai

3:08

02/02/2021

Newton Optimization on Helmholtz Decomposition for Continuous Games

Giorgia Ramponi, Marcello Restelli

17:15

04/08/2021

Efficient Bandit Convex Optimization: Beyond Linear Losses

Arun Sai Suggala, Pradeep Ravikumar, Praneeth Netrapalli

20:29

06/12/2020

POMO: Policy Optimization with Multiple Optima for Reinforcement Learning

Yeong-Dae Kwon, Jinho Choo, Byoungjip Kim, Iljoo Yoon, Youngjune Gwon, Seungjai Min

3:19

06/12/2021

Adversarial Robustness with Semi-Infinite Constrained Learning

Alexander Robey, Luiz Chamon, George J. Pappas, Hamed Hassani, Alejandro Ribeiro

Keywords: theory, deep learning, optimization, robustness, adversarial robustness and security

14:48

03/05/2021

Byzantine-Resilient Non-Convex Stochastic Gradient Descent

Zeyuan Allen-Zhu, Faeze Ebrahimianghazani, Jerry Li, Dan Alistarh

Keywords: Byzantine resilience, robust deep learning, distributed deep learning, distributed machine learning, non-convex optimization

6:16

19/08/2021

Reinforcement Learning for Route Optimization with Robustness Guarantees

Tobias Jacobs, Francesco Alesiani, Gulcin Ermis

Keywords: Machine Learning, Deep Reinforcement Learning, Planning under Uncertainty, Applications of Reinforcement Learning

13:04

14/09/2020

Escaping Saddle Points of Empirical Risk Privately and Scalably via DP-Trust Region Method

Di Wang, Jinhui Xu

Keywords: differential privacy, empirical risk minimization, private machine learning

15:13

26/08/2020

Locally Accelerated Conditional Gradients

Jelena Diakonikolas, Alejandro Carderera, Sebastian Pokutta

13:48

03/05/2021

FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Lanqing Li, Rui Yang, Dijun Luo

Keywords: distance metric learning, offline/batch reinforcement learning, meta-reinforcement learning, contrastive learning, multi-task reinforcement learning

6:21

12/07/2020

Training Binary Neural Networks using the Bayesian Learning Rule

Xiangming Meng, Roman Bachmann, Mohammad Emtiyaz Khan

Keywords: Deep Learning - General

10:27

13/04/2021

Exponential convergence rates of classification errors on learning with SGD and random features

Shingo Yashima, Atsushi Nitanda, Taiji Suzuki

2:58

26/08/2020

One Sample Stochastic Frank-Wolfe

Mingrui Zhang, Zebang Shen, Aryan Mokhtari, Hamed Hassani, Amin Karbasi

6:05

14/06/2020

On the Acceleration of Deep Learning Model Parallelism With Staleness

An Xu, Zhouyuan Huo, Heng Huang

Keywords: layer-wise staleness, asynchronous model parallelism, convolutional neural networks

1:01

26/08/2020

Optimization Methods for Interpretable Differentiable Decision Trees Applied to Reinforcement Learning

Andrew Silva, Matthew Gombolay, Taylor Killian, Ivan Jimenez, Sung-Hyun Son

12:19

06/12/2021

Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination

Dylan J Foster, Akshay Krishnamurthy

Keywords: theory, reinforcement learning and planning, bandits, online learning

19:34

06/12/2021

Mitigating Covariate Shift in Imitation Learning via Offline Data With Partial Coverage

Jonathan Chang, Masatoshi Uehara, Dhruv Sreenivas, Rahul Kidambi, Wen Sun

Keywords: theory, reinforcement learning and planning

8:37

02/02/2021

Frugal Optimization for Cost-related Hyperparameters

Qingyun Wu, Chi Wang, Silu Huang

16:07

06/12/2021

IQ-Learn: Inverse soft-Q Learning for Imitation

Divyansh Garg, Shuvam Chakraborty, Chris Cundy, Jiaming Song, Stefano Ermon

Keywords: optimization, reinforcement learning and planning, adversarial robustness and security

8:25