Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning

18/07/2021

Sebastian Curi, Ilija Bogunovic, Andreas Krause

Keywords: Reinforcement Learning and Planning

Abstract: In real-world tasks, reinforcement learning (RL) agents frequently encounter situations that are not present during training. To ensure reliable performance, RL agents need to be robust to such worst-case situations. The robust-RL framework addresses this challenge via a minimax optimization between an agent and an adversary. Previous robust RL algorithms are either sample inefficient, lack robustness guarantees, or do not scale to large problems. We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem while attaining near-optimal sample complexity guarantees. RH-UCRL is a model-based reinforcement learning (MBRL) algorithm that effectively distinguishes between epistemic and aleatoric uncertainty and efficiently explores both the agent and the adversary decision spaces during policy learning. We scale RH-UCRL to complex tasks via neural network ensemble models as well as neural network policies. Experimentally, we demonstrate that RH-UCRL outperforms other robust deep RL algorithms in a variety of adversarial environments.
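
The minimax idea in the abstract can be illustrated with a toy matrix game. The sketch below is not the authors' implementation and is far simpler than RH-UCRL. It assumes a small table of estimated returns `mean[i][j]` (agent action `i`, adversary action `j`) with uncertainty widths `std[i][j]`, both made up for illustration. The agent picks the row with the best worst case under an optimistic estimate, and the adversary then picks the column that looks worst under a pessimistic estimate. The function name `robust_choice` and the parameter `beta` are hypothetical.

```python
def robust_choice(mean, std, beta=1.0):
    """Toy optimistic-agent / pessimistic-adversary selection (illustrative only).

    mean[i][j]: estimated return when the agent plays i and the adversary plays j.
    std[i][j]:  assumed uncertainty width of that estimate.
    beta:       assumed scaling of the confidence interval.
    """
    n_agent, n_adv = len(mean), len(mean[0])
    # Optimistic (upper) and pessimistic (lower) estimates of the return.
    upper = [[mean[i][j] + beta * std[i][j] for j in range(n_adv)]
             for i in range(n_agent)]
    lower = [[mean[i][j] - beta * std[i][j] for j in range(n_adv)]
             for i in range(n_agent)]
    # Agent: maximize the worst-case optimistic return.
    agent = max(range(n_agent), key=lambda i: min(upper[i]))
    # Adversary: minimize the pessimistic return against the chosen agent action.
    adversary = min(range(n_adv), key=lambda j: lower[agent][j])
    return agent, adversary


# Made-up numbers for a 2x2 game.
mean = [[1.0, 0.2], [0.6, 0.5]]
std = [[0.1, 0.1], [0.1, 0.4]]
print(robust_choice(mean, std))  # (1, 1)
```

In this example the agent prefers action 1 because its worst-case optimistic return (0.7) beats that of action 0 (0.3), and the adversary then selects action 1, the response with the lowest pessimistic return.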

The talk and the corresponding paper were published at the ICML 2021 virtual conference.

Similar Papers

18/07/2021

Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning

Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Anima Anandkumar

Keywords: Reinforcement Learning and Planning, Multi-Agent RL

5:16

02/02/2021

GaussianPath: A Bayesian Multi-Hop Reasoning Framework for Knowledge Graph Reasoning

Guojia Wan, Bo Du

13:52

06/12/2020

On the Stability and Convergence of Robust Adversarial Reinforcement Learning: A Case Study on Linear Quadratic Systems

Kaiqing Zhang, Bin Hu, Tamer Basar

3:22

06/12/2020

Provably Good Batch Off-Policy Reinforcement Learning Without Great Exploration

Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill

3:17

06/12/2020

Promoting Stochasticity for Expressive Policies via a Simple and Efficient Regularization Method

Qi Zhou, Yufei Kuang, Zherui Qiu, Houqiang Li, Jie Wang

3:10

26/08/2020

Nested-Wasserstein Self-Imitation Learning for Sequence Generation

Ruiyi Zhang, Changyou Chen, Zhe Gan, Zheng Wen, Wenlin Wang, Lawrence Carin

11:18

19/08/2021

Policy Learning with Constraints in Model-free Reinforcement Learning: A Survey

Yongshuai Liu, Avishai Halev, Xin Liu

Keywords: Machine learning, General

15:25

06/12/2020

Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations

Huan Zhang, Hongge Chen, Chaowei Xiao, Bo Li, Mingyan Liu, Duane Boning, Cho-Jui Hsieh

3:18

26/04/2020

CAQL: Continuous Action Q-Learning

Moonkyung Ryu, Yinlam Chow, Ross Anderson, Christian Tjandraatmadja, Craig Boutilier

Keywords: Reinforcement learning (RL), DQN, Continuous control, Mixed-Integer Programming (MIP)

5:36

06/12/2021

Hierarchical Reinforcement Learning with Timed Subgoals

Nico Gürtler, Dieter Büchler, Georg Martius

Keywords: reinforcement learning and planning

8:17

06/12/2021

Robust Deep Reinforcement Learning through Adversarial Loss

Tuomas Oikarinen, Wang Zhang, Alexandre Megretski, Luca Daniel, Tsui-Wei Weng

Keywords: reinforcement learning and planning, robustness, adversarial robustness and security

14:15

06/12/2020

Robust Multi-Agent Reinforcement Learning with Model Uncertainty

Kaiqing Zhang, Tao Sun, Yunzhe Tao, Sahika Genc, Sunil Mallya, Tamer Basar

3:11

26/04/2020

Toward Evaluating Robustness of Deep Reinforcement Learning with Continuous Control

Tsui-Wei Weng, Krishnamurthy (Dj) Dvijotham, Jonathan Uesato, Kai Xiao, Sven Gowal, Robert Stanforth, Pushmeet Kohli

Keywords: deep learning, reinforcement learning, robustness, adversarial examples

6:00

06/12/2021

COMBO: Conservative Offline Model-Based Policy Optimization

Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn

Keywords: deep learning, optimization, reinforcement learning and planning

12:35

13/04/2021

Optimizing percentile criterion using robust MDPs

Bahram Behzadian, Reazul Hasan Russel, Marek Petrik, Chin Pang Ho

3:00

06/12/2021

Automated Dynamic Mechanism Design

Hanrui Zhang, Vincent Conitzer

14:35

06/12/2021

When Is Generalizable Reinforcement Learning Tractable?

Dhruv Malik, Yuanzhi Li, Pradeep Ravikumar

Keywords: reinforcement learning and planning, generative model, representation learning

12:38

12/07/2020

Representations for Stable Off-Policy Reinforcement Learning

Dibya Ghosh, Marc Bellemare

Keywords: Reinforcement Learning - General

14:38

06/12/2020

Trust the Model When It Is Confident: Masked Model-based Actor-Critic

Feiyang Pan, Jia He, Dandan Tu, Qing He

2:57

06/12/2021

Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability

Dibya Ghosh, Jad Rahme, Aviral Kumar, Amy Zhang, Ryan Adams, Sergey Levine

Keywords: reinforcement learning and planning

15:17

02/02/2021

Uncertainty-Aware Policy Optimization: A Robust, Adaptive Trust Region Approach

James Queeney, Ioannis Ch. Paschalidis, Christos G. Cassandras

16:52

18/07/2021

Continuous-time Model-based Reinforcement Learning

Cagatay Yildiz, Markus Heinonen, Harri Lähdesmäki

Keywords: Reinforcement Learning and Planning

5:00

06/12/2021

Towards Robust Bisimulation Metric Learning

Mete Kemertas, Tristan Aumentado-Armstrong

Keywords: reinforcement learning and planning, robustness, representation learning

12:24

04/08/2021

Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective

Dylan Foster, Alexander Rakhlin, David Simchi-Levi, Yunzong Xu

16:53

06/12/2021

Adversarial Robustness with Semi-Infinite Constrained Learning

Alexander Robey, Luiz Chamon, George J. Pappas, Hamed Hassani, Alejandro Ribeiro

Keywords: theory, deep learning, optimization, robustness, adversarial robustness and security

14:48

03/05/2021

Robust Reinforcement Learning on State Observations with Learned Optimal Adversary

Huan Zhang, Hongge Chen, Duane S Boning, Cho-Jui Hsieh

Keywords: reinforcement learning, robustness, adversarial attacks, adversarial defense

5:14

19/04/2021

Evaluating neural model robustness for machine comprehension

Winston Wu, Dustin Arendt, Svitlana Volkova

11:41

03/05/2021

FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Lanqing Li, Rui Yang, Dijun Luo

Keywords: distance metric learning, offline/batch reinforcement learning, meta-reinforcement learning, contrastive learning, multi-task reinforcement learning

6:21

06/12/2020

Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning

Sebastian Curi, Felix Berkenkamp, Andreas Krause

3:23

06/12/2021

Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation

Nicklas Hansen, Hao Su, Xiaolong Wang

Keywords: reinforcement learning and planning, transformers

8:43

18/07/2021

Temporal Predictive Coding For Model-Based Planning In Latent Space

Tung Nguyen, Rui Shu, Tuan Pham, Hung Bui, Stefano Ermon

Keywords: Deep Learning, Embedding and Representation learning

5:19

12/07/2020

Can Increasing Input Dimensionality Improve Deep Reinforcement Learning?

Kei Ota, Tomoaki Oiki, Devesh Jha, Toshisada Mariyama, Daniel Nikovski

Keywords: Reinforcement Learning - Deep RL

14:55

06/12/2021

Twice regularized MDPs and the equivalence between robustness and regularization

Esther Derman, Matthieu Geist, Shie Mannor

Keywords: optimization, reinforcement learning and planning, robustness

14:19

12/07/2020

Minimax Weight and Q-Function Learning for Off-Policy Evaluation

Masatoshi Uehara, Jiawei Huang, Nan Jiang

Keywords: Reinforcement Learning - Theory

14:20

06/12/2021

Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods

Seohong Park, Jaekyeom Kim, Gunhee Kim

Keywords: reinforcement learning and planning

8:53

13/04/2021

Online model selection for reinforcement learning with function approximation

Jonathan Lee, Aldo Pacchiano, Vidya Muthukumar, Weihao Kong, Emma Brunskill

3:15

03/05/2021

Adversarially Guided Actor-Critic

Yannis Flet-Berliac, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist

4:22

06/12/2020

Weakly-Supervised Reinforcement Learning for Controllable Behavior

Lisa Lee, Benjamin Eysenbach, Russ Salakhutdinov, Shixiang (Shane) Gu, Chelsea Finn

3:31

06/12/2021

Using Random Effects to Account for High-Cardinality Categorical Features and Repeated Measures in Deep Neural Networks

Giora Simchoni, Saharon Rosset

Keywords: deep learning, machine learning, vision

13:33

06/12/2020

Bayes Consistency vs. H-Consistency: The Interplay between Surrogate Loss Functions and the Scoring Function Class

Mingyuan Zhang, Shivani Agarwal

3:19