Robust Reinforcement Learning for Continuous Control with Model Misspecification

26/04/2020

Robust Reinforcement Learning for Continuous Control with Model Misspecification

Daniel J. Mankowitz, Nir Levine, Rae Jeong, Abbas Abdolmaleki, Jost Tobias Springenberg, Yuanyuan Shi, Jackie Kay, Todd Hester, Timothy Mann, Martin Riedmiller

Keywords: reinforcement learning, robustness

Abstract Paper Similar Papers

Abstract: We provide a framework for incorporating robustness -- to perturbations in the transition dynamics which we refer to as model misspecification -- into continuous control Reinforcement Learning (RL) algorithms. We specifically focus on incorporating robustness into a state-of-the-art continuous control RL algorithm called Maximum a-posteriori Policy Optimization (MPO). We achieve this by learning a policy that optimizes for a worst case, entropy-regularized, expected return objective and derive a corresponding robust entropy-regularized Bellman contraction operator. In addition, we introduce a less conservative, soft-robust, entropy-regularized objective with a corresponding Bellman operator. We show that both, robust and soft-robust policies, outperform their non-robust counterparts in nine Mujoco domains with environment perturbations. In addition, we show improved robust performance on a challenging, simulated, dexterous robotic hand. Finally, we present multiple investigative experiments that provide a deeper insight into the robustness framework; including an adaptation to another continuous control RL algorithm. Performance videos can be found online at https://sites.google.com/view/robust-rl.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at ICLR 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

06/12/2021

Pure Exploration in Kernel and Neural Bandits

Yinglun Zhu, Dongruo Zhou, Ruoxi Jiang and
Quanquan Gu, Rebecca Willett, Robert Nowak

Keywords Paper

theory, deep learning, reinforcement learning and planning, bandits, representation learning

0

0

0

0

14:47

18/07/2021

Monotonic Robust Policy Optimization with Model Discrepancy

yuankun jiang, Chenglin Li, Wenrui Dai and
Junni Zou, Hongkai Xiong

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

5:17

26/04/2020

Variational Recurrent Models for Solving Partially Observable Control Tasks

Dongqi Han, Kenji Doya, Jun Tani

Keywords Paper

Reinforcement Learning, Deep Learning, Variational Inference, Recurrent Neural Network, Partially Observable, Robotic Control, Continuous Control

0

0

0

0

4:59

26/08/2020

A Nonparametric Off-Policy Policy Gradient

Samuele Tosatto, Joao Carvalho, Hany Abdulsamad, Jan Peters

Keywords Paper

0

0

0

0

12:19

12/07/2020

ControlVAE: Controllable Variational Autoencoder

Huajie Shao, Shuochao Yao, Dachun Sun and
Aston Zhang, Shengzhong Liu, Dongxin Liu, Jun Wang, Tarek Abdelzaher

Keywords Paper

Deep Learning - Generative Models and Autoencoders

0

0

0

0

14:22

03/05/2021

FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Lanqing Li, Rui Yang, Dijun Luo

Keywords Paper

distance metric learning, offline/batch reinforcement learning, meta-reinforcement learning, contrastive learning, multi-task reinforcement learning

1

0

0

0

6:21

06/12/2021

Bayesian Bellman Operators

Mattie Fellows, Kristian Hartikainen, Shimon Whiteson

Keywords Paper

reinforcement learning and planning

0

0

0

0

14:54

06/12/2021

Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems

Suhas Kowshik, Dheeraj Nagaraj, Prateek Jain, Praneeth Netrapalli

Keywords Paper

theory

0

0

0

0

14:43

06/12/2020

An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits

Julian Katz-Samuels, Lalit Jain, zohar karnin, Kevin Jamieson

Keywords Paper

0

0

0

0

3:20

06/12/2020

Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms

Dheeraj Nagaraj, Xian Wu, Guy Bresler and
Prateek Jain, Praneeth Netrapalli

Keywords Paper

0

0

0

0

3:34

18/07/2021

Continuous-time Model-based Reinforcement Learning

Cagatay Yildiz, Markus Heinonen, Harri Lähdesmäki

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

5:00

06/12/2021

Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies

Tim Seyde, Igor Gilitschenski, Wilko Schwarting and
Bartolomeo Stellato, Martin Riedmiller, Markus Wulfmeier, Daniela Rus

Keywords Paper

reinforcement learning and planning

0

0

0

0

6:48

06/12/2020

Adaptive Discretization for Model-Based Reinforcement Learning

Sean Sinclair, Tianyu Wang, Gauri Jain and
Sid Banerjee, Christina Yu

Keywords Paper

0

0

0

0

3:12

06/12/2020

Training Generative Adversarial Networks by Solving Ordinary Differential Equations

Chongli Qin, Yan Wu, Jost Tobias Springenberg and
Andy Brock, Jeff Donahue, Timothy Lillicrap, Pushmeet Kohli

Keywords Paper

0

0

0

0

3:20

06/12/2021

Bellman-consistent Pessimism for Offline Reinforcement Learning

Tengyang Xie, Ching-An Cheng, Nan Jiang and
Paul Mineiro, Alekh Agarwal

Keywords Paper

theory, reinforcement learning and planning, robustness

0

0

0

0

17:42

06/12/2020

Memory-Efficient Learning of Stable Linear Dynamical Systems for Prediction and Control

Giorgos Mamakoukas, Orest Xherija, Todd Murphey

Keywords Paper

Optimization -> Non-Convex Optimization, Optimization -> Stochastic Optimization

0

0

0

0

3:13

02/02/2021

Theoretically Principled Deep RL Acceleration via Nearest Neighbor Function Approximation

Junhong Shen, Lin F. Yang

Keywords Paper

0

0

0

0

19:12

12/07/2020

Responsive Safety in Reinforcement Learning

Adam Stooke, Joshua Achiam, Pieter Abbeel

Keywords Paper

Reinforcement Learning - Deep RL

0

0

0

0

13:36

26/04/2020

CAQL: Continuous Action Q-Learning

Moonkyung Ryu, Yinlam Chow, Ross Anderson and
Christian Tjandraatmadja, Craig Boutilier

Keywords Paper

Reinforcement learning (RL), DQN, Continuous control, Mixed-Integer Programming (MIP)

0

0

0

0

5:36

18/07/2021

Value Iteration in Continuous Actions, States and Time

Michael Lutter, Shie Mannor, Jan Peters and
Dieter Fox, Animesh Garg

Keywords Paper

Reinforcement Learning and Planning, Planning and Control

0

0

0

0

5:09

06/12/2020

Delta-STN: Efficient Bilevel Optimization for Neural Networks using Structured Response Jacobians

Juhan Bae, Roger Grosse

Keywords Paper

0

0

0

0

3:20

03/05/2021

Robust Overfitting may be mitigated by properly learned smoothening

Tianlong Chen, Zhenyu Zhang, Sijia Liu and
Shiyu Chang, Zhangyang Wang

Keywords Paper

Robust Overfitting, Adversarial Training, Adversarial Robustness

0

0

0

0

4:33

06/12/2020

Softmax Deep Double Deterministic Policy Gradients

Ling Pan, Qingpeng Cai, Longbo Huang

Keywords Paper

0

0

0

0

3:23

06/12/2021

Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning

Andrea Zanette, Martin J Wainwright, Emma Brunskill

Keywords Paper

reinforcement learning and planning

0

0

0

0

14:28

06/12/2021

Boosting with Multiple Sources

Corinna Cortes, Mehryar Mohri, Dmitry Storcheus, Ananda Theertha Suresh

Keywords Paper

federated learning

0

0

0

0

13:17

06/12/2021

Towards Robust Bisimulation Metric Learning

Mete Kemertas, Tristan Aumentado-Armstrong

Keywords Paper

reinforcement learning and planning, robustness, representation learning

0

0

0

0

12:24

06/12/2020

Provably Efficient Reinforcement Learning with Kernel and Neural Function Approximations

Zhuoran Yang, Chi Jin, Zhaoran Wang and
Mengdi Wang, Michael Jordan

Keywords Paper

0

0

0

0

3:42

06/12/2021

Co-Adaptation of Algorithmic and Implementational Innovations in Inference-based Deep Reinforcement Learning

Hiroki Furuta, Tadashi Kozuno, Tatsuya Matsushima and
Yutaka Matsuo, Shixiang (Shane) Gu

Keywords Paper

reinforcement learning and planning

0

0

0

0

10:00

06/12/2021

Post-Training Quantization for Vision Transformer

Zhenhua Liu, Yunhe Wang, Kai Han and
Wei Zhang, Siwei Ma, Wen Gao

Keywords Paper

deep learning, transformers, vision

0

0

0

0

5:52

02/02/2021

Addressing Action Oscillations through Learning Policy Inertia

Chen Chen, Hongyao Tang, Jianye Hao and
Wulong Liu, Zhaopeng Meng

Keywords Paper

0

0

0

0

14:57

18/07/2021

DORO: Distributional and Outlier Robust Optimization

Runtian Zhai, Chen Dan, Zico Kolter, Pradeep Ravikumar

Keywords Paper

Probabilistic Methods, Robust statistics

0

0

0

1

5:06

06/12/2021

Fast Minimum-norm Adversarial Attacks through Adaptive Norm Constraints

Maura Pintor, Fabio Roli, Wieland Brendel, Battista Biggio

Keywords Paper

optimization, machine learning, robustness, adversarial robustness and security, vision

0

0

0

0

11:35

18/07/2021

On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization

Xu Cai, Jonathan Scarlett

Keywords Paper

Applications, Natural Language Processing, Applications, Network Analysis, Reinforcement Learning and Planning, Bandits

0

0

0

0

4:19

14/06/2020

A2dele: Adaptive and Attentive Depth Distiller for Efficient RGB-D Salient Object Detection

Yongri Piao, Zhengkun Rong, Miao Zhang and
Weisong Ren, Huchuan Lu

Keywords Paper

rgb-d, salient object dection, knowledge distillation, attention, computer vision, cnn

0

0

0

0

1:00

26/08/2020

Bandit optimisation of functions in the Mat\'ern kernel RKHS

David Janz, David Burt, Javier Gonzalez

Keywords Paper

0

0

0

0

14:50

06/12/2021

Stability and Generalization of Bilevel Programming in Hyperparameter Optimization

Fan Bao, Guoqiang Wu, Chongxuan LI and
Jun Zhu, Bo Zhang

Keywords Paper

optimization

0

0

0

0

8:58

14/06/2020

Forward and Backward Information Retention for Accurate Binary Neural Networks

Haotong Qin, Ruihao Gong, Xianglong Liu and
Mingzhu Shen, Ziran Wei, Fengwei Yu, Jingkuan Song

Keywords Paper

model compression, binary neural networks, deep learning, quantization, computer vision

0

0

0

0

1:00

18/07/2021

EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL

Seyed Kamyar Seyed Ghasemipour, Dale Schuurmans, Shixiang Gu

Keywords Paper

Reinforcement Learning and Planning

0

0

0

1

5:54

09/07/2020

Estimating Principal Components under Adversarial Perturbations

Pranjal Awasthi, Xue Chen, Aravindan Vijayaraghavan

Keywords Paper

Unsupervised and semi-supervised learning, Adversarial learning and robustness

0

0

0

0

15:40

02/02/2021

Deep Recurrent Belief Propagation Network for POMDPs

Yuhui Wang, Xiaoyang Tan

Keywords Paper

0

0

0

0

15:15