First Order Constrained Optimization in Policy Space

06/12/2020

Yiming Zhang, Quan Vuong, Keith Ross

Abstract: In reinforcement learning, an agent attempts to learn high-performing behaviors by interacting with the environment; such behaviors are often quantified in the form of a reward function. However, some aspects of behavior, such as those deemed unsafe and to be avoided, are best captured through constraints. We propose a novel approach called First Order Constrained Optimization in Policy Space (FOCOPS), which maximizes an agent's overall reward while ensuring the agent satisfies a set of cost constraints. Using data generated from the current policy, FOCOPS first finds the optimal update policy by solving a constrained optimization problem in the nonparameterized policy space. FOCOPS then projects the update policy back into the parametric policy space. Our approach has an approximate upper bound for worst-case constraint violation throughout training and is first-order in nature, and therefore simple to implement. We provide empirical evidence that our simple approach achieves better performance on a set of constrained robotics locomotion tasks.
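
The two-step structure described in the abstract can be illustrated with a small toy sketch. Everything below is an assumption made for illustration, not the authors' implementation: a single state with three discrete actions, fixed hypothetical multipliers `lam` (a temperature) and `nu` (a cost penalty), the exponentiated reweighting that typically arises from KL-regularized problems, and a projection that minimizes KL(target || softmax(theta)), chosen here only because its gradient is simple. The paper's exact objective and multiplier updates may differ.

```python
import numpy as np

def softmax(theta):
    """Parametric policy: a softmax over per-action logits."""
    z = np.exp(theta - theta.max())
    return z / z.sum()

def nonparametric_update(pi_old, adv_reward, adv_cost, lam=1.0, nu=0.5):
    """Step 1 (assumed form): reweight the current policy toward high reward
    advantage and away from high cost advantage, then renormalize."""
    log_w = np.log(pi_old) + (adv_reward - nu * adv_cost) / lam
    w = np.exp(log_w - log_w.max())  # subtract max for numerical stability
    return w / w.sum()

def project(target, theta, lr=1.0, steps=2000):
    """Step 2 (assumed form): first-order projection onto the softmax family.
    The gradient of KL(target || softmax(theta)) w.r.t. theta is
    softmax(theta) - target, so plain gradient descent suffices."""
    for _ in range(steps):
        theta = theta - lr * (softmax(theta) - target)
    return theta

# Hypothetical numbers: action 0 has the best reward advantage but incurs cost.
pi_old = np.array([0.5, 0.3, 0.2])
adv_reward = np.array([1.0, 0.0, -1.0])
adv_cost = np.array([1.0, 0.0, 0.0])

target = nonparametric_update(pi_old, adv_reward, adv_cost)
theta = project(target, np.zeros(3))
pi_new = softmax(theta)
print(target, pi_new)
```

In this sketch, raising `nu` shifts probability away from the costly action, which is the qualitative effect a cost constraint is meant to have; only gradients are used in the projection, in line with the "first-order" description.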

The talk and the corresponding paper were published at the NeurIPS 2020 virtual conference.

Similar Papers

16/11/2020

Safe Policy Learning for Continuous Control

Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Dueñez-Guzman, Mohammad Ghavamzadeh

5:20

12/07/2020

Constrained Markov Decision Processes via Backward Value Functions

Harsh Satija, Philip Amortila, Joelle Pineau

Keywords: Reinforcement Learning - General

10:40

03/05/2021

Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers

Ben Eysenbach, Shreyas Chaudhari, Swapnil Asawa, Sergey Levine, Ruslan Salakhutdinov

Keywords: reinforcement learning, domain adaptation, transfer learning

4:31

18/07/2021

Interaction-Grounded Learning

Tengyang Xie, John Langford, Paul Mineiro, Ida Momennejad

Keywords: Reinforcement Learning and Planning, Bandits

5:12

18/07/2021

CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee

Tengyu Xu, Yingbin Liang, Guanghui Lan

Keywords: Theory, RL, Decisions and Control Theory

5:15

06/12/2021

Outcome-Driven Reinforcement Learning via Variational Inference

Tim G. J. Rudner, Vitchyr Pong, Rowan McAllister, Yarin Gal, Sergey Levine

Keywords: reinforcement learning and planning, generative model

12:21

18/07/2021

Inverse Constrained Reinforcement Learning

Shehryar Malik, Usman Anwar, Alireza Aghasi, Ali Ahmed

Keywords: Reinforcement Learning and Planning, Deep RL

5:30

26/08/2020

Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning

Tianyu Li, Bogdan Mazoure, Doina Precup, Guillaume Rabusseau

13:49

06/12/2021

Safe Reinforcement Learning with Natural Language Constraints

Tsung-Yen Yang, Michael Y Hu, Yinlam Chow, Peter J Ramadge, Karthik Narasimhan

Keywords: reinforcement learning and planning

13:32

06/12/2021

Counterexample Guided RL Policy Refinement Using Bayesian Optimization

Briti Gangopadhyay, Pallab Dasgupta

Keywords: optimization, reinforcement learning and planning

12:49

06/12/2021

Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies

Tim Seyde, Igor Gilitschenski, Wilko Schwarting, Bartolomeo Stellato, Martin Riedmiller, Markus Wulfmeier, Daniela Rus

Keywords: reinforcement learning and planning

6:48

19/08/2021

Policy Learning with Constraints in Model-free Reinforcement Learning: A Survey

Yongshuai Liu, Avishai Halev, Xin Liu

Keywords: Machine learning, General

15:25

12/07/2020

Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning

Zhaohan Guo, Bernardo Avila Pires, Mohammad Gheshlaghi Azar, Bilal Piot, Florent Altché, Jean-Bastien Grill, Remi Munos

Keywords: Reinforcement Learning - Deep RL

12:47

06/12/2021

Discovery of Options via Meta-Learned Subgoals

Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh

Keywords: deep learning, reinforcement learning and planning

4:13

16/11/2020

Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments

Jun Yamada, Youngwoon Lee, Gautam Salhotra, Karl Pertsch, Max Pflueger, Gaurav Sukhatme, Joseph Lim, Peter Englert

4:59

18/07/2021

Task-Optimal Exploration in Linear Dynamical Systems

Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

Keywords: Theory, RL, Decisions and Control Theory

19:33

16/11/2020

PLAS: Latent Action Space for Offline Reinforcement Learning

Wenxuan Zhou, Sujay Bajracharya, David Held

5:06

26/04/2020

Maximum Likelihood Constraint Inference for Inverse Reinforcement Learning

Dexter R.R. Scobee, S. Shankar Sastry

Keywords: learning from demonstration, inverse reinforcement learning, constraint inference

5:19

18/07/2021

Density Constrained Reinforcement Learning

Zengyi Qin, Yuxiao Chen, Chuchu Fan

Keywords: Reinforcement Learning and Planning

4:50

03/05/2021

Conservative Safety Critics for Exploration

Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, Sergey Levine, Florian Shkurti, Animesh Garg

Keywords: Safe exploration, Reinforcement Learning

5:14

12/07/2020

Responsive Safety in Reinforcement Learning

Adam Stooke, Joshua Achiam, Pieter Abbeel

Keywords: Reinforcement Learning - Deep RL

13:36

02/02/2021

Online 3D Bin Packing with Constrained Deep Reinforcement Learning

Hang Zhao, Qijin She, Chenyang Zhu, Yin Yang, Kai Xu

17:42

02/02/2021

WCSAC: Worst-Case Soft Actor Critic for Safety-Constrained Reinforcement Learning

Qisong Yang, Thiago D. Simão, Simon H Tindemans, Matthijs T. J. Spaan

17:28

18/07/2021

Learning Routines for Effective Off-Policy Reinforcement Learning

Edoardo Cetin, Oya Celiktutan

Keywords: Reinforcement Learning and Planning, Deep RL

5:17

02/02/2021

Advice-Guided Reinforcement Learning in a non-Markovian Environment

Daniel Neider, Jean-Raphael Gaglione, Ivan Gavran, Ufuk Topcu, Bo Wu, Zhe Xu

18:07

26/04/2020

Keep Doing What Worked: Behavior Modelling Priors for Offline Reinforcement Learning

Noah Siegel, Jost Tobias Springenberg, Felix Berkenkamp, Abbas Abdolmaleki, Michael Neunert, Thomas Lampe, Roland Hafner, Nicolas Heess, Martin Riedmiller

Keywords: Reinforcement Learning, Off-policy, Multitask, Continuous Control

5:04

26/08/2020

Truly Batch Model-Free Inverse Reinforcement Learning about Multiple Intentions

Giorgia Ramponi, Amarildo Likmeta, Alberto Maria Metelli, Andrea Tirinzoni, Marcello Restelli

9:41

02/02/2021

Temporal-Logic-Based Reward Shaping for Continuing Reinforcement Learning Tasks

Yuqian Jiang, Suda Bharadwaj, Bo Wu, Rishi Shah, Ufuk Topcu, Peter Stone

15:40

06/12/2021

Towards Robust Bisimulation Metric Learning

Mete Kemertas, Tristan Aumentado-Armstrong

Keywords: reinforcement learning and planning, robustness, representation learning

12:24

06/12/2020

The Value Equivalence Principle for Model-Based Reinforcement Learning

Christopher Grimm, Andre Barreto, Satinder Singh, David Silver

3:19

06/12/2021

Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs

Harsh Satija, Philip S. Thomas, Joelle Pineau, Romain Laroche

Keywords: reinforcement learning and planning

12:27

02/02/2021

Right for Better Reasons: Training Differentiable Models by Constraining their Influence Functions

Xiaoting Shao, Arseny Skryagin, Wolfgang Stammer, Patrick Schramowski, Kristian Kersting

19:08

13/04/2021

Provably efficient safe exploration via primal-dual policy optimization

Dongsheng Ding, Xiaohan Wei, Zhuoran Yang, Zhaoran Wang, Mihailo Jovanovic

3:07

03/05/2021

HyperDynamics: Meta-Learning Object and Agent Dynamics with Hypernetworks

Zhou Xian, Shamit Lal, Hsiao-Yu Tung, Anthony Platanios, Katerina Fragkiadaki

5:46

12/07/2020

What Can Learned Intrinsic Rewards Capture?

Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh

Keywords: Reinforcement Learning - Deep RL

14:47

26/10/2020

Symbolic Plans as High-Level Instructions for Reinforcement Learning

León Illanes, Xi Yan, Rodrigo Toro Icarte, Sheila A. McIlraith

Keywords: Planning, Reinforcement Learning, Sparse rewards, Sample efficiency, High-level instructions

9:06

26/04/2020

Dynamics-Aware Unsupervised Skill Discovery

Archit Sharma, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman

Keywords: reinforcement learning, unsupervised learning, model-based learning, deep learning, hierarchical reinforcement learning

15:15

18/07/2021

A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning

Dong Ki Kim, Miao Liu, Matthew Riemer, Chuangchuang Sun, Marwa Abdulhai, Golnaz Habibi, Sebastian Lopez-Cot, Gerald Tesauro, Jonathan How

Keywords: Reinforcement Learning and Planning, Multi-Agent RL, Algorithms, Representation Learning, Algorithms, Relational Learning

5:20

16/11/2020

Positive-Unlabeled Reward Learning

Danfei Xu, Misha Denil

5:04

26/04/2020

Posterior sampling for multi-agent reinforcement learning: solving extensive games with imperfect information

Yichi Zhou, Jialian Li, Jun Zhu

12:55