Learning to Score Behaviors for Guided Policy Optimization

12/07/2020

Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Krzysztof Choromanski, Anna Choromanska, Michael Jordan

Keywords: Reinforcement Learning - General

Abstract: We introduce a new approach for comparing reinforcement learning policies, using Wasserstein distances (WDs) in a newly defined latent behavioral space. We show that by utilizing the dual formulation of the WD, we can learn score functions over policy behaviors that can in turn be used to lead policy optimization towards (or away from) (un)desired behaviors. Combined with smoothed WDs, the dual formulation allows us to devise efficient algorithms that take stochastic gradient descent steps through WD regularizers. We incorporate these regularizers into two novel on-policy algorithms, Behavior-Guided Policy Gradient and Behavior-Guided Evolution Strategies, which we demonstrate can outperform existing methods in a variety of challenging environments. We also provide an open source demo.
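The idea of learning score functions through a smoothed WD dual can be illustrated with a small sketch. It is not the authors' implementation: it assumes the entropy-regularized dual in the form commonly used in smoothed optimal transport, a squared Euclidean cost, and one tabular score per sample instead of a parametric score function. The name `smoothed_wd_dual`, the parameters `gamma`, `lr`, and `steps`, and the toy 2-D "behavior embeddings" are all hypothetical.

```python
import numpy as np

def smoothed_wd_dual(x, y, gamma=1.0, lr=0.2, steps=500):
    """Gradient ascent on an entropy-smoothed Wasserstein dual (assumed form):
        mean_i u_i - mean_j v_j - gamma * mean_ij exp((u_i - v_j - C_ij) / gamma)
    x: (n, d) behavior embeddings of one policy, y: (m, d) of another.
    Returns the dual value and the learned per-sample scores u and v."""
    # Squared Euclidean cost between every pair of embeddings (an assumption).
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    u = np.zeros(len(x))  # score attached to each behavior in x
    v = np.zeros(len(y))  # score attached to each behavior in y
    for _ in range(steps):
        e = np.exp((u[:, None] - v[None, :] - cost) / gamma)
        # Exact gradients of the objective, rescaled by n and m respectively.
        u += lr * gamma * (1.0 - e.mean(axis=1))
        v += lr * gamma * (e.mean(axis=0) - 1.0)
    e = np.exp((u[:, None] - v[None, :] - cost) / gamma)
    value = u.mean() - v.mean() - gamma * e.mean()
    return value, u, v

# Toy data: one reference behavior cloud, one similar cloud, one distant cloud.
rng = np.random.default_rng(0)
behaviors = rng.normal(0.0, 0.1, size=(32, 2))
near = rng.normal(0.0, 0.1, size=(32, 2))
far = rng.normal(0.0, 0.1, size=(32, 2)) + np.array([5.0, 0.0])

wd_near, u_near, v_near = smoothed_wd_dual(behaviors, near)
wd_far, u_far, v_far = smoothed_wd_dual(behaviors, far)
print(wd_near, wd_far)
```

In this sketch the smoothed value is biased relative to the exact WD and can be negative for nearly identical behavior sets, so only comparisons between values are meaningful. The scores `u` and `v` are what a behavior-guided method could feed back into the policy objective as a regularizer; how the paper does this is not specified on this page.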

The talk and the accompanying paper were published at the ICML 2020 virtual conference.

Similar Papers

13/04/2021

Logistic q-learning

Joan Bas-Serrano, Sebastian Curi, Andreas Krause, Gergely Neu

2:44

06/12/2021

Local policy search with Bayesian optimization

Sarah Müller, Alexander von Rohr, Sebastian Trimpe

Keywords: theory, optimization, reinforcement learning and planning, active learning

11:42

19/08/2021

Variational Model-based Policy Optimization

Yinlam Chow, Brandon Cui, Moonkyung Ryu, Mohammad Ghavamzadeh

Keywords: Machine Learning, Reinforcement Learning

15:31

26/08/2020

Variance Reduction for Evolution Strategies via Structured Control Variates

Yunhao Tang, Krzysztof Choromanski, Alp Kucukelbir

13:37

06/12/2020

Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces

Guy Lorberbom, Chris J. Maddison, Nicolas Heess, Tamir Hazan, Daniel Tarlow

3:16

06/12/2021

Model Selection for Bayesian Autoencoders

Ba-Hien Tran, Simone Rossi, Dimitrios Milios, Pietro Michiardi, Edwin Bonilla, Maurizio Filippone

Keywords: optimization, self-supervised learning, generative model, representation learning

10:49

18/07/2021

Offline Reinforcement Learning with Fisher Divergence Critic Regularization

Ilya Kostrikov, Rob Fergus, Jonathan Tompson, Ofir Nachum

Keywords: Reinforcement Learning and Planning, Deep RL

4:49

26/08/2020

Practical Nonisotropic Monte Carlo Sampling in High Dimensions via Determinantal Point Processes

Krzysztof Choromanski, Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang

12:42

16/11/2020

Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity

Tanmay Gangwani, Jian Peng, Yuan Zhou

4:27

06/12/2020

Deep Rao-Blackwellised Particle Filters for Time Series Forecasting

Richard Kurle, Syama Sundar Rangapuram, Emmanuel de Bézenac, Stephan Günnemann, Jan Gasthaus

3:14

06/12/2021

Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation

Yunhao Tang, Tadashi Kozuno, Mark Rowland, Remi Munos, Michal Valko

Keywords: reinforcement learning and planning, meta learning

12:45

06/12/2021

Structured Dropout Variational Inference for Bayesian Neural Networks

Son Nguyen, Duong Nguyen, Khai Nguyen, Khoat Than, Hung Bui, Nhat Ho

Keywords: deep learning, generative model

11:28

18/07/2021

Representation Matters: Offline Pretraining for Sequential Decision Making

Mengjiao Yang, Ofir Nachum

Keywords: Reinforcement Learning and Planning

5:06

03/08/2020

No-regret Exploration in Contextual Reinforcement Learning

Aditya Modi, Ambuj Tewari

8:19

26/04/2020

Empirical Bayes Transductive Meta-Learning with Synthetic Gradients

Shell Xu Hu, Pablo Moreno, Yang Xiao, Xi Shen, Guillaume Obozinski, Neil Lawrence, Andreas Damianou

Keywords: Meta-learning, Empirical Bayes, Synthetic Gradient, Information Bottleneck

4:47

19/08/2021

MapGo: Model-Assisted Policy Optimization for Goal-Oriented Tasks

Menghui Zhu, Minghuan Liu, Jian Shen, Zhicheng Zhang, Sheng Chen, Weinan Zhang, Deheng Ye, Yong Yu, Qiang Fu, Wei Yang

Keywords: Machine Learning, Deep Reinforcement Learning, Reinforcement Learning

11:28

12/07/2020

Margin-aware Adversarial Domain Adaptation with Optimal Transport

Sofien Dhouib, Ievgen Redko, Carole Lartizien

Keywords: Transfer, Multitask and Meta-learning

14:11

19/08/2021

On Guaranteed Optimal Robust Explanations for NLP Models

Emanuele La Malfa, Rhiannon Michelmore, Agnieszka M. Zbrzezny, Nicola Paoletti, Marta Kwiatkowska

Keywords: Machine Learning, Adversarial Machine Learning, Explainable/Interpretable Machine Learning, Sentiment Analysis and Text Mining

14:52

06/12/2021

Loss function based second-order Jensen inequality and its application to particle variational inference

Futoshi Futami, Tomoharu Iwata, Naonori Ueda, Issei Sato, Masashi Sugiyama

Keywords: optimization, generative model

14:09

12/07/2020

Provably Efficient Model-based Policy Adaptation

Yuda Song, Aditi Mavalankar, Wen Sun, Sicun Gao

Keywords: Reinforcement Learning - Deep RL

15:49

03/05/2021

Learning Robust State Abstractions for Hidden-Parameter Block MDPs

Amy Zhang, Shagun Sodhani, Khimya Khetarpal, Joelle Pineau

Keywords: bisimulation, block mdp, hidden-parameter mdp, multi-task reinforcement learning

4:17

18/07/2021

Asynchronous Decentralized Optimization With Implicit Stochastic Variance Reduction

Kenta Niwa, Guoqiang Zhang, W. Bastiaan Kleijn, Noboru Harada, Hiroshi Sawada, Akinori Fujino

Keywords: Optimization, Distributed and Parallel Optimization

5:41

26/08/2020

Discrete Action On-Policy Learning with Action-Value Critic

Yuguang Yue, Yunhao Tang, Mingzhang Yin, Mingyuan Zhou

14:23

26/04/2020

Meta-Q-Learning

Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, Alexander J. Smola

Keywords: meta reinforcement learning, propensity estimation, off-policy

15:50

19/08/2021

Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts

Weinan Zhang, Xihuai Wang, Jian Shen, Ming Zhou

Keywords: Machine Learning, Deep Reinforcement Learning, Multi-agent Learning

13:10

12/07/2020

Structured Policy Iteration for Linear Quadratic Regulator

Youngsuk Park, Ryan Rossi, Zheng Wen, Gang Wu, Handong Zhao

Keywords: Reinforcement Learning - General

16:08

12/07/2020

Stochastically Dominant Distributional Reinforcement Learning

John Martin, Michal Lyskawinski, Xiaohu Li, Brendan Englot

Keywords: Trustworthy Machine Learning

12:18

12/07/2020

Adaptive Region-Based Active Learning

Corinna Cortes, Giulia DeSalvo, Claudio Gentile, Mehryar Mohri, Ningshan Zhang

Keywords: Online Learning, Active Learning, and Bandits

15:41

03/05/2021

Efficient Wasserstein Natural Gradients for Reinforcement Learning

Ted Moskovitz, Michael Arbel, Ferenc Huszar, Arthur Gretton

Keywords: reinforcement learning, optimization

4:28

03/05/2021

Set Prediction without Imposing Structure as Conditional Density Estimation

David W Zhang, Gertjan J Burghouts, Cees G Snoek

Keywords: energy based models, set prediction

5:02

26/08/2020

Finite-Time Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation

Jun Sun, Gang Wang, Georgios B. Giannakis, Qinmin Yang, Zaiyue Yang

17:07

26/04/2020

Bayesian Meta Sampling for Fast Uncertainty Adaptation

Zhenyi Wang, Yang Zhao, Ping Yu, Ruiyi Zhang, Changyou Chen

Keywords: Bayesian Sampling, Uncertainty Adaptation, Meta Learning, Variational Inference

4:44

14/06/2020

Enhanced Transport Distance for Unsupervised Domain Adaptation

Mengxue Li, Yi-Ming Zhai, You-Wei Luo, Peng-Fei Ge, Chuan-Xian Ren

Keywords: uda, optimal transport, neural networks, attention mechanism, kantorovich potential

0:58

26/08/2020

Deep Active Learning: Unified and Principled Method for Query and Training

Changjian Shui, Fan Zhou, Christian Gagné, Boyu Wang

12:12

06/12/2021

A Max-Min Entropy Framework for Reinforcement Learning

Seungyul Han, Youngchul Sung

Keywords: optimization, reinforcement learning and planning

14:35

18/07/2021

DriftSurf: Stable-State / Reactive-State Learning under Concept Drift

Ashraf Tahmasbi, Ellango Jothimurugesan, Srikanta Tirthapura, Phil Gibbons

Keywords: Algorithms, Online Learning Algorithms

5:07

18/07/2021

Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning

Jongwook Choi, Archit Sharma, Honglak Lee, Sergey Levine, Shixiang Gu

Keywords: Neuroscience and Cognitive Science, Neuroscience, Reinforcement Learning and Planning, Algorithms, Representation Learning; Algorithms, Sparse Coding and Dimensionality Expansion; Applications, Matrix and Ten

5:16

26/04/2020

Frequency-based Search-control in Dyna

Yangchen Pan, Jincheng Mei, Amir-massoud Farahmand

Keywords: Model-based reinforcement learning, search-control, Dyna, frequency of a signal

4:32

06/12/2020

Generalised Bayesian Filtering via Sequential Monte Carlo

Ayman Boustati, Omer Deniz Akyildiz, Theo Damoulas, Adam Johansen

2:52

06/12/2021

Bridging Explicit and Implicit Deep Generative Models via Neural Stein Estimators

Qitian Wu, Rui Gao, Hongyuan Zha

Keywords: generative model

12:51