Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity

16/11/2020

Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity

Tanmay Gangwani, Jian Peng, Yuan Zhou

Keywords:

Abstract Paper Code Similar Papers

Abstract: Quality-Diversity (QD) is a concept from Neuroevolution with some intriguing applications to Reinforcement Learning. It facilitates learning a population of agents where each member is optimized to simultaneously accumulate high task-returns and exhibit behavioral diversity compared to other members. In this paper, we build on a recent kernel-based method for training a QD policy ensemble with Stein variational gradient descent. With kernels based on f-divergence between the stationary distributions of policies, we convert the problem to that of efficient estimation of the ratio of these stationary distributions. We then study various distribution ratio estimators used previously for off-policy evaluation and imitation and re-purpose them to compute the gradients for policies in an ensemble such that the resultant population is diverse and of high-quality.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at CoRL 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

26/04/2020

Frequency-based Search-control in Dyna

Yangchen Pan, Jincheng Mei, Amir-massoud Farahmand

Keywords Paper

Model-based reinforcement learning, search-control, Dyna, frequency of a signal

0

0

0

0

4:32

06/12/2021

Local policy search with Bayesian optimization

Sarah Müller, Alexander von Rohr, Sebastian Trimpe

Keywords Paper

theory, optimization, reinforcement learning and planning, active learning

0

0

0

0

11:42

19/08/2021

Variational Model-based Policy Optimization

Yinlam Chow, Brandon Cui, Moonkyung Ryu, Mohammad Ghavamzadeh

Keywords Paper

Machine Learning, Reinforcement Learning

0

0

0

0

15:31

26/04/2020

Multi-Agent Interactions Modeling with Correlated Policies

Minghuan Liu, Ming Zhou, Weinan Zhang and
Yuzheng Zhuang, Jun Wang, Wulong Liu, Yong Yu

Keywords Paper

Multi-agent reinforcement learning, Imitation learning

0

0

0

0

4:33

06/12/2021

Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation

Yunhao Tang, Tadashi Kozuno, Mark Rowland and
Remi Munos, Michal Valko

Keywords Paper

reinforcement learning and planning, meta learning

0

0

0

0

12:45

12/07/2020

Learning to Score Behaviors for Guided Policy Optimization

Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang and
Krzysztof Choromanski, Anna Choromanska, Michael Jordan

Keywords Paper

Reinforcement Learning - General

0

0

0

0

14:10

19/08/2021

Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts

Weinan Zhang, Xihuai Wang, Jian Shen, Ming Zhou

Keywords Paper

Machine Learning, Deep Reinforcement Learning, Multi-agent Learning

0

0

0

0

13:10

06/12/2021

On the Convergence Theory of Debiased Model-Agnostic Meta-Reinforcement Learning

Alireza Fallah, Kristian Georgiev, Aryan Mokhtari, Asuman Ozdaglar

Keywords Paper

theory, optimization, reinforcement learning and planning, meta learning

0

1

1

0

12:25

06/12/2021

Variance-Aware Off-Policy Evaluation with Linear Function Approximation

Yifei Min, Tianhao Wang, Dongruo Zhou, Quanquan Gu

Keywords Paper

theory, reinforcement learning and planning

0

0

0

0

12:17

06/12/2020

Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces

Guy Lorberbom, Chris J. Maddison, Nicolas Heess and
Tamir Hazan, Daniel Tarlow

Keywords Paper

0

0

0

0

3:16

06/12/2020

Distributionally Robust Federated Averaging

Yuyang Deng, Mohammad Mahdi Kamani, Mehrdad Mahdavi

Keywords Paper

0

0

0

0

3:11

13/04/2021

Logistic q-learning

Joan Bas-Serrano, Sebastian Curi, Andreas Krause, Gergely Neu

Keywords Paper

0

0

0

0

2:44

02/02/2021

Bayesian Distributional Policy Gradients

Luchen Li, A. Aldo Faisal

Keywords Paper

1

0

0

0

18:06

26/04/2020

RaCT: Toward Amortized Ranking-Critical Training For Collaborative Filtering

Sam Lobel, Chunyuan Li, Jianfeng Gao, Lawrence Carin

Keywords Paper

Collaborative Filtering, Recommender Systems, Actor-Critic, Learned Metrics

0

0

0

0

5:27

26/04/2020

Bayesian Meta Sampling for Fast Uncertainty Adaptation

Zhenyi Wang, Yang Zhao, Ping Yu and
Ruiyi Zhang, Changyou Chen

Keywords Paper

Bayesian Sampling, Uncertainty Adaptation, Meta Learning, Variational Inference

0

0

0

0

4:44

12/07/2020

Stochastically Dominant Distributional Reinforcement Learning

John Martin, Michal Lyskawinski, Xiaohu Li, Brendan Englot

Keywords Paper

Trustworthy Machine Learning

0

0

0

0

12:18

03/05/2021

Set Prediction without Imposing Structure as Conditional Density Estimation

David W Zhang, Gertjan J Burghouts, Cees G Snoek

Keywords Paper

energy based models, set prediction

0

0

0

0

5:02

06/12/2021

Inverse Optimal Control Adapted to the Noise Characteristics of the Human Sensorimotor System

Matthias Schultheis, Dominik Straub, Constantin Rothkopf

Keywords Paper

0

0

0

0

9:29

06/12/2021

Loss function based second-order Jensen inequality and its application to particle variational inference

Futoshi Futami, Tomoharu Iwata, naonori ueda and
Issei Sato, Masashi Sugiyama

Keywords Paper

optimization, generative model

0

0

0

0

14:09

18/07/2021

Offline Reinforcement Learning with Fisher Divergence Critic Regularization

Ilya Kostrikov, Rob Fergus, Jonathan Tompson, Ofir Nachum

Keywords Paper

Reinforcement Learning and Planning, Deep RL

0

0

0

0

4:49

26/04/2020

Meta-Q-Learning

Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, Alexander J. Smola

Keywords Paper

meta reinforcement learning, propensity estimation, off-policy

0

0

0

0

15:50

18/07/2021

Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning

Shariq Iqbal, Christian Schroeder, Bei Peng and
Wendelin Boehmer, Shimon Whiteson, Fei Sha

Keywords Paper

Optimization, Convex Optimization, Reinforcement Learning and Planning, Multi-Agent RL, Algorithms, Large Scale Learning; Probabilistic Methods, Distributed Inference

0

0

0

0

20:08

06/12/2020

Sharper Generalization Bounds for Pairwise Learning

Yunwen Lei, Antoine Ledent, Marius Kloft

Keywords Paper

0

0

0

0

3:20

18/07/2021

Provably Efficient Algorithms for Multi-Objective Competitive RL

Tiancheng Yu, Yi Tian, Jingzhao Zhang, Suvrit Sra

Keywords Paper

Theory, RL, Decisions and Control Theory

0

0

0

0

17:04

18/07/2021

Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time

Weichen Wang, Jiequn Han, Zhuoran Yang, Zhaoran Wang

Keywords Paper

Theory

0

0

0

0

5:03

19/08/2021

Learning Nash Equilibria in Zero-Sum Stochastic Games via Entropy-Regularized Policy Approximation

Yue Guan, Qifan Zhang, Panagiotis Tsiotras

Keywords Paper

Machine Learning, Reinforcement Learning, Multi-agent Learning, Noncooperative Games

0

0

0

0

12:26

18/07/2021

Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning

Anuj Mahajan, Mikayel Samvelyan, Lei Mao and
Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Anima Anandkumar

Keywords Paper

Reinforcement Learning and Planning, Multi-Agent RL

0

0

0

0

5:16

06/12/2020

Model-based Adversarial Meta-Reinforcement Learning

Zichuan Lin, Garrett Thomas, Guangwen Yang, Tengyu Ma

Keywords Paper

0

0

0

0

3:31

26/08/2020

Finite-Time Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation

Jun Sun, Gang Wang, Georgios B. Giannakis and
Qinmin Yang, Zaiyue Yang

Keywords Paper

0

0

0

0

17:07

18/07/2021

Annealed Flow Transport Monte Carlo

Michael Arbel, Alexander Matthews, Arnaud Doucet

Keywords Paper

Probabilistic Methods, Monte Carlo Methods

0

0

0

0

17:28

26/04/2020

Empirical Bayes Transductive Meta-Learning with Synthetic Gradients

Shell Xu Hu, Pablo Moreno, Yang Xiao and
Xi Shen, Guillaume Obozinski, Neil Lawrence, Andreas Damianou

Keywords Paper

Meta-learning, Empirical Bayes, Synthetic Gradient, Information Bottleneck

0

0

0

0

4:47

12/07/2020

Data Valuation using Reinforcement Learning

Jinsung Yoon, Sercan Arik, Tomas Pfister

Keywords Paper

Deep Learning - Algorithms

0

0

0

0

14:35

06/12/2020

Robust Multi-Agent Reinforcement Learning with Model Uncertainty

Kaiqing Zhang, TAO SUN, Yunzhe Tao and
Sahika Genc, Sunil Mallya, Tamer Basar

Keywords Paper

0

0

0

0

3:11

06/12/2021

Explicable Reward Design for Reinforcement Learning Agents

Rati Devidze, Goran Radanovic, Parameswaran Kamalaruban, Adish Singla

Keywords Paper

optimization, reinforcement learning and planning, interpretability

0

0

0

0

4:10

02/02/2021

The Value-Improvement Path: Towards Better Representations for Reinforcement Learning

Will Dabney, André Barreto, Mark Rowland and
Robert Dadashi, John Quan, Marc G. Bellemare, David Silver

Keywords Paper

0

0

0

0

20:06

03/05/2021

Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime

Atsushi Nitanda, Taiji Suzuki

Keywords Paper

stochastic gradient descent, neural tangent kernel, over-parameterization, two-layer neural network

0

0

0

0

18:48

06/12/2021

Generalized Proximal Policy Optimization with Sample Reuse

James Queeney, Yannis Paschalidis, Christos G Cassandras

Keywords Paper

optimization, reinforcement learning and planning

0

0

0

0

13:45

06/12/2021

Out-of-Distribution Generalization in Kernel Regression

Abdulkadir Canatar, Blake Bordelon, Cengiz Pehlevan

Keywords Paper

theory, deep learning, machine learning

0

0

0

0

15:07

03/05/2021

C-Learning: Learning to Achieve Goals via Recursive Classification

Ben Eysenbach, Ruslan Salakhutdinov, Sergey Levine

Keywords Paper

reinforcement learning, goal reaching, density estimation, hindsight relabeling, Q-learning

0

0

0

0

5:09

18/07/2021

A Wasserstein Minimax Framework for Mixed Linear Regression

Theo Diamandis, Yonina Eldar, Alireza Fallah and
Farzan Farnia, Asuman Ozdaglar

Keywords Paper

Algorithms, Multimodal Learning

0

0

0

0

25:41