Recomposing the Reinforcement Learning Building Blocks with Hypernetworks

18/07/2021

Recomposing the Reinforcement Learning Building Blocks with Hypernetworks

Elad Sarafian, Shai Keynan, Sarit Kraus

Keywords: Reinforcement Learning and Planning, Deep RL

Abstract Paper Similar Papers

Abstract: The Reinforcement Learning (RL) building blocks, i.e. $Q$-functions and policy networks, usually take elements from the cartesian product of two domains as input. In particular, the input of the $Q$-function is both the state and the action, and in multi-task problems (Meta-RL) the policy can take a state and a context. Standard architectures tend to ignore these variables' underlying interpretations and simply concatenate their features into a single vector. In this work, we argue that this choice may lead to poor gradient estimation in actor-critic algorithms and high variance learning steps in Meta-RL algorithms. To consider the interaction between the input variables, we suggest using a Hypernetwork architecture where a primary network determines the weights of a conditional dynamic network. We show that this approach improves the gradient approximation and reduces the learning step variance, which both accelerates learning and improves the final performance. We demonstrate a consistent improvement across different locomotion tasks and different algorithms both in RL (TD3 and SAC) and in Meta-RL (MAML and PEARL).

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at ICML 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

12/07/2020

Interference and Generalization in Temporal Difference Learning

Emmanuel Bengio, Joelle Pineau, Doina Precup

Keywords Paper

Reinforcement Learning - Deep RL

0

0

0

0

10:58

26/04/2020

Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation

Xinjie Fan, Yizhe Zhang, Zhendong Wang, Mingyuan Zhou

Keywords Paper

binary softmax, discrete variables, policy gradient, pseudo actions, reinforcement learning, variance reduction

0

0

0

0

4:59

18/07/2021

Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning

Anuj Mahajan, Mikayel Samvelyan, Lei Mao and
Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Anima Anandkumar

Keywords Paper

Reinforcement Learning and Planning, Multi-Agent RL

0

0

0

0

5:16

02/02/2021

Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning

Wenzhen Huang, Qiyue Yin, Junge Zhang, Kaiqi Huang

Keywords Paper

0

0

0

0

14:38

26/04/2020

CAQL: Continuous Action Q-Learning

Moonkyung Ryu, Yinlam Chow, Ross Anderson and
Christian Tjandraatmadja, Craig Boutilier

Keywords Paper

Reinforcement learning (RL), DQN, Continuous control, Mixed-Integer Programming (MIP)

0

0

0

0

5:36

19/08/2021

Variational Model-based Policy Optimization

Yinlam Chow, Brandon Cui, Moonkyung Ryu, Mohammad Ghavamzadeh

Keywords Paper

Machine Learning, Reinforcement Learning

0

0

0

0

15:31

26/04/2020

Ranking Policy Gradient

Kaixiang Lin, Jiayu Zhou

Keywords Paper

Sample-efficient reinforcement learning, off-policy learning.

0

0

0

0

5:43

06/12/2020

An Efficient Asynchronous Method for Integrating Evolutionary and Gradient-based Policy Search

Kyunghyun Lee, Byeong-Uk Lee, Ukcheol Shin, In So Kweon

Keywords Paper

0

0

0

0

3:19

06/12/2021

Settling the Variance of Multi-Agent Policy Gradients

Jakub Grudzien Kuba, Muning Wen, Linghui Meng and
shangding gu, Haifeng Zhang, David Mguni, Jun Wang, Yaodong Yang

Keywords Paper

deep learning, reinforcement learning and planning

0

0

0

0

13:12

18/07/2021

Offline Reinforcement Learning with Fisher Divergence Critic Regularization

Ilya Kostrikov, Rob Fergus, Jonathan Tompson, Ofir Nachum

Keywords Paper

Reinforcement Learning and Planning, Deep RL

0

0

0

0

4:49

06/12/2020

FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs

Alekh Agarwal, Sham Kakade, Akshay Krishnamurthy, Wen Sun

Keywords Paper

0

0

0

0

3:34

26/04/2020

Learning Nearly Decomposable Value Functions Via Communication Minimization

Tonghan Wang, Jianhao Wang, Chongyi Zheng, Chongjie Zhang

Keywords Paper

Multi-agent reinforcement learning, Nearly decomposable value function, Minimized communication

0

0

0

0

5:00

06/12/2021

Contrastively Disentangled Sequential Variational Autoencoder

Junwen Bai, Weiran Wang, Carla Gomes

Keywords Paper

self-supervised learning, generative model, contrastive learning, representation learning, interpretability

0

0

0

0

12:53

03/05/2021

Correcting experience replay for multi-agent communication

Sanjeevan Ahilan, Peter Dayan

Keywords Paper

multi-agent reinforcement learning, communication, experience replay, relabelling

1

0

0

0

10:31

06/12/2021

Offline Reinforcement Learning as One Big Sequence Modeling Problem

Michael Janner, Qiyang Li, Sergey Levine

Keywords Paper

reinforcement learning and planning, transformers, language

0

0

0

0

9:48

16/11/2020

Generative adversarial training of product of policies for robust and adaptive movement primitives

Emmanuel Pignat, Hakan Girgin, Sylvain Calinon

Keywords Paper

0

0

0

0

4:26

06/12/2021

Counterfactual Maximum Likelihood Estimation for Training Deep Networks

Xinyi Wang, Wenhu Chen, Michael Saxon, William Yang Wang

Keywords Paper

deep learning, domain adaptation, causality, language

0

0

0

0

14:07

12/07/2020

Minimax Weight and Q-Function Learning for Off-Policy Evaluation

Masatoshi Uehara, Jiawei Huang, Nan Jiang

Keywords Paper

Reinforcement Learning - Theory

0

0

0

0

14:20

06/12/2020

The Value Equivalence Principle for Model-Based Reinforcement Learning

Christopher Grimm, Andre Barreto, Satinder Singh, David Silver

Keywords Paper

0

0

0

0

3:19

18/07/2021

A Distribution-dependent Analysis of Meta Learning

Mikhail Konobeev, Ilja Kuzborskij, Csaba Szepesvari

Keywords Paper

Theory, Statistical Learning Theory

0

0

0

0

5:06

06/12/2021

Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning

ZHENHUAN YANG, Yunwen Lei, Puyu Wang and
Tianbao Yang, Yiming Ying

Keywords Paper

optimization, machine learning, privacy

0

0

0

0

14:40

18/07/2021

A Wasserstein Minimax Framework for Mixed Linear Regression

Theo Diamandis, Yonina Eldar, Alireza Fallah and
Farzan Farnia, Asuman Ozdaglar

Keywords Paper

Algorithms, Multimodal Learning

0

0

0

0

25:41

12/07/2020

Representation Learning via Adversarially-Contrastive Optimal Transport

Anoop Cherian, Shuchin Aeron

Keywords Paper

Representation Learning

0

0

0

0

14:47

06/12/2021

Towards Robust Bisimulation Metric Learning

Mete Kemertas, Tristan Aumentado-Armstrong

Keywords Paper

reinforcement learning and planning, robustness, representation learning

0

0

0

0

12:24

06/12/2021

FACMAC: Factored Multi-Agent Centralised Policy Gradients

Bei Peng, Tabish Rashid, Christian Schroeder de Witt and
Pierre-Alexandre Kamienny, Philip Torr, Wendelin Boehmer, Shimon Whiteson

Keywords Paper

reinforcement learning and planning

0

0

0

0

14:15

06/12/2020

Model-based Adversarial Meta-Reinforcement Learning

Zichuan Lin, Garrett Thomas, Guangwen Yang, Tengyu Ma

Keywords Paper

0

0

0

0

3:31

18/07/2021

Representation Matters: Offline Pretraining for Sequential Decision Making

Mengjiao Yang, Ofir Nachum

Keywords Paper

Reinforcement Learning and Planning

1

0

0

0

5:06

03/05/2021

Initialization and Regularization of Factorized Neural Layers

Misha Khodak, Neil Tenenholtz, Lester Mackey, Nicolo Fusi

Keywords Paper

matrix factorization, knowledge distillation, multi-head attention, model compression

0

0

0

0

4:25

26/04/2020

Implementation Matters in Deep RL: A Case Study on PPO and TRPO

Logan Engstrom, Andrew Ilyas, Shibani Santurkar and
Dimitris Tsipras, Firdaus Janoos, Larry Rudolph, Aleksander Madry

Keywords Paper

deep policy gradient methods, deep reinforcement learning, trpo, ppo

0

0

0

0

20:41

06/12/2021

Time-series Generation by Contrastive Imitation

Daniel Jarrett, Ioana Bica, Mihaela van der Schaar

Keywords Paper

generative model

0

0

0

0

8:47

06/12/2021

COMBO: Conservative Offline Model-Based Policy Optimization

Tianhe Yu, Aviral Kumar, Rafael Rafailov and
Aravind Rajeswaran, Sergey Levine, Chelsea Finn

Keywords Paper

deep learning, optimization, reinforcement learning and planning

0

0

0

0

12:35

02/02/2021

Reinforcement Learning Based Multi-Agent Resilient Control: From Deep Neural Networks to an Adaptive Law

Jian Hou, Fangyuan Wang, Lili Wang, Zhiyong Chen

Keywords Paper

0

0

0

0

15:48

06/12/2021

Regularized Softmax Deep Multi-Agent Q-Learning

Ling Pan, Tabish Rashid, Bei Peng and
Longbo Huang, Shimon Whiteson

Keywords Paper

reinforcement learning and planning

0

0

0

0

10:58

26/04/2020

Nesterov Accelerated Gradient and Scale Invariance for Adversarial Attacks

Jiadong Lin, Chuanbiao Song, Kun He and
Liwei Wang, John E. Hopcroft

Keywords Paper

adversarial examples, adversarial attack, transferability, Nesterov accelerated gradient, scale invariance

0

0

0

0

3:59

06/12/2021

Adversarial Intrinsic Motivation for Reinforcement Learning

Ishan Durugkar, Mauricio Tec, Scott Niekum, Peter Stone

Keywords Paper

reinforcement learning and planning, generative model

0

0

0

0

13:11

12/07/2020

Representations for Stable Off-Policy Reinforcement Learning

Dibya Ghosh, Marc Bellemare

Keywords Paper

Reinforcement Learning - General

0

0

0

0

14:38

06/12/2021

An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning

Tianpei Yang, Weixun Wang, Hongyao Tang and
Jianye Hao, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Yingfeng Chen, Yujing Hu, Changjie Fan, Chengwei Zhang

Keywords Paper

reinforcement learning and planning, transfer learning

0

0

0

0

15:21

06/12/2021

Consistency Regularization for Variational Auto-Encoders

Samarth Sinha, Adji Bousso Dieng

Keywords Paper

deep learning, machine learning, self-supervised learning, generative model, contrastive learning, representation learning

0

0

0

0

10:52

12/07/2020

Extrapolation for Large-batch Training in Deep Learning

Tao LIN, Lingjing Kong, Sebastian Stich, Martin Jaggi

Keywords Paper

Deep Learning - Algorithms

0

0

0

0

13:21

03/05/2021

QPLEX: Duplex Dueling Multi-Agent Q-Learning

Jianhao Wang, Zhizhou Ren, Terry Liu and
Yang Yu, Chongjie Zhang

Keywords Paper

Dueling structure, Value factorization, Multi-agent reinforcement learning

0

0

0

0

4:52