Inferring DQN structure for high-dimensional continuous control

12/07/2020

Andrey Sakryukin, Chedy Raissi, Mohan Kankanhalli

Keywords: Deep Learning - General

Abstract: Despite recent advances in Deep Reinforcement Learning, Deep Q-network (DQN) models still perform poorly on problems with high-dimensional action spaces. The problem is even more pronounced for high-dimensional continuous action spaces, due to a combinatorial increase in the number of outputs. Recent works approach the problem by dividing the network into multiple parallel or sequential (action) modules responsible for different discretized actions. However, both the parallel and the sequential approaches have drawbacks. Parallel module architectures lack coordination between action modules, which adds complexity to the task, while a sequential structure can suffer from vanishing gradients and an exploding parameter space. In this work, we show that the compositional structure of the action modules has a significant impact on model performance. We propose a novel approach to infer the network structure for DQN models operating with high-dimensional continuous actions. Our method is based on the uncertainty estimation techniques introduced in the paper. Our approach achieves state-of-the-art performance on MuJoCo environments with high-dimensional continuous action spaces. Furthermore, we demonstrate the improvement of the introduced approach on a realistic AAA sailing simulator game.
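The combinatorial growth mentioned in the abstract can be illustrated with a small counting sketch. It is not taken from the paper: the function names, the choice of 11 bins per dimension, and the simplification that a modular network has one output head per action dimension are all assumptions made for illustration.

```python
def joint_output_count(num_dims: int, bins_per_dim: int) -> int:
    """Q-value outputs needed when one head covers every combination
    of discretized sub-actions (hypothetical helper)."""
    return bins_per_dim ** num_dims


def modular_output_count(num_dims: int, bins_per_dim: int) -> int:
    """Q-value outputs needed when each action dimension gets its own
    module, assuming one head of `bins_per_dim` outputs per dimension."""
    return bins_per_dim * num_dims


if __name__ == "__main__":
    bins = 11  # illustrative discretization granularity, not from the paper
    for dims in (1, 3, 6):
        joint = joint_output_count(dims, bins)
        modular = modular_output_count(dims, bins)
        print(f"dims={dims}: joint={joint}, modular={modular}")
```

The joint count grows exponentially with the number of action dimensions, while the modular count grows linearly. That gap is the motivation for splitting the network into action modules. How those modules are arranged is the question the paper addresses.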

The talk and the corresponding paper were published at the ICML 2020 virtual conference.

Similar Papers

06/12/2021

Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks

Tolga Birdal, Aaron Lou, Leonidas Guibas, Umut Simsekli

Keywords: theory, deep learning, optimization

14:38

18/07/2021

Value Iteration in Continuous Actions, States and Time

Michael Lutter, Shie Mannor, Jan Peters, Dieter Fox, Animesh Garg

Keywords: Reinforcement Learning and Planning, Planning and Control

5:09

22/11/2021

FFNB: Forgetting-Free Neural Blocks for Deep Continual Learning

Hichem Sahbi, Haoming Zhan

Keywords: Continual and incremental learning, lifelong learning, catastrophic interference, catastrophic forgetting, dynamic neural networks, visual recognition

3:05

02/02/2021

Training Spiking Neural Networks with Accumulated Spiking Flow

Hao Wu, Yueyi Zhang, Wenming Weng, Yongting Zhang, Zhiwei Xiong, Zheng-Jun Zha, Xiaoyan Sun, Feng Wu

16:45

02/02/2021

Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model

Qizhou Wang, Bo Han, Tongliang Liu, Gang Niu, Jian Yang, Chen Gong

14:56

02/02/2021

Amata: An Annealing Mechanism for Adversarial Training Acceleration

Nanyang Ye, Qianxiao Li, Xiao-Yun Zhou, Zhanxing Zhu

14:30

03/05/2021

Representation Balancing Offline Model-based Reinforcement Learning

Byung-Jun Lee, Jongmin Lee, Kee-Eung Kim

Keywords: Off-policy policy evaluation, Batch Reinforcement Learning, Offline Reinforcement Learning, Model-based Reinforcement Learning, Reinforcement Learning

5:45

18/07/2021

Towards Better Robust Generalization with Shift Consistency Regularization

Shufei Zhang, Zhuang Qian, Kaizhu Huang, Qiufeng Wang, Rui Zhang, Xinping Yi

Keywords: Algorithms, Adversarial Examples

5:44

12/07/2020

Two Routes to Scalable Credit Assignment without Weight Symmetry

Daniel Kunin, Aran Nayebi, Javier Sagastuy-Brena, Surya Ganguli, Jonathan Bloom, Daniel Yamins

Keywords: Applications - Neuroscience, Cognitive Science, Biology and Health

14:12

06/12/2021

Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems

Wenqing Zheng, Qiangqiang Guo, Hao Yang, Peihao Wang, Zhangyang Wang

Keywords: reinforcement learning and planning, transformers

13:32

06/12/2021

Attention over Learned Object Embeddings Enables Complex Visual Reasoning

David Ding, Felix Hill, Adam Santoro, Malcolm Reynolds, Matt Botvinick

Keywords: deep learning, transformers, vision

18:51

14/06/2020

Learn2Perturb: An End-to-End Feature Perturbation Learning to Improve Adversarial Robustness

Ahmadreza Jeddi, Mohammad Javad Shafiee, Michelle Karg, Christian Scharfenberger, Alexander Wong

Keywords: adversarial robustness, network randomization, alternative back-propagation, trainable noise, adversarial training

1:01

03/05/2021

Go with the flow: Adaptive control for Neural ODEs

Mathieu Chalvidal, Matthew Ricci, Rufin VanRullen, Thomas Serre

Keywords: Neural ODEs, Normalizing flows, Hypernetworks, Optimal Control Theory

5:03

06/12/2021

Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity

Kaiqing Zhang, Xiangyuan Zhang, Bin Hu, Tamer Basar

Keywords: theory, optimization, reinforcement learning and planning

15:57

02/02/2021

Distribution Adaptive INT8 Quantization for Training CNNs

Kang Zhao, Sida Huang, Pan Pan, Yinghan Li, Yingya Zhang, Zhenyu Gu, Yinghui Xu

16:42

03/05/2021

Incorporating Symmetry into Deep Dynamics Models for Improved Generalization

Rui Wang, Robin Walters, Rose Yu

Keywords: AI for earth science, physics-guided deep learning, equivariant neural network, deep sequence model

4:46

06/12/2021

Adversarial Robustness with Semi-Infinite Constrained Learning

Alexander Robey, Luiz Chamon, George J. Pappas, Hamed Hassani, Alejandro Ribeiro

Keywords: theory, deep learning, optimization, robustness, adversarial robustness and security

14:48

02/02/2021

Theoretically Principled Deep RL Acceleration via Nearest Neighbor Function Approximation

Junhong Shen, Lin F. Yang

19:12

19/08/2021

Learning Deeper Non-Monotonic Networks by Softly Transferring Solution Space

Zheng-Fan Wu, Hui Xue, Weimin Bai

Keywords: Machine Learning, Kernel Methods, Deep Learning, Classification

12:50

30/11/2020

Bridging Adversarial and Statistical Domain Transfer via Spectral Adaptation Networks

Christoph Raab, Philipp Väth, Peter Meier, Frank-Michael Schleif

10:07

14/06/2020

Factorized Higher-Order CNNs With an Application to Spatio-Temporal Emotion Estimation

Jean Kossaifi, Antoine Toisoul, Adrian Bulat, Yannis Panagakis, Timothy M. Hospedales, Maja Pantic

Keywords: tensor methods, deep learning, spatiotemporal, emotion, cnn, tensor decomposition, low-rank, valence, arousal

1:01

16/11/2020

Multi-document Summarization with Maximal Marginal Relevance-guided Reinforcement Learning

Yuning Mao, Yanru Qu, Yiqing Xie, Xiang Ren, Jiawei Han

Keywords: single-document summarization, single-document sds, multi-document summarization, multi-document mds

10:58

18/07/2021

Self-Damaging Contrastive Learning

Ziyu Jiang, Tianlong Chen, Bobak Mortazavi, Zhangyang Wang

Keywords: Algorithms, Unsupervised Learning

5:10

20/07/2020

Gating creates slow modes and controls phase-space complexity in GRUs and LSTMs

Tankut Can, Kamesh Krishnamurthy, David J. Schwab

21:00

26/04/2020

Toward Evaluating Robustness of Deep Reinforcement Learning with Continuous Control

Tsui-Wei Weng, Krishnamurthy (Dj) Dvijotham, Jonathan Uesato, Kai Xiao, Sven Gowal, Robert Stanforth*, Pushmeet Kohli

Keywords: deep learning, reinforcement learning, robustness, adversarial examples

6:00

03/05/2021

A Panda? No, It's a Sloth: Slowdown Attacks on Adaptive Multi-Exit Neural Network Inference

Sanghyun Hong, Yigitcan Kaya, Ionut-Vlad Modoranu, Tudor Dumitras

Keywords: efficient inference, adversarial examples, input-adaptive multi-exit neural networks, Slowdown attacks

10:24

14/06/2020

Meta-Transfer Learning for Zero-Shot Super-Resolution

Jae Woong Soh, Sunwoo Cho, Nam Ik Cho

Keywords: zero-shot super-resolution, meta learning, transfer learning

0:59

06/12/2021

Continual Learning via Local Module Composition

Oleksiy Ostapenko, Pau Rodriguez, Massimo Caccia, Laurent Charlin

Keywords: continual learning, transfer learning

14:32

26/08/2020

A Nonparametric Off-Policy Policy Gradient

Samuele Tosatto, Joao Carvalho, Hany Abdulsamad, Jan Peters

12:19

06/12/2020

Improved Analysis of Clipping Algorithms for Non-convex Optimization

Bohang Zhang, Jikai Jin, Cong Fang, Liwei Wang

3:16

06/12/2020

The NetHack Learning Environment

Heinrich Küttler, Nantas Nardelli, Alexander Miller, Roberta Raileanu, Marco Selvatici, Edward Grefenstette, Tim Rocktäschel

3:14

02/02/2021

High Dimensional Level Set Estimation with Bayesian Neural Network

Huong Ha, Sunil Gupta, Santu Rana, Svetha Venkatesh

19:14

06/12/2020

Goal-directed Generation of Discrete Structures with Conditional Generative Models

Amina Mollaysa, Brooks Paige, Alexandros Kalousis

3:10

02/02/2021

DIBS: Diversity Inducing Information Bottleneck in Model Ensembles

Samarth Sinha, Homanga Bharadhwaj, Anirudh Goyal, Hugo Larochelle, Animesh Garg, Florian Shkurti

16:26

26/04/2020

Can gradient clipping mitigate label noise?

Aditya Krishna Menon, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar

4:56

06/12/2020

Convolutional Tensor-Train LSTM for Spatio-Temporal Learning

Jiahao Su, Wonmin Byeon, Jean Kossaifi, Furong Huang, Jan Kautz, Anima Anandkumar

3:29

06/12/2021

Network-to-Network Regularization: Enforcing Occam's Razor to Improve Generalization

Rohan Ghosh, Mehul Motani

Keywords: theory, deep learning, machine learning

14:07

18/07/2021

Meta-Cal: Well-controlled Post-hoc Calibration by Ranking

Xingchen Ma, Matthew B Blaschko

Keywords: Algorithms, Supervised Learning

4:28

05/04/2021

RL-Scope: Cross-stack Profiling for Deep Reinforcement Learning Workloads

James Gleeson, Sri Krishnan, Moshe Gabel, Vijay Janapa Reddi, Eyal de Lara, Gennady Pekhimenko

5:17
