Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision

18/07/2021

Johan Björck, Xiangyu Chen, Christopher De Sa, Carla Gomes, Kilian Weinberger

Keywords: Reinforcement Learning and Planning

Abstract: Low-precision training has become a popular approach to reduce compute requirements, memory footprint, and energy consumption in supervised learning. In contrast, this promising approach has not yet enjoyed similarly widespread adoption within the reinforcement learning (RL) community, partly because RL agents can be notoriously hard to train even in full precision. In this paper we consider continuous control with the state-of-the-art SAC agent and demonstrate that a naïve adaptation of low-precision methods from supervised learning fails. We propose a set of six modifications, all straightforward to implement, that leaves the underlying agent and its hyperparameters unchanged but improves the numerical stability dramatically. The resulting modified SAC agent has lower memory and compute requirements while matching full-precision rewards, demonstrating that low-precision training can substantially accelerate state-of-the-art RL without parameter tuning.

The talk and the corresponding paper were published at the ICML 2021 virtual conference.

Similar Papers

18/07/2021

Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning

Yue Wu, Shuangfei Zhai, Nitish Srivastava, Josh Susskind, Jian Zhang, Russ Salakhutdinov, Hanlin Goh

Keywords: Reinforcement Learning and Planning, Deep RL

5:01

12/07/2020

Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods

Dan Fu, Mayee Chen, Frederic Sala, Sarah Hooper, Kayvon Fatahalian, Christopher Re

Keywords: Probabilistic Inference - Models and Probabilistic Programming

15:01

18/07/2021

PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration

Yuda Song, Wen Sun

Keywords: Reinforcement Learning and Planning

5:13

06/12/2021

Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning

Yiqin Yang, Xiaoteng Ma, Li Chenghao, Zewu Zheng, Qiyuan Zhang, Gao Huang, Jun Yang, Qianchuan Zhao

Keywords: reinforcement learning and planning

13:36

20/07/2020

DP-LSSGD: A Stochastic Optimization Method to Lift the Utility in Privacy-Preserving ERM

Bao Wang, Quanquan Gu, March Boedihardjo, Lingxiao Wang, Farzin Barekat, Stanley J. Osher

17:42

03/05/2021

Deconstructing the Regularization of BatchNorm

Yann Dauphin, Ekin Cubuk

Keywords: understanding neural networks, batch normalization, regularization, deep learning

5:09

06/12/2020

Improving Generalization in Reinforcement Learning with Mixture Regularization

Kaixin Wang, Bingyi Kang, Jie Shao, Jiashi Feng

3:14

13/04/2021

On the generalization properties of adversarial training

Yue Xing, Qifan Song, Guang Cheng

3:05

18/07/2021

Active Testing: Sample-Efficient Model Evaluation

Jannik Kossen, Sebastian Farquhar, Yarin Gal, Tom Rainforth

Keywords: Algorithms, Active Learning

5:19

03/05/2021

Fuzzy Tiling Activations: A Simple Approach to Learning Sparse Representations Online

Yangchen Pan, Kirby Banman, Martha White

Keywords: natural sparsity, Reinforcement learning, fuzzy tiling activation function, sparse representation

6:22

06/12/2021

Towards Deeper Deep Reinforcement Learning with Spectral Normalization

Nils Bjorck, Carla Gomes, Kilian Weinberger

Keywords: reinforcement learning and planning, vision, language

9:28

18/07/2021

Offline Meta-Reinforcement Learning with Advantage Weighting

Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn

Keywords: Algorithms, Multitask, Transfer, and Meta Learning

5:08

02/02/2021

Few-Shot Lifelong Learning

Pratik Mazumder, Pravendra Singh, Piyush Rai

18:14

18/07/2021

Accurate Post Training Quantization With Small Calibration Sets

Itay Hubara, Yury Nahshan, Yair Hanani, Ron Banner, Daniel Soudry

Keywords: Algorithms, AutoML

5:16

06/12/2020

On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them

Chen Liu, Mathieu Salzmann, Tao Lin, Ryota Tomioka, Sabine Süsstrunk

Keywords: Algorithms -> Representation Learning, Applications -> Dialog- or Communication-Based Learning

3:29

06/12/2021

Regularized Softmax Deep Multi-Agent Q-Learning

Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson

Keywords: reinforcement learning and planning

10:58

06/12/2021

Statistically and Computationally Efficient Linear Meta-representation Learning

Kiran Thekumparampil, Prateek Jain, Praneeth Netrapalli, Sewoong Oh

Keywords: optimization, meta learning, representation learning, few shot learning

12:56

03/05/2021

Sharper Generalization Bounds for Learning with Gradient-dominated Objective Functions

Yunwen Lei, Yiming Ying

Keywords: generalization bounds, non-convex learning

5:09

06/12/2020

Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning

Jaehyung Kim, Youngbum Hur, Sejun Park, Eunho Yang, Sung Ju Hwang, Jinwoo Shin

3:21

12/07/2020

Structured Prediction with Partial Labelling through the Infimum Loss

Vivien Cabannes, Francis Bach, Alessandro Rudi

Keywords: Sequential, Network, and Time-Series Modeling

13:01

06/12/2021

IQ-Learn: Inverse soft-Q Learning for Imitation

Divyansh Garg, Shuvam Chakraborty, Chris Cundy, Jiaming Song, Stefano Ermon

Keywords: optimization, reinforcement learning and planning, adversarial robustness and security

8:25

18/07/2021

Whitening and Second Order Optimization Both Make Information in the Dataset Unusable During Training, and Can Reduce or Prevent Generalization

Neha Wadia, Daniel Duckworth, Samuel Schoenholz, Ethan Dyer, Jascha Sohl-Dickstein

Keywords: Optimization, Probabilistic Methods, Topic Models, Probabilistic Methods, Latent Variable Models

5:17

02/02/2021

Learning from Noisy Labels with Complementary Loss Functions

Deng-Bao Wang, Yong Wen, Lujia Pan, Min-Ling Zhang

14:00

12/07/2020

SCAFFOLD: Stochastic Controlled Averaging for Federated Learning

Sai Praneeth Reddy Karimireddy, Satyen Kale, Mehryar Mohri, Sashank Jakkam Reddi, Sebastian Stich, Ananda Theertha Suresh

Keywords: Optimization - Convex

14:57

03/05/2021

CPT: Efficient Deep Neural Network Training via Cyclic Precision

Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, Yingyan Lin

Keywords: low precision training, Efficient training

8:55

02/02/2021

Tempered Sigmoid Activations for Deep Learning with Differential Privacy

Nicolas Papernot, Abhradeep Thakurta, Shuang Song, Steve Chien, Úlfar Erlingsson

15:38

03/05/2021

RMSprop converges with proper hyper-parameter

Naichen Shi, Dawei Li, Mingyi Hong, Ruoyu Sun

Keywords: convergence, hyperparameter, RMSprop

10:12

06/12/2020

Curriculum Learning by Dynamic Instance Hardness

Tianyi Zhou, Shengjie Wang, Jeff A Bilmes

3:24

03/05/2021

Efficient Empowerment Estimation for Unsupervised Stabilization

Ruihan Zhao, Kevin Lu, Pieter Abbeel, Stas Tiomkin

Keywords: neural networks, empowerment, representation of dynamical systems, unsupervised stabilization, intrinsic motivation

5:11

06/12/2020

Temporal Variability in Implicit Online Learning

Nicolò Campolongo, Francesco Orabona

3:11

03/05/2021

Learning to Reach Goals via Iterated Supervised Learning

Dibya Ghosh, Abhishek Gupta, Ashwin D Reddy, Justin Fu, Coline M Devin, Ben Eysenbach, Sergey Levine

Keywords: goal reaching, reinforcement learning, goal-conditioned RL, behavior cloning

15:19

06/12/2021

SWAD: Domain Generalization by Seeking Flat Minima

Junbum Cha, Sanghyuk Chun, Kyungjae Lee, Han-Cheol Cho, Seunghyun Park, Yunsung Lee, Sungrae Park

Keywords: robustness, domain adaptation

11:44

06/12/2020

Memory Based Trajectory-conditioned Policies for Learning from Sparse Rewards

Yijie Guo, Jongwook Choi, Marcin Moczulski, Shengyu Feng, Samy Bengio, Mohammad Norouzi, Honglak Lee

3:30

03/05/2021

Class Normalization for (Continual)? Generalized Zero-Shot Learning

Ivan Skorokhodov, Mohamed Elhoseiny

Keywords: initialization, normalization, zero-shot learning, continual learning

4:45

06/12/2020

SuperLoss: A Generic Loss for Robust Curriculum Learning

Thibault Castells, Philippe Weinzaepfel, Jerome Revaud

Keywords: Probabilistic Methods -> MCMC

3:26

06/12/2020

Margins are Insufficient for Explaining Gradient Boosting

Allan Grønlund, Lior Kamma, Kasper Green Larsen

3:22

06/12/2020

Unsupervised Data Augmentation for Consistency Training

Qizhe Xie, Zihang Dai, Eduard Hovy, Thang Luong, Quoc V Le

3:29

06/12/2020

Top-k Training of GANs: Improving GAN Performance by Throwing Away Bad Samples

Samarth Sinha, Zhengli Zhao, Anirudh Goyal ALIAS PARTH GOYAL, Colin A Raffel, Augustus Odena

3:20

14/06/2020

Improved Few-Shot Visual Classification

Peyman Bateni, Raghav Goyal, Vaden Masrani, Frank Wood, Leonid Sigal

Keywords: meta-learning, few-shot classification, transfer learning, Mahalanobis metric, Bregman divergences

1:01

17/08/2020

Learning temporal coherence via self-supervision for GAN-based video generation

Mengyu Chu, You Xie, Jonas Mayer, Laura Leal-Taixé, Nils Thuerey

Keywords: self-supervision, temporal cycle-consistency, video super-resolution, generative adversarial network, unpaired video translation

16:59