Stochastically Dominant Distributional Reinforcement Learning

12/07/2020

John Martin, Michal Lyskawinski, Xiaohu Li, Brendan Englot

Keywords: Trustworthy Machine Learning

Abstract: We describe a new approach for managing aleatoric uncertainty in the Reinforcement Learning paradigm. Instead of selecting actions according to a single statistic, we propose a distributional method based on the second-order stochastic dominance (SSD) relation. This compares the inherent dispersion of random returns induced by actions, producing a more comprehensive and robust evaluation of the environment's uncertainty. The necessary conditions for SSD require estimators to predict quality second moments. To accommodate this, we map the distributional RL problem to a Wasserstein gradient flow, treating the distributional Bellman residual as a potential energy functional. We propose a particle-based algorithm for which we prove optimality and convergence. Our experiments characterize the algorithm performance and demonstrate how uncertainty and performance are better balanced using SSD action selection than with other risk measures.
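To make the SSD relation in the abstract concrete, here is a minimal sketch of an SSD check between two finite sets of return samples. It is not the authors' particle-based algorithm. It assumes higher returns are better and uses the standard characterization that X dominates Y in the second order when E[(t - X)_+] <= E[(t - Y)_+] for every threshold t. The function names `lower_partial_moment` and `ssd_dominates` and the tolerance `tol` are hypothetical and chosen for illustration only.

```python
import numpy as np


def lower_partial_moment(samples, thresholds):
    """Return E[(t - X)_+] for each threshold t under the empirical law of `samples`."""
    samples = np.asarray(samples, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    # Rows index thresholds, columns index samples.
    shortfall = np.maximum(thresholds[:, None] - samples[None, :], 0.0)
    return shortfall.mean(axis=1)


def ssd_dominates(x, y, tol=1e-12):
    """True if the empirical distribution of x weakly second-order dominates that of y.

    Both curves are piecewise linear with kinks only at sample points, so
    comparing them on the union of the two sample sets is sufficient.
    """
    grid = np.union1d(x, y)
    return bool(np.all(lower_partial_moment(x, grid) <= lower_partial_moment(y, grid) + tol))


# Two hypothetical actions with the same mean return but different dispersion.
safe = [1.0, 1.0]
risky = [0.0, 2.0]
print(ssd_dominates(safe, risky))  # True: same mean, less dispersion
print(ssd_dominates(risky, safe))  # False
```

SSD is only a partial order: with `safe = [1.0, 1.0]` and `risky = [0.0, 5.0]`, neither set dominates the other, so an action-selection rule built on it needs a fallback for incomparable actions.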

The talk and the corresponding paper were published at the ICML 2020 virtual conference.

Similar Papers

06/12/2021

Variance-Aware Off-Policy Evaluation with Linear Function Approximation

Yifei Min, Tianhao Wang, Dongruo Zhou, Quanquan Gu

Keywords: theory, reinforcement learning and planning

12:17

06/12/2021

Loss function based second-order Jensen inequality and its application to particle variational inference

Futoshi Futami, Tomoharu Iwata, Naonori Ueda, Issei Sato, Masashi Sugiyama

Keywords: optimization, generative model

14:09

12/07/2020

Minimax Weight and Q-Function Learning for Off-Policy Evaluation

Masatoshi Uehara, Jiawei Huang, Nan Jiang

Keywords: Reinforcement Learning - Theory

14:20

06/12/2021

Inverse Optimal Control Adapted to the Noise Characteristics of the Human Sensorimotor System

Matthias Schultheis, Dominik Straub, Constantin Rothkopf

9:29

18/07/2021

Annealed Flow Transport Monte Carlo

Michael Arbel, Alexander Matthews, Arnaud Doucet

Keywords: Probabilistic Methods, Monte Carlo Methods

17:28

06/12/2020

On Learning Ising Models under Huber's Contamination Model

Adarsh Prasad, Vishwak Srinivasan, Sivaraman Balakrishnan, Pradeep Ravikumar

3:16

06/12/2020

Deep Rao-Blackwellised Particle Filters for Time Series Forecasting

Richard Kurle, Syama Sundar Rangapuram, Emmanuel de Bézenac, Stephan Günnemann, Jan Gasthaus

3:14

06/12/2020

Quantized Variational Inference

Amir Dib

2:28

19/08/2021

Variational Model-based Policy Optimization

Yinlam Chow, Brandon Cui, Moonkyung Ryu, Mohammad Ghavamzadeh

Keywords: Machine Learning, Reinforcement Learning

15:31

26/04/2020

Posterior sampling for multi-agent reinforcement learning: solving extensive games with imperfect information

Yichi Zhou, Jialian Li, Jun Zhu

12:55

13/04/2021

Learning prediction intervals for regression: Generalization and calibration

Haoxian Chen, Ziyi Huang, Henry Lam, Huajie Qian, Haofeng Zhang

3:26

06/12/2020

Minimax Estimation of Conditional Moment Models

Nishanth Dikkala, Greg Lewis, Lester Mackey, Vasilis Syrgkanis

3:04

06/12/2021

Structured Dropout Variational Inference for Bayesian Neural Networks

Son Nguyen, Duong Nguyen, Khai Nguyen, Khoat Than, Hung Bui, Nhat Ho

Keywords: deep learning, generative model

11:28

06/12/2020

AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity

Silviu-Marian Udrescu, Andrew Tan, Jiahai Feng, Orisvaldo Neto, Tailin Wu, Max Tegmark

3:13

06/12/2021

Control Variates for Slate Off-Policy Evaluation

Nikos Vlassis, Ashok Chandrashekar, Fernando Amat, Nathan Kallus

Keywords: optimization, bandits

12:25

26/04/2020

Frequency-based Search-control in Dyna

Yangchen Pan, Jincheng Mei, Amir-massoud Farahmand

Keywords: Model-based reinforcement learning, search-control, Dyna, frequency of a signal

4:32

18/07/2021

Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models

Fan Bao, Taufik Xu, Chongxuan Li, Lanqing Hong, Jun Zhu, Bo Zhang

Keywords: Deep Learning, Applications, Computer Vision, Algorithms, Image Segmentation; Algorithms, Similarity and Distance Learning; Algorithms, Spectral Methods; Applications

4:42

12/07/2020

Learning to Score Behaviors for Guided Policy Optimization

Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Krzysztof Choromanski, Anna Choromanska, Michael Jordan

Keywords: Reinforcement Learning - General

14:10

03/05/2021

C-Learning: Learning to Achieve Goals via Recursive Classification

Ben Eysenbach, Ruslan Salakhutdinov, Sergey Levine

Keywords: reinforcement learning, goal reaching, density estimation, hindsight relabeling, Q-learning

5:09

16/11/2020

Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity

Tanmay Gangwani, Jian Peng, Yuan Zhou

4:27

13/04/2021

Logistic q-learning

Joan Bas-Serrano, Sebastian Curi, Andreas Krause, Gergely Neu

2:44

18/07/2021

Autoregressive Denoising Diffusion Models for Multivariate Probabilistic Time Series Forecasting

Kashif Rasul, Calvin Seward, Ingmar Schuster, Roland Vollgraf

Keywords: Algorithms, Time Series and Sequences

5:46

03/05/2021

When does preconditioning help or hurt generalization?

Shun-ichi Amari, Jimmy Ba, Roger Grosse, Chen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu

Keywords: high-dimensional asymptotics, generalization, second-order optimization, natural gradient descent

5:21

06/12/2021

Local policy search with Bayesian optimization

Sarah Müller, Alexander von Rohr, Sebastian Trimpe

Keywords: theory, optimization, reinforcement learning and planning, active learning

11:42

18/07/2021

Principled Exploration via Optimistic Bootstrapping and Backward Induction

Chenjia Bai, Lingxiao Wang, Lei Han, Jianye Hao, Animesh Garg, Peng Liu, Zhaoran Wang

Keywords: Reinforcement Learning and Planning

5:18

03/05/2021

Set Prediction without Imposing Structure as Conditional Density Estimation

David W Zhang, Gertjan J Burghouts, Cees G Snoek

Keywords: energy based models, set prediction

5:02

06/12/2021

Model Selection for Bayesian Autoencoders

Ba-Hien Tran, Simone Rossi, Dimitrios Milios, Pietro Michiardi, Edwin Bonilla, Maurizio Filippone

Keywords: optimization, self-supervised learning, generative model, representation learning

10:49

06/12/2021

Risk-Averse Bayes-Adaptive Reinforcement Learning

Marc Rigter, Bruno Lacerda, Nick Hawes

Keywords: reinforcement learning and planning

14:27

12/07/2020

Sequential Transfer in Reinforcement Learning with a Generative Model

Andrea Tirinzoni, Riccardo Poiani, Marcello Restelli

Keywords: Reinforcement Learning - General

10:54

06/12/2021

Explicable Reward Design for Reinforcement Learning Agents

Rati Devidze, Goran Radanovic, Parameswaran Kamalaruban, Adish Singla

Keywords: optimization, reinforcement learning and planning, interpretability

4:10

09/07/2020

On the Multiple Descent of Minimum-Norm Interpolants and Restricted Lower Isometry of Kernels

Tengyuan Liang, Alexander Rakhlin, Xiyu Zhai

Keywords: Supervised learning, Excess risk bounds and generalization error bounds, High-dimensional statistics, Kernel methods, Regression

14:56

03/05/2021

Neural Jump Ordinary Differential Equations: Consistent Continuous-Time Prediction and Filtering

Calypso Herrera, Florian Krach, Josef Teichmann

Keywords: irregular-observed data modelling, conditional expectation, Neural ODE

3:50

26/08/2020

Finite-Time Error Bounds for Biased Stochastic Approximation with Applications to Q-Learning

Gang Wang, Georgios B. Giannakis

14:03

18/07/2021

Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient

Botao Hao, Yaqi Duan, Tor Lattimore, Csaba Szepesvari, Mengdi Wang

Keywords: Theory, Statistical Learning Theory

5:20

06/12/2021

Continuous Latent Process Flows

Ruizhi Deng, Marcus Brubaker, Greg Mori, Andreas M Lehrmann

Keywords: generative model

14:54

06/12/2021

Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees

Gregory Dexter, Kevin Bello, Jean Honorio

Keywords: theory, reinforcement learning and planning

14:49

06/12/2021

Robustness between the worst and average case

Leslie Rice, Anna Bair, Huan Zhang, J. Zico Kolter

Keywords: machine learning, robustness, adversarial robustness and security, generative model

10:46

02/02/2021

Variance Penalized On-Policy and Off-Policy Actor-Critic

Arushi Jain, Gandharv Patil, Ayush Jain, Khimya Khetarpal, Doina Precup

17:58

26/08/2020

Kernel Conditional Density Operators

Ingmar Schuster, Mattes Mollenhauer, Stefan Klus, Krikamol Muandet

14:59

06/12/2021

Collaborative Uncertainty in Multi-Agent Trajectory Forecasting

Bohan Tang, Yiqi Zhong, Ulrich Neumann, Gang Wang, Siheng Chen, Ya Zhang

Keywords: deep learning

7:15