12/07/2020

Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences

Daniel Brown, Scott Niekum, Russell Coleman, Ravi Srinivasan

Keywords: Reinforcement Learning - Deep RL

Abstract: Bayesian reward learning from demonstrations enables rigorous safety and uncertainty analysis when performing imitation learning. However, Bayesian reward learning methods are typically computationally intractable for complex control problems. We propose a highly efficient Bayesian reward learning algorithm that scales to high-dimensional imitation learning problems by first pre-training a low-dimensional feature encoding via self-supervised tasks and then leveraging preferences over demonstrations to perform fast Bayesian inference via sampling. We evaluate our proposed approach on the task of learning to play Atari games from demonstrations, without access to the game score, and achieve state-of-the-art imitation learning performance. Furthermore, we demonstrate that our approach enables efficient high-confidence performance bounds for any evaluation policy. We show that these high-confidence performance bounds can be used to accurately rank the performance and risk of a variety of different evaluation policies, despite not having samples of the true reward function.
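The abstract only summarizes the approach, but the core idea it describes (fast Bayesian inference via sampling over preferences, on top of a frozen, pre-trained feature encoding) can be sketched as a random-walk Metropolis sampler over linear reward weights with a Bradley-Terry preference likelihood. The function names, the pairwise likelihood form, and the unit-norm constraint below are illustrative assumptions for this sketch, not the authors' exact implementation.

```python
import numpy as np

def preference_log_likelihood(w, feats_i, feats_j, beta=1.0):
    # Bradley-Terry log-likelihood that trajectory j is preferred over trajectory i,
    # given linear reward weights w and pre-computed trajectory feature sums
    # (assumed to come from the frozen, self-supervised encoder).
    r_i = beta * feats_i @ w
    r_j = beta * feats_j @ w
    return r_j - np.logaddexp(r_i, r_j)

def mcmc_reward_posterior(pref_pairs, feat_dim, n_samples=2000, step=0.05, seed=0):
    """Random-walk Metropolis over unit-norm reward weights (illustrative sketch).

    pref_pairs: list of (feats_i, feats_j) where trajectory j is preferred over i.
    Returns an array of posterior weight samples, shape (n_samples, feat_dim).
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(size=feat_dim)
    w /= np.linalg.norm(w)
    log_post = sum(preference_log_likelihood(w, fi, fj) for fi, fj in pref_pairs)
    samples = []
    for _ in range(n_samples):
        proposal = w + step * rng.normal(size=feat_dim)
        proposal /= np.linalg.norm(proposal)  # keep weights on the unit sphere
        prop_log_post = sum(preference_log_likelihood(proposal, fi, fj)
                            for fi, fj in pref_pairs)
        if np.log(rng.random()) < prop_log_post - log_post:
            w, log_post = proposal, prop_log_post
        samples.append(w.copy())
    return np.array(samples)

def policy_return_bound(policy_feats, weight_samples, confidence=0.95):
    # Expected return of an evaluation policy under each posterior reward sample;
    # the (1 - confidence) quantile gives a high-confidence lower bound on performance.
    returns = weight_samples @ policy_feats
    return np.quantile(returns, 1.0 - confidence)
```

Under this sketch, the high-confidence performance bounds mentioned in the abstract would correspond to low quantiles of an evaluation policy's expected return across the posterior weight samples, which can then be used to rank policies by performance and risk.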


