Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification

06/12/2021

Ben Eysenbach, Sergey Levine, Russ Salakhutdinov

Keywords: reinforcement learning and planning, machine learning

Abstract: Reinforcement learning (RL) algorithms assume that users specify tasks by manually writing down a reward function. However, this process can be laborious and demands considerable technical expertise. Can we devise RL algorithms that instead enable users to specify tasks simply by providing examples of successful outcomes? In this paper, we derive a control algorithm that maximizes the future probability of these successful outcome examples. Prior work has approached similar problems with a two-stage process, first learning a reward function and then optimizing this reward function using another reinforcement learning algorithm. In contrast, our method directly learns a value function from transitions and successful outcomes, without learning this intermediate reward function. Our method therefore requires fewer hyperparameters to tune and lines of code to debug. We show that our method satisfies a new data-driven Bellman equation, where examples take the place of the typical reward function term. Experiments show that our approach outperforms prior methods that learn explicit reward functions.

The talk and the corresponding paper were published at the NeurIPS 2021 virtual conference.

Similar Papers

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

11:27

03/05/2021

Learning to Reach Goals via Iterated Supervised Learning

Dibya Ghosh, Abhishek Gupta, Ashwin D Reddy, Justin Fu, Coline M Devin, Ben Eysenbach, Sergey Levine

Keywords: goal reaching, reinforcement learning, goal-conditioned RL, behavior cloning

15:19

18/07/2021

Training Data Subset Selection for Regression with Controlled Generalization Error

Durga S, Rishabh Iyer, Ganesh Ramakrishnan, Abir De

Keywords: Algorithms, Online Learning, Supervised Learning

4:15

06/12/2020

DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction

Aviral Kumar, Abhishek Gupta, Sergey Levine

3:25

02/02/2021

Theoretically Principled Deep RL Acceleration via Nearest Neighbor Function Approximation

Junhong Shen, Lin F. Yang

19:12

06/12/2021

Continual Learning via Local Module Composition

Oleksiy Ostapenko, Pau Rodriguez, Massimo Caccia, Laurent Charlin

Keywords: continual learning, transfer learning

14:32

06/12/2020

Bayesian Optimization for Iterative Learning

Vu Nguyen, Sebastian Schulze, Michael A Osborne

3:19

04/07/2020

Learning to Faithfully Rationalize by Construction

Sarthak Jain, Sarah Wiegreffe, Yuval Pinter, Byron C. Wallace

Keywords: NLP, neural classification, training, automatic evaluations

11:55

26/04/2020

Variational Recurrent Models for Solving Partially Observable Control Tasks

Dongqi Han, Kenji Doya, Jun Tani

Keywords: Reinforcement Learning, Deep Learning, Variational Inference, Recurrent Neural Network, Partially Observable, Robotic Control, Continuous Control

4:59

26/04/2020

CAQL: Continuous Action Q-Learning

Moonkyung Ryu, Yinlam Chow, Ross Anderson, Christian Tjandraatmadja, Craig Boutilier

Keywords: Reinforcement learning (RL), DQN, Continuous control, Mixed-Integer Programming (MIP)

5:36

06/12/2020

Adaptive Discretization for Model-Based Reinforcement Learning

Sean Sinclair, Tianyu Wang, Gauri Jain, Sid Banerjee, Christina Yu

3:12

03/05/2021

Few-Shot Bayesian Optimization with Deep Kernel Surrogates

Martin Wistuba, Josif Grabocka

Keywords: automl, bayesian optimization, metalearning, few-shot learning

5:18

12/07/2020

Sequential Transfer in Reinforcement Learning with a Generative Model

Andrea Tirinzoni, Riccardo Poiani, Marcello Restelli

Keywords: Reinforcement Learning - General

10:54

06/12/2021

Autonomous Reinforcement Learning via Subgoal Curricula

Archit Sharma, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn

Keywords: reinforcement learning and planning

14:09

26/04/2020

Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery

Kristian Hartikainen, Xinyang Geng, Tuomas Haarnoja, Sergey Levine

Keywords: reinforcement learning, semi-supervised learning, unsupervised learning, robotics, deep learning

5:07

06/12/2020

Auxiliary Task Reweighting for Minimum-data Learning

Baifeng Shi, Judy Hoffman, Kate Saenko, Trevor Darrell, Huijuan Xu

3:28

03/05/2021

Discovering Non-monotonic Autoregressive Orderings with Variational Inference

Xuanlin Li, Brandon Trabucco, Dong Huk Park, Michael Luo, Sheng Shen, Trevor Darrell, Yang Gao

Keywords: reinforcement learning, computer vision, natural language processing, optimization, variational inference, unsupervised learning

4:56

18/07/2021

Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment

Michael Chang, Sid Kaushik, Sergey Levine, Thomas Griffiths

Keywords: Algorithms, Multitask, Transfer, and Meta Learning

16:45

06/12/2021

When False Positive is Intolerant: End-to-End Optimization with Low FPR for Multipartite Ranking

Peisong Wen, Qianqian Xu, Zhiyong Yang, Yuan He, Qingming Huang

Keywords: deep learning, optimization, machine learning, vision

7:00

26/04/2020

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning

Keywords: Natural Language Processing, Representation Learning

5:12

18/07/2021

Linear Transformers Are Secretly Fast Weight Programmers

Imanol Schlag, Kazuki Irie, Jürgen Schmidhuber

Keywords: Deep Learning

5:18

06/12/2021

Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces

Kirill Struminsky, Artyom Gadetsky, Denis Rakitin, Danil Karpushkin, Dmitry Vetrov

Keywords: deep learning, optimization

14:26

18/07/2021

Monotonic Robust Policy Optimization with Model Discrepancy

Yuankun Jiang, Chenglin Li, Wenrui Dai, Junni Zou, Hongkai Xiong

Keywords: Reinforcement Learning and Planning

5:17

03/05/2021

Hierarchical Reinforcement Learning by Discovering Intrinsic Options

Jesse Zhang, Haonan Yu, Wei Xu

Keywords: reinforcement learning, unsupervised skill discovery, exploration, options, hierarchical reinforcement learning

4:58

02/02/2021

Learning by Fixing: Solving Math Word Problems with Weak Supervision

Yining Hong, Qing Li, Daniel Ciao, Siyuan Huang, Song-Chun Zhu

13:50

03/05/2021

Learning to Make Decisions via Submodular Regularization

Ayya Alieva, Aiden Aceves, Jialin Song, Stephen Mayo, Yisong Yue, Yuxin Chen

5:53

03/05/2021

FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Lanqing Li, Rui Yang, Dijun Luo

Keywords: distance metric learning, offline/batch reinforcement learning, meta-reinforcement learning, contrastive learning, multi-task reinforcement learning

6:21

06/12/2021

Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability

Dibya Ghosh, Jad Rahme, Aviral Kumar, Amy Zhang, Ryan Adams, Sergey Levine

Keywords: reinforcement learning and planning

15:17

18/07/2021

MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning

Kevin Li, Abhishek Gupta, Ashwin D Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine

Keywords: Reinforcement Learning and Planning, Deep RL

5:16

18/07/2021

Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization

Sang Michael Xie, Tengyu Ma, Percy Liang

Keywords: Algorithms, Multitask, Transfer, and Meta Learning

22:15

26/04/2020

Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML

Aniruddh Raghu, Maithra Raghu, Samy Bengio, Oriol Vinyals

Keywords: deep learning analysis, representation learning, meta-learning, few-shot learning

5:25

13/04/2021

Online model selection for reinforcement learning with function approximation

Jonathan Lee, Aldo Pacchiano, Vidya Muthukumar, Weihao Kong, Emma Brunskill

3:15

12/07/2020

Educating Text Autoencoders: Latent Representation Guidance via Denoising

Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola

Keywords: Deep Learning - Generative Models and Autoencoders

17:06

15/11/2020

Program Equivalence for Assisted Grading of Functional Programs

Joshua Clune, Vijay Ramamurthy, Ruben Martins, Umut A. Acar

Keywords: Functional Programming, Program Equivalence, Assisted Grading, Formal Methods

15:41

02/02/2021

Temporal-Logic-Based Reward Shaping for Continuing Reinforcement Learning Tasks

Yuqian Jiang, Suda Bharadwaj, Bo Wu, Rishi Shah, Ufuk Topcu, Peter Stone

15:40

03/05/2021

Revisiting Locally Supervised Learning: an Alternative to End-to-end Training

Yulin Wang, Zanlin Ni, Shiji Song, Le Yang, Gao Huang

Keywords: Deep learning, Locally supervised training

5:03

26/04/2020

Neural Text Generation With Unlikelihood Training

Sean Welleck, Ilia Kulikov, Stephen Roller, Emily Dinan, Kyunghyun Cho, Jason Weston

Keywords: language modeling, machine learning

4:20

23/06/2021

Execution Reconstruction: Harnessing Failure Reoccurrences for Failure Reproduction

Gefei Zuo, Jiacheng Ma, Andrew Quinn, Pramod Bhatotia, Pedro Fonseca, Baris Kasikci

Keywords: debugging, symbolic execution

19:28

06/12/2020

Dynamic allocation of limited memory resources in reinforcement learning

Nisheet Patel, Luigi Acerbi, Alexandre Pouget

3:19

18/07/2021

Improved Corruption Robust Algorithms for Episodic Reinforcement Learning

Yifang Chen, Simon Du, Kevin Jamieson

Keywords: Optimization, Non-Convex Optimization, Theory, Online Learning Theory

5:20