PixL2R: Guiding Reinforcement Learning Using Natural Language by Mapping Pixels to Rewards

16/11/2020

PixL2R: Guiding Reinforcement Learning Using Natural Language by Mapping Pixels to Rewards

Prasoon Goyal, Scott Niekum, Raymond Mooney

Keywords:

Abstract Paper Code Similar Papers

Abstract: Reinforcement learning (RL), particularly in sparse reward settings, often requires prohibitively large numbers of interactions with the environment, thereby limiting its applicability to complex problems. To address this, several prior approaches have used natural language to guide the agent’s exploration. However, these approaches typically operate on structured representations of the environment, and/or assume some structure in the natural language commands. In this work, we propose a model that directly maps pixels to rewards, given a free-form natural language description of the task, which can then be used for policy learning. Our experiments on the Meta-World robot manipulation domain show that language-based rewards significantly improves the sample efficiency of policy learning, both in sparse and dense reward settings.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at CoRL 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

16/11/2020

Positive-Unlabeled Reward Learning

Danfei Xu, Misha Denil

Keywords Paper

0

0

0

0

5:04

19/08/2021

Don’t Do What Doesn’t Matter: Intrinsic Motivation with Action Usefulness

Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin

Keywords Paper

Machine Learning, Reinforcement Learning, Deep Reinforcement Learning

0

0

0

0

14:48

06/12/2021

Outcome-Driven Reinforcement Learning via Variational Inference

Tim G. J. Rudner, Vitchyr Pong, Rowan McAllister and
Yarin Gal, Sergey Levine

Keywords Paper

reinforcement learning and planning, generative model

0

0

0

0

12:21

06/12/2020

Effective Diversity in Population Based Reinforcement Learning

Jack Parker-Holder, Aldo Pacchiano, Krzysztof M Choromanski, Stephen J Roberts

Keywords Paper

0

0

0

0

3:23

18/07/2021

Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks

Sungryull Sohn, Sungtae Lee, Jongwook Choi and
Harm van Seijen, Mehdi Fatemi, Honglak Lee

Keywords Paper

Reinforcement Learning and Planning, Deep RL

0

0

0

0

5:19

06/12/2021

An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning

Tianpei Yang, Weixun Wang, Hongyao Tang and
Jianye Hao, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Yingfeng Chen, Yujing Hu, Changjie Fan, Chengwei Zhang

Keywords Paper

reinforcement learning and planning, transfer learning

0

0

0

0

15:21

06/12/2020

PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning

Alekh Agarwal, Mikael Henaff, Sham Kakade, Wen Sun

Keywords Paper

0

0

0

0

3:13

03/05/2021

Plan-Based Relaxed Reward Shaping for Goal-Directed Tasks

Ingmar Schubert, Oz Oguz, Marc Toussaint

Keywords Paper

reinforcement learning, robotics, robotic manipulation, plan-based reward shaping, reward shaping

0

0

0

0

4:38

06/12/2021

Local policy search with Bayesian optimization

Sarah Müller, Alexander von Rohr, Sebastian Trimpe

Keywords Paper

theory, optimization, reinforcement learning and planning, active learning

0

0

0

0

11:42

19/08/2021

Reward-Constrained Behavior Cloning

Zhaorong Wang, Meng Wang, Jingqi Zhang and
Yingfeng Chen, Chongjie Zhang

Keywords Paper

Machine Learning, Deep Reinforcement Learning, Reinforcement Learning, Constraint Optimization

0

0

0

0

14:43

06/12/2020

Cooperative Heterogeneous Deep Reinforcement Learning

Han Zheng, Pengfei Wei, Jing Jiang and
Guodong Long, Qinghua Lu, Chengqi Zhang

Keywords Paper

0

0

0

0

3:08

19/04/2021

Exploring supervised and unsupervised rewards in machine translation

Julia Ive, Zixu Wang, Marina Fomicheva, Lucia Specia

Keywords Paper

0

0

0

0

10:52

03/05/2021

Task-Agnostic Morphology Evolution

Donald Hejna III, Pieter Abbeel, Lerrel Pinto

Keywords Paper

evolution, morphology, empowerment, unsupervised, information theory

0

0

0

0

3:59

02/02/2021

Exploration by Maximizing Renyi Entropy for Reward-Free RL Framework

Chuheng Zhang, Yuanying Cai, Longbo Huang, Jian Li

Keywords Paper

0

0

0

0

16:03

06/12/2021

Contrastive Active Inference

Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt

Keywords Paper

theory, reinforcement learning and planning, generative model

0

0

0

0

10:46

06/12/2021

Towards Robust Bisimulation Metric Learning

Mete Kemertas, Tristan Aumentado-Armstrong

Keywords Paper

reinforcement learning and planning, robustness, representation learning

0

0

0

0

12:24

06/12/2021

Visual Adversarial Imitation Learning using Variational Models

Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn

Keywords Paper

theory, reinforcement learning and planning, adversarial robustness and security, representation learning

0

0

0

0

7:25

16/11/2020

DORB: Dynamically Optimizing Multiple Rewards with Bandits

Ramakanth Pasunuru, Han Guo, Mohit Bansal

Keywords Paper

language tasks, optimization rewards, nlg tasks, question generation

0

0

0

0

11:34

06/12/2020

Learning Guidance Rewards with Trajectory-space Smoothing

Tanmay Gangwani, Yuan Zhou, Jian Peng

Keywords Paper

0

0

0

0

3:16

26/08/2020

Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning

Tianyu Li, Bogdan Mazoure, Doina Precup, Guillaume Rabusseau

Keywords Paper

0

0

0

0

13:49

02/02/2021

Advice-Guided Reinforcement Learning in a non-Markovian Environment

Daniel Neider, Jean-Raphael Gaglione, Ivan Gavran and
Ufuk Topcu, Bo Wu, Zhe Xu

Keywords Paper

0

0

0

0

18:07

26/08/2020

Nested-Wasserstein Self-Imitation Learning for Sequence Generation

Ruiyi Zhang, Changyou Chen, Zhe Gan and
Zheng Wen, Wenlin Wang, Lawrence Carin

Keywords Paper

0

0

0

0

11:18

02/02/2021

Theoretically Principled Deep RL Acceleration via Nearest Neighbor Function Approximation

Junhong Shen, Lin F. Yang

Keywords Paper

0

0

0

0

19:12

12/07/2020

Optimizing Multiagent Cooperation via Policy Evolution and Shared Experiences

Somdeb Majumdar, Shauharda Khadka, Santiago Miret and
Stephen Mcaleer, Kagan Tumer

Keywords Paper

Reinforcement Learning - Deep RL

0

0

0

0

15:53

16/11/2020

Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments

Jun Yamada, Youngwoon Lee, Gautam Salhotra and
Karl Pertsch, Max Pflueger, Gaurav Sukhatme, Joseph Lim, Peter Englert

Keywords Paper

0

0

0

0

4:59

06/12/2021

Curriculum Offline Imitating Learning

Minghuan Liu, Hanye Zhao, Zhengyu Yang and
Jian Shen, Weinan Zhang, Li Zhao, Tie-Yan Liu

Keywords Paper

reinforcement learning and planning

0

0

0

0

14:28

26/04/2020

Keep Doing What Worked: Behavior Modelling Priors for Offline Reinforcement Learning

Noah Siegel, Jost Tobias Springenberg, Felix Berkenkamp and
Abbas Abdolmaleki, Michael Neunert, Thomas Lampe, Roland Hafner, Nicolas Heess, Martin Riedmiller

Keywords Paper

Reinforcement Learning, Off-policy, Multitask, Continuous Control

0

0

0

0

5:04

03/05/2021

OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning

Anurag Ajay, Aviral Kumar, Pulkit Agrawal and
Sergey Levine, Ofir Nachum

Keywords Paper

Unsupervised Learning, Offline Reinforcement Learning, Primitive Discovery

0

0

0

0

5:08

26/08/2020

A Nonparametric Off-Policy Policy Gradient

Samuele Tosatto, Joao Carvalho, Hany Abdulsamad, Jan Peters

Keywords Paper

0

0

0

0

12:19

03/05/2021

Mutual Information State Intrinsic Control

Rui Zhao, Yang Gao, Pieter Abbeel and
Volker Tresp, Wei Xu

Keywords Paper

Intrinsic Motivation, Intrinsic Reward, Intrinsically Motivated Reinforcement Learning, Deep Reinforcement Learning, Reinforcement Learning

0

0

0

0

9:55

26/04/2020

RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments

Roberta Raileanu, Tim Rocktäschel

Keywords Paper

reinforcement learning, exploration, curiosity

0

0

0

0

4:48

18/07/2021

Decoupling Value and Policy for Generalization in Reinforcement Learning

Roberta Raileanu, Rob Fergus

Keywords Paper

Theory, Learning Theory, Theory, Large Deviations and Asymptotic Analysis, Reinforcement Learning and Planning, Deep RL

0

0

0

0

16:35

12/07/2020

Flexible and Efficient Long-Range Planning Through Curious Exploration

Aidan Curtis, Minjian Xin, Dilip Arumugam and
Kevin Feigelis, Daniel Yamins

Keywords Paper

Reinforcement Learning - General

0

0

0

0

16:25

16/11/2020

Contrastive Variational Reinforcement Learning for Complex Observations

Xiao Ma, SIWEI CHEN, David Hsu, Wee Sun Lee

Keywords Paper

0

0

0

0

5:03

03/05/2021

Learning What To Do by Simulating the Past

David Lindner, Rohin Shah, Pieter Abbeel, Anca Dragan

Keywords Paper

reinforcement learning, imitation learning, reward learning

0

0

0

0

5:09

06/12/2021

Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies

Tim Seyde, Igor Gilitschenski, Wilko Schwarting and
Bartolomeo Stellato, Martin Riedmiller, Markus Wulfmeier, Daniela Rus

Keywords Paper

reinforcement learning and planning

0

0

0

0

6:48

02/02/2021

Adaptive Prior-Dependent Correction Enhanced Reinforcement Learning for Natural Language Generation

Wei Cheng, Ziyan Luo, Qiyue Yin

Keywords Paper

0

0

0

0

13:53

02/02/2021

Addressing Action Oscillations through Learning Policy Inertia

Chen Chen, Hongyao Tang, Jianye Hao and
Wulong Liu, Zhaopeng Meng

Keywords Paper

0

0

0

0

14:57

26/04/2020

Toward Evaluating Robustness of Deep Reinforcement Learning with Continuous Control

Tsui-Wei Weng, Krishnamurthy (Dj) Dvijotham, Jonathan Uesato and
Kai Xiao, Sven Gowal, Robert Stanforth*, Pushmeet Kohli

Keywords Paper

deep learning, reinforcement learning, robustness, adversarial examples

0

0

0

0

6:00

06/12/2020

The Value Equivalence Principle for Model-Based Reinforcement Learning

Christopher Grimm, Andre Barreto, Satinder Singh, David Silver

Keywords Paper

0

0

0

0

3:19