Learning from Suboptimal Demonstration via Self-Supervised Reward Regression

16/11/2020

Letian Chen, Rohan Paleja, Matthew Gombolay

Abstract: Learning from Demonstration (LfD) seeks to democratize robotics by enabling non-roboticist end-users to teach robots to perform a task by providing a human demonstration. However, modern LfD techniques, e.g. inverse reinforcement learning (IRL), assume users provide at least stochastically optimal demonstrations. This assumption fails to hold in most real-world scenarios. Recent attempts to learn from sub-optimal demonstration leverage pairwise rankings and follow the Luce-Shepard rule. However, we show these approaches make incorrect assumptions and thus suffer from brittle, degraded performance. We overcome these limitations by developing a novel approach that bootstraps off suboptimal demonstrations to synthesize optimality-parameterized data to train an idealized reward function. We empirically validate that we learn an idealized reward function with ~0.95 correlation with ground-truth reward versus ~0.75 for prior work. We can then train policies achieving ~200% improvement over the suboptimal demonstration and ~90% improvement over prior work. We present a physical demonstration of teaching a robot a topspin strike in table tennis that achieves 32% faster returns and 40% more topspin than the user demonstration.
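
The Luce-Shepard rule mentioned in the abstract models the probability of preferring one option over another as a softmax over their scores. The sketch below is a minimal illustration in Python, assuming each trajectory is summarized by a scalar return. The function names `luce_shepard_preference` and `pairwise_ranking_loss` are hypothetical and are not taken from the paper or its code.

```python
import math


def luce_shepard_preference(return_a: float, return_b: float) -> float:
    """Probability that trajectory A is preferred over trajectory B.

    Assumes the Luce-Shepard choice rule: P(A > B) = exp(R_A) / (exp(R_A) + exp(R_B)).
    """
    # Subtract the larger return before exponentiating for numerical stability.
    shift = max(return_a, return_b)
    exp_a = math.exp(return_a - shift)
    exp_b = math.exp(return_b - shift)
    return exp_a / (exp_a + exp_b)


def pairwise_ranking_loss(return_preferred: float, return_other: float) -> float:
    """Negative log-likelihood of the observed ranking under the rule above."""
    return -math.log(luce_shepard_preference(return_preferred, return_other))


if __name__ == "__main__":
    # Illustrative returns only; these numbers are not from the paper.
    print(luce_shepard_preference(2.0, 1.0))
    print(pairwise_ranking_loss(2.0, 1.0))
```

Under this rule, equal returns give a preference probability of 0.5, and the loss shrinks as the preferred trajectory's return grows relative to the other.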

The talk and the respective paper are published at the CoRL 2020 virtual conference.

Similar Papers

16/11/2020

ACNMP: Skill Transfer and Task Extrapolation through Learning from Demonstration and Reinforcement Learning via Representation Sharing

Mete Akbulut, Erhan Oztop, Muhammet Yunus Seker, Hh X, Ahmet Tekden, Emre Ugur

5:03

16/11/2020

Towards General and Autonomous Learning of Core Skills: A Case Study in Locomotion

Roland Hafner, Tim Hertweck, Philipp Kloeppner, Michael Bloesch, Michael Neunert, Markus Wulfmeier, Saran Tunyasuvunakool, Nicolas Heess, Martin Riedmiller

5:24

16/11/2020

Learning from Demonstrations using Signal Temporal Logic

Aniruddh Puranic, Jyotirmoy Deshmukh, Stefanos Nikolaidis

4:52

16/11/2020

Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning

Ryan Julian, Benjamin Swanson, Gaurav Sukhatme, Sergey Levine, Chelsea Finn, Karol Hausman

5:47

03/05/2021

Model-Based Visual Planning with Self-Supervised Functional Distances

Stephen Tian, Suraj Nair, Frederik Ebert, Sudeep Dasari, Ben Eysenbach, Chelsea Finn, Sergey Levine

Keywords: reinforcement learning, distance learning, model learning, robotics, planning

9:11

26/04/2020

Keep Doing What Worked: Behavior Modelling Priors for Offline Reinforcement Learning

Noah Siegel, Jost Tobias Springenberg, Felix Berkenkamp, Abbas Abdolmaleki, Michael Neunert, Thomas Lampe, Roland Hafner, Nicolas Heess, Martin Riedmiller

Keywords: Reinforcement Learning, Off-policy, Multitask, Continuous Control

5:04

06/12/2021

Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives

Murtaza Dalal, Deepak Pathak, Russ Salakhutdinov

Keywords: optimization, reinforcement learning and planning

10:01

16/11/2020

Transformers for One-Shot Visual Imitation

Sudeep Dasari, Abhinav Gupta

5:06

06/12/2021

Evolution Gym: A Large-Scale Benchmark for Evolving Soft Robots

Jagdeep Bhatia, Holly Jackson, Yunsheng Tian, Jie Xu, Wojciech Matusik

Keywords: optimization, reinforcement learning and planning, machine learning

13:48

06/12/2021

Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality

Songyuan Zhang, Zhangjie Cao, Dorsa Sadigh, Yanan Sui

Keywords: reinforcement learning and planning

13:50

16/11/2020

Visual Imitation Made Easy

Sarah Young, Dhiraj Gandhi, Shubham Tulsiani, Abhinav Gupta, Pieter Abbeel, Lerrel Pinto

5:06

02/02/2021

Enabling Fast Instruction-Based Modification of Learned Robot Skills

Tyler Frasca, Bradley Oosterveld, Meia Chita-Tegmark, Matthias Scheutz

14:52

03/05/2021

Extracting Strong Policies for Robotics Tasks from Zero-Order Trajectory Optimizers

Cristina Pinneri, Shambhuraj Sawant, Sebastian Blaes, Georg Martius

Keywords: policy learning, zero-order optimization, reinforcement learning, model predictive control, robotics, model-based learning

5:09

18/07/2021

Augmented World Models Facilitate Zero-Shot Dynamics Generalization From a Single Offline Environment

Philip Ball, Cong Lu, Jack Parker-Holder, Stephen Roberts

Keywords: Reinforcement Learning and Planning

5:35

16/11/2020

Learning Physical Common Sense as Knowledge Graph Completion via BERT Data Augmentation and Constrained Tucker Factorization

Zhenjie Zhao, Evangelos Papalexakis, Xiaojuan Ma

Keywords: human-robot interaction, physical learning, natural processing, model generalization

6:42

18/07/2021

RRL: Resnet as representation for Reinforcement Learning

Rutav Shah, Vikash Kumar

Keywords: Applications, Applications, Computer Vision; Deep Learning, Deep Autoencoders; Deep Learning, Generative Models; Probabilistic Methods, Reinforcement Learning and Planning, Deep RL

5:13

16/11/2020

Chaining Behaviors from Data with Model-Free Reinforcement Learning

Avi Singh, Albert Yu, Jonathan Yang, Jesse Zhang, Aviral Kumar, Sergey Levine

5:01

16/11/2020

Learning Object Manipulation Skills via Approximate State Estimation from Real Videos

Vladimír Petrík, Makarand Tapaswi, Ivan Laptev, Josef Sivic

5:07

16/11/2020

Robot Action Selection Learning via Layered Dimension Informed Program Synthesis

Jarrett Holtz, Arjun Guha, Joydeep Biswas

5:05

16/11/2020

Multi-Modal Anomaly Detection for Unstructured and Uncertain Environments

Tianchen Ji, Sri Theja Vuppala, Girish Chowdhary, Katherine Driggs-Campbell

5:04

18/07/2021

Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills

Yevgen Chebotar, Karol Hausman, Yao Lu, Ted Xiao, Dmitry Kalashnikov, Jacob Varley, Alex Irpan, Benjamin Eysenbach, Ryan C Julian, Chelsea Finn, Sergey Levine

Keywords: Applications, Robotics

5:20

06/12/2020

Language-Conditioned Imitation Learning for Robot Manipulation Tasks

Simon Stepputtis, Joseph Campbell, Mariano Phielipp, Stefan Lee, Chitta Baral, Heni Ben Amor

3:09

14/06/2020

RL-CycleGAN: Reinforcement Learning Aware Simulation-to-Real

Kanishka Rao, Chris Harris, Alex Irpan, Sergey Levine, Julian Ibarz, Mohi Khansari

Keywords: robotics, sim2real, cyclegan, reinforcement learning, grasping, q-learning

4:55

16/11/2020

CoT-AMFlow: Adaptive Modulation Network with Co-Teaching Strategy for Unsupervised Optical Flow Estimation

Hengli Wang, Rui Fan, Ming Liu

4:57

16/11/2020

Deep Reactive Planning in Dynamic Environments

Kei Ota, Devesh Jha, Tadashi Onishi, Asako Kanezaki, Yusuke Yoshiyasu, Yoko Sasaki, Toshisada Mariyama, Daniel Nikovski

5:05

03/05/2021

Learning Cross-Domain Correspondence for Control with Dynamics Cycle-Consistency

Qiang Zhang, Tete Xiao, Alyosha Efros, Lerrel Pinto, Xiaolong Wang

Keywords: self-supervised learning, robotics

14:38

05/04/2021

Learning Fitness Functions for Machine Programming

Shantanu Mandal, Todd Anderson, Javier Turek, Justin Gottschlich, Shengtian Zhou, Abdullah Muzahid

17:28

05/04/2021

Learning Fitness Functions for Machine Programming

Shantanu Mandal, Todd Anderson, Javier Turek, Justin Gottschlich, Shengtian Zhou, Abdullah Muzahid

5:06

02/02/2021

BT Expansion: a Sound and Complete Algorithm for Behavior Planning of Intelligent Robots with Behavior Trees

Zhongxuan Cai, Minglong Li, Wanrong Huang, Wenjing Yang

15:49

16/11/2020

Learning a Decentralized Multi-Arm Motion Planner

Huy Ha, Jingxi Xu, Shuran Song

3:41

02/02/2021

CMAX++: Leveraging Experience in Planning and Execution using Inaccurate Models

Anirudh Vemula, J. Andrew Bagnell, Maxim Likhachev

15:11

03/05/2021

CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning

Ossama Ahmed, Frederik Träuble, Anirudh Goyal, Alexander Neitz, Manuel Wuthrich, Yoshua Bengio, Bernhard Schoelkopf, Stefan Bauer

Keywords: reinforcement learning, transfer learning, robotics, domain adaptation, generalization, causality, sim2real transfer

5:03

06/12/2021

Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics

Ingmar Schubert, Danny Driess, Ozgur S. Oguz, Marc Toussaint

Keywords: reinforcement learning and planning

8:36

18/07/2021

Monotonic Robust Policy Optimization with Model Discrepancy

Yuankun Jiang, Chenglin Li, Wenrui Dai, Junni Zou, Hongkai Xiong

Keywords: Reinforcement Learning and Planning

5:17

16/11/2020

Hardware as Policy: Mechanical and Computational Co-Optimization using Deep Reinforcement Learning

Tianjian Chen, Zhanpeng He, Matei Ciocarlie

4:51

06/12/2020

Residual Force Control for Agile Human Behavior Imitation and Extended Motion Synthesis

Ye Yuan, Kris Kitani

3:22

12/07/2020

Skew-Fit: State-Covering Self-Supervised Reinforcement Learning

Vitchyr Pong, Murtaza Dalal, Steven Lin, Ashvin Nair, Shikhar Bahl, Sergey Levine

Keywords: Reinforcement Learning - Deep RL

15:13

06/12/2020

Neural Dynamic Policies for End-to-End Sensorimotor Learning

Shikhar Bahl, Mustafa Mukadam, Abhinav Gupta, Deepak Pathak

3:35

22/11/2021

MOS: A Low Latency and Lightweight Framework for Face Detection, Landmark Localization, and Head Pose Estimation

Yepeng Liu, Zaiwang Gu, Shenghua Gao, Dong Wang, Yusheng Zeng, Jun Cheng

Keywords: face detect, head pose estimation, multi-task, Low Latency

2:51

26/04/2020

The Ingredients of Real World Robotic Reinforcement Learning

Henry Zhu, Justin Yu, Abhishek Gupta, Dhruv Shah, Kristian Hartikainen, Avi Singh, Vikash Kumar, Sergey Levine

Keywords: Reinforcement Learning, Robotics

4:32