Imitation Learning over Heterogeneous Agents with Restraining Bolts

Abstract: A common problem in Reinforcement Learning (RL) is that the reward function is hard to express. This can be overcome by resorting to Inverse Reinforcement Learning (IRL), which consists in first obtaining a reward function from a set of execution traces generated by the expert agent, and then making the learning agent learn the expert's behavior --this is known as Transfer Learning (TL). Typical IRL solutions rely on a numerical representation of the reward function, which raises problems related to the adopted optimization procedures. We describe a TL method where the execution traces generated by the expert agent, possibly via planning, are used to produce a logical (as opposed to numerical) specification of the reward function, to be incorporated in a device known as Restraining Bolt (RB). The RB can be attached to the learning agent to drive the learning process and ultimately make it imitate the expert. We show that TL can be applied to heterogeneous agents, with the expert, the learner and the RB using different representations of the environment's actions and states, without specifying mappings among their representations.

03/05/2021

Imitation Learning over Heterogeneous Agents with Restraining Bolts

Giuseppe De Giacomo, Marco Favorito, Luca Iocchi, Fabio Patrizi

Comments

Similar Papers

Regularized Inverse Reinforcement Learning

Wonseok Jeon, Chen-Yang Su, Paul Barde and Thang Doan, Derek Nowrouzezahrai, Joelle Pineau

Keywords Abstract Paper

reinforcement learning, regularized markov decision processes, reward learning, inverse reinforcement learning

Active deep Q-learning with demonstration

Si-An Chen,Hsuan-Tien Lin, Voot Tangkaratt, Masashi Sugiyam

Keywords Abstract Paper

Provably Efficient Learning of Transferable Rewards

Alberto Maria Metelli, Giorgia Ramponi, Alessandro Concetti, Marcello Restelli

Keywords Abstract Paper

Optimization, Convex Optimization, Reinforcement Learning and Planning, Optimization, Combinatorial Optimization

Temporal-Logic-Based Reward Shaping for Continuing Reinforcement Learning Tasks

Yuqian Jiang, Suda Bharadwaj, Bo Wu and Rishi Shah, Ufuk Topcu, Peter Stone

Keywords Abstract Paper

Advice-Guided Reinforcement Learning in a non-Markovian Environment

Daniel Neider, Jean-Raphael Gaglione, Ivan Gavran and Ufuk Topcu, Bo Wu, Zhe Xu

Keywords Abstract Paper

Exploring supervised and unsupervised rewards in machine translation

Julia Ive, Zixu Wang, Marina Fomicheva, Lucia Specia

Keywords Abstract Paper

Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees

Gregory Dexter, Kevin Bello, Jean Honorio

Keywords Abstract Paper

theory, reinforcement learning and planning

Contrastive Explanations for Reinforcement Learning via Embedded Self Predictions

Zhengxian Lin, Kin-Ho Lam, Alan Fern

Keywords Abstract Paper

Deep Reinforcement Learning, Explainable AI

Information Directed Reward Learning for Reinforcement Learning

David Lindner, Matteo Turchetta, Sebastian Tschiatschek and Kamil Ciosek, Andreas Krause

Keywords Abstract Paper

reinforcement learning and planning, active learning

Maximum Likelihood Constraint Inference for Inverse Reinforcement Learning

Dexter R.R. Scobee, S. Shankar Sastry

Keywords Abstract Paper

learning from demonstration, inverse reinforcement learning, constraint inference

What Can Learned Intrinsic Rewards Capture?

Zeyu Zheng, Junhyuk Oh, Matteo Hessel and Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh

Keywords Abstract Paper

Reinforcement Learning - Deep RL

Semi-Supervised Dialogue Policy Learning via Stochastic Reward Estimation

Xinting Huang, Jianzhong Qi, Yu Sun, Rui Zhang

Keywords Abstract Paper

Semi-Supervised Learning, generalization function, Stochastic Estimation, Dialogue optimization

Explicable Reward Design for Reinforcement Learning Agents

Rati Devidze, Goran Radanovic, Parameswaran Kamalaruban, Adish Singla

Keywords Abstract Paper

optimization, reinforcement learning and planning, interpretability

Cross-domain Imitation from Observations

Dripta S. Raychaudhuri, Sujoy Paul, Jeroen Vanbaar, Amit Roy-Chowdhury

Keywords Abstract Paper

Reinforcement Learning and Planning, Deep RL

Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping

Yujing Hu, Weixun Wang, Hangtian Jia and Yixiang Wang, Yingfeng Chen, Jianye Hao, Feng Wu, Changjie Fan

Keywords Abstract Paper

Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning

Anuj Mahajan, Mikayel Samvelyan, Lei Mao and Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Anima Anandkumar

Keywords Abstract Paper

Reinforcement Learning and Planning, Multi-Agent RL

Inverse Reinforcement Learning from a Gradient-based Learner

Giorgia Ramponi, Gianluca Drappo, Marcello Restelli

Keywords Abstract Paper

What Did You Think Would Happen? Explaining Agent Behaviour through Intended Outcomes

Herman Yau, Chris Russell, Simon Hadfield

Keywords Abstract Paper

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

Siddharth Reddy, Anca D. Dragan, Sergey Levine

Keywords Abstract Paper

Imitation Learning, Reinforcement Learning

Positive-Unlabeled Reward Learning

Danfei Xu, Misha Denil

Keywords Abstract Paper

Joint Inference of Reward Machines and Policies for Reinforcement Learning

Zhe Xu, Ivan Gavran, Yousef Ahmad and Rupak Majumdar, Daniel Neider, Ufuk Topcu, Bo Wu

Keywords Abstract Paper

Reward Machines, Automata Learning, Reinforcement Learning

Wonseok Jeon, Chen-Yang Su, Paul Barde and
Thang Doan, Derek Nowrouzezahrai, Joelle Pineau

Keywords Paper

Keywords Paper

Keywords Paper

Yuqian Jiang, Suda Bharadwaj, Bo Wu and
Rishi Shah, Ufuk Topcu, Peter Stone

Keywords Paper

Daniel Neider, Jean-Raphael Gaglione, Ivan Gavran and
Ufuk Topcu, Bo Wu, Zhe Xu

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

David Lindner, Matteo Turchetta, Sebastian Tschiatschek and
Kamil Ciosek, Andreas Krause

Keywords Paper

Keywords Paper

Zeyu Zheng, Junhyuk Oh, Matteo Hessel and
Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Yujing Hu, Weixun Wang, Hangtian Jia and
Yixiang Wang, Yingfeng Chen, Jianye Hao, Feng Wu, Changjie Fan

Keywords Paper

Anuj Mahajan, Mikayel Samvelyan, Lei Mao and
Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Anima Anandkumar

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Zhe Xu, Ivan Gavran, Yousef Ahmad and
Rupak Majumdar, Daniel Neider, Ufuk Topcu, Bo Wu

Keywords Paper

Tianwei Ni, Harshit Sikchi, Yufei Wang and
Tejus Gupta, Lisa Lee, Ben Eysenbach

Keywords Paper

Keywords Paper

Keywords Paper

Kevin Li, Abhishek Gupta, Ashwin D Reddy and
Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine

Keywords Paper

Keywords Paper

Keywords Paper

Ruiyi Zhang, Changyou Chen, Zhe Gan and
Zheng Wen, Wenlin Wang, Lawrence Carin

Keywords Paper

Ben Eysenbach, Shreyas Chaudhari, Swapnil Asawa and
Sergey Levine, Ruslan Salakhutdinov

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Tuomas Oikarinen, Wang Zhang, Alexandre Megretski and
Luca Daniel, Tsui-Wei Weng

Keywords Paper

Keywords Paper

Tim G. J. Rudner, Vitchyr Pong, Rowan McAllister and
Yarin Gal, Sergey Levine

Keywords Paper

Zhaorong Wang, Meng Wang, Jingqi Zhang and
Yingfeng Chen, Chongjie Zhang

Keywords Paper