How RL Agents Behave When Their Actions Are Modified

02/02/2021

Eric D. Langlois, Tom Everitt

Abstract: Reinforcement learning in complex environments may require supervision to prevent the agent from attempting dangerous actions. As a result of supervisor intervention, the executed action may differ from the action specified by the policy. How does this affect learning? We present the Modified-Action Markov Decision Process, an extension of the MDP model that allows actions to differ from the policy. We analyze the asymptotic behaviours of common reinforcement learning algorithms in this setting and show that they adapt in different ways: some completely ignore modifications while others go to various lengths in trying to avoid action modifications that decrease reward. By choosing the right algorithm, developers can prevent their agents from learning to circumvent interruptions or constraints, and better control agent responses to other kinds of action modification, like self-damage.
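The core idea in the abstract, that the executed action can differ from the one the policy selects, can be illustrated with a minimal Python sketch. This is not the authors' formalism or code: the names `modified_step`, `supervisor`, `always_risky`, and `transition`, the two-action toy problem, and the override rule are all assumptions made for illustration.

```python
from typing import Callable, Tuple

State = int
Action = int
SAFE, RISKY = 0, 1  # hypothetical two-action toy problem


def modified_step(
    state: State,
    policy: Callable[[State], Action],
    modifier: Callable[[State, Action], Action],
    transition: Callable[[State, Action], Tuple[State, float]],
) -> Tuple[Action, Action, State, float]:
    """Run one step in which the executed action may differ from the policy's choice."""
    proposed = policy(state)              # what the policy asks for
    executed = modifier(state, proposed)  # what actually runs after supervision
    next_state, reward = transition(state, executed)
    return proposed, executed, next_state, reward


def always_risky(state: State) -> Action:
    # Toy policy: always proposes the risky action.
    return RISKY


def supervisor(state: State, action: Action) -> Action:
    # Assumed rule: in state 0 the risky action is overridden with the safe one.
    return SAFE if state == 0 and action == RISKY else action


def transition(state: State, action: Action) -> Tuple[State, float]:
    # Toy dynamics: always advance one state; only the risky action pays reward.
    return state + 1, (1.0 if action == RISKY else 0.0)


proposed, executed, next_state, reward = modified_step(
    0, always_risky, supervisor, transition
)
print(proposed, executed, next_state, reward)  # 1 0 1 0.0
```

In this toy step the policy proposes `RISKY`, the supervisor executes `SAFE`, and the reward reflects the executed action. A learner that records `proposed` sees different data from one that records `executed`, which is one plausible way for algorithms to respond differently to modification.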

Video of the talk: https://slideslive.com/38948670

The talk and the accompanying paper were published at the AAAI 2021 virtual conference.

Similar Papers

02/02/2021

Advice-Guided Reinforcement Learning in a non-Markovian Environment

Daniel Neider, Jean-Raphael Gaglione, Ivan Gavran, Ufuk Topcu, Bo Wu, Zhe Xu

18:07

06/12/2020

One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL

Saurabh Kumar, Aviral Kumar, Sergey Levine, Chelsea Finn

3:24

06/12/2020

Safe Reinforcement Learning via Curriculum Induction

Matteo Turchetta, Andrey Kolobov, Shital Shah, Andreas Krause, Alekh Agarwal

3:18

06/12/2021

Continual Auxiliary Task Learning

Matthew McLeod, Chunlok Lo, Matthew Schlegel, Andrew Jacobsen, Raksha Kumaraswamy, Martha White, Adam White

Keywords: reinforcement learning and planning

5:36

26/10/2020

Imitation Learning over Heterogeneous Agents with Restraining Bolts

Giuseppe De Giacomo, Marco Favorito, Luca Iocchi, Fabio Patrizi

Keywords: Restraining Bolts, Non-Markovian Rewards, Transfer Learning

7:50

06/12/2021

An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning

Tianpei Yang, Weixun Wang, Hongyao Tang, Jianye Hao, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Yingfeng Chen, Yujing Hu, Changjie Fan, Chengwei Zhang

Keywords: reinforcement learning and planning, transfer learning

15:21

06/12/2021

Time-series Generation by Contrastive Imitation

Daniel Jarrett, Ioana Bica, Mihaela van der Schaar

Keywords: generative model

8:47

03/05/2021

Correcting experience replay for multi-agent communication

Sanjeevan Ahilan, Peter Dayan

Keywords: multi-agent reinforcement learning, communication, experience replay, relabelling

10:31

06/12/2021

Safe Reinforcement Learning by Imagining the Near Future

Garrett Thomas, Yuping Luo, Tengyu Ma

Keywords: reinforcement learning and planning

6:50

06/12/2020

An Imitation from Observation Approach to Transfer Learning with Dynamics Mismatch

Siddharth Desai, Ishan Durugkar, Haresh Karnan, Garrett Warnell, Josiah Hanna, Peter Stone

3:22

06/12/2021

Bridging the Imitation Gap by Adaptive Insubordination

Luca Weihs, Unnat Jain, Iou-Jen Liu, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alex Schwing

Keywords: reinforcement learning and planning

6:51

06/12/2021

Counterexample Guided RL Policy Refinement Using Bayesian Optimization

Briti Gangopadhyay, Pallab Dasgupta

Keywords: optimization, reinforcement learning and planning

12:49

06/12/2020

Fighting Copycat Agents in Behavioral Cloning from Observation Histories

Chuan Wen, Jierui Lin, Trevor Darrell, Dinesh Jayaraman, Yang Gao

Keywords: Reinforcement Learning and Planning -> Exploration

3:21

02/02/2021

Relative Variational Intrinsic Control

Kate Baumli, David Warde-Farley, Steven Hansen, Volodymyr Mnih

19:18

06/12/2021

Outcome-Driven Reinforcement Learning via Variational Inference

Tim G. J. Rudner, Vitchyr Pong, Rowan McAllister, Yarin Gal, Sergey Levine

Keywords: reinforcement learning and planning, generative model

12:21

18/07/2021

Interaction-Grounded Learning

Tengyang Xie, John Langford, Paul Mineiro, Ida Momennejad

Keywords: Reinforcement Learning and Planning, Bandits

5:12

19/08/2021

Conditional Self-Supervised Learning for Few-Shot Classification

Yuexuan An, Hui Xue, Xingyu Zhao, Lu Zhang

Keywords: Machine Learning, Classification, Transfer, Adaptation, Multi-task Learning, Unsupervised Learning

9:06

18/07/2021

Policy Caches with Successor Features

Mark Nemecek, Ron Parr

Keywords: Reinforcement Learning and Planning, Markov Decision Processes, Reinforcement Learning

5:15

26/04/2020

Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks

Hae Beom Lee, Hayeon Lee, Donghyun Na, Saehoon Kim, Minseop Park, Eunho Yang, Sung Ju Hwang

Keywords: meta-learning, few-shot learning, Bayesian neural network, variational inference, learning to learn, imbalanced and out-of-distribution tasks for few-shot learning

13:46

06/12/2020

Model-based Adversarial Meta-Reinforcement Learning

Zichuan Lin, Garrett Thomas, Guangwen Yang, Tengyu Ma

3:31

14/09/2020

Network Cooperation with Progressive Disambiguation for Partial Label Learning

Yao Yao, Chen Gong, Jiehui Deng, Jian Yang

Keywords: weakly-supervised learning, partial label learning, progressive disambiguation, network cooperation

10:19

06/12/2021

Discovery of Options via Meta-Learned Subgoals

Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh

Keywords: deep learning, reinforcement learning and planning

4:13

18/07/2021

Reinforcement Learning of Implicit and Explicit Control Flow Instructions

Ethan Brooks, Janarthanan Rajendran, Richard Lewis, Satinder Singh

Keywords: Optimization, Combinatorial Optimization, Reinforcement Learning and Planning, Deep RL

5:08

22/09/2020

Contextual meta-bandit for recommender systems selection

Marlesson R. O. Santana, Luckeciano C. Melo, Fernando H. F. Camargo, Bruno Brandão, Anderson Soares, Renan M. Oliveira, Sandor Caetano

Keywords: contextual bandits, hierarchical recommender systems, options framework, reinforcement learning

1:48

12/07/2020

Batch Reinforcement Learning with Hyperparameter Gradients

Byung-Jun Lee, Jongmin Lee, Peter Vrancx, Dongho Kim, Kee-Eung Kim

Keywords: Reinforcement Learning - General

16:07

03/05/2021

Fast And Slow Learning Of Recurrent Independent Mechanisms

Kanika Madan, Nan Rosemary Ke, Anirudh Goyal, Bernhard Schoelkopf, Yoshua Bengio

Keywords: better generalization, modular representations, learning mechanisms

5:09

26/04/2020

Learning from Rules Generalizing Labeled Exemplars

Abhijeet Awasthi, Sabyasachi Ghosh, Rasna Goyal, Sunita Sarawagi

Keywords: Learning from Rules, Learning from limited labeled data, Weakly Supervised Learning

5:18

03/05/2021

Self-supervised Learning from a Multi-view Perspective

Yao-Hung Hubert Tsai, Yue Wu, Ruslan Salakhutdinov, LP Morency

Keywords: Self-supervised Learning, Unsupervised Learning, Multi-view Representation Learning

5:36

06/12/2021

Autonomous Reinforcement Learning via Subgoal Curricula

Archit Sharma, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn

Keywords: reinforcement learning and planning

14:09

06/12/2020

Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations

Huan Zhang, Hongge Chen, Chaowei Xiao, Bo Li, Mingyan Liu, Duane Boning, Cho-Jui Hsieh

3:18

18/07/2021

Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning

Sumedh Sontakke, Arash Mehrjou, Laurent Itti, Bernhard Schölkopf

Keywords: Applications, Robotics

5:18

26/04/2020

Intrinsic Motivation for Encouraging Synergistic Behavior

Rohan Chitnis, Shubham Tulsiani, Saurabh Gupta, Abhinav Gupta

Keywords: reinforcement learning, intrinsic motivation, synergistic, robot manipulation

5:02

07/09/2020

Towards a Hypothesis on Visual Transformation based Self-Supervision

Dipan Pal, Sreena Nallamothu, Marios Savvides

Keywords: self supervision, rotation transformation, rot net, visual transformation self supervision

7:31

06/12/2021

Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time

Ferran Alet, Maria Bauza, Kenji Kawaguchi, Nurullah Giray Kuru, Tomás Lozano-Pérez, Leslie Kaelbling

Keywords: deep learning, optimization, machine learning, self-supervised learning, meta learning

15:05

26/04/2020

Learning Self-Correctable Policies and Value Functions from Demonstrations with Negative Sampling

Yuping Luo, Huazhe Xu, Tengyu Ma

Keywords: imitation learning, model-based imitation learning, model-based RL, behavior cloning, covariate shift

4:38

16/11/2020

Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models

Thuy-Trang Vu, Dinh Phung, Gholamreza Haffari

Keywords: combinatorial problem, unsupervised tasks, named recognition, broad-coverage models

11:57

18/07/2021

TempoRL: Learning When to Act

André Biedenkapp, Raghu Rajan, Frank Hutter, Marius Lindauer

Keywords: Optimization, Non-Convex Optimization, Reinforcement Learning and Planning, Neuroscience and Cognitive Science, Reasoning; Optimization, Combinatorial Optimization; Reinforcement Learning and Plannin

5:24

12/07/2020

Self-supervised Label Augmentation via Input Transformations

Hankook Lee, Sung Ju Hwang, Jinwoo Shin

Keywords: Deep Learning - Algorithms

14:34

06/12/2021

Learning to Synthesize Programs as Interpretable and Generalizable Policies

Dweep Trivedi, Jesse Zhang, Shao-Hua Sun, Joseph Lim

Keywords: deep learning, reinforcement learning and planning

10:30

16/11/2020

Supervised Seeded Iterated Learning for Interactive Language Learning

Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron Courville

Keywords: language drift, language-drift game, language models, word-based agents

6:56