Learning Behaviors with Uncertain Human Feedback

03/08/2020

Xu He, Haipeng Chen, Bo An

Abstract: Human feedback is widely used to train agents in many domains. However, previous work rarely considers the uncertainty in the feedback that humans provide, especially in cases where the optimal actions are not obvious to the trainers. For example, the reward of a sub-optimal action can be stochastic and sometimes exceed that of the optimal action, which is common in games and in the real world. Trainers are then likely to give positive feedback to sub-optimal actions, negative feedback to optimal actions, and, in some confusing situations, no feedback at all. Existing works, which use the Expectation Maximization (EM) algorithm and treat the feedback model as hidden parameters, do not consider uncertainties in the learning environment and human feedback. To address this challenge, we introduce a novel feedback model that considers the uncertainty of human feedback. However, this model makes the computation in the EM algorithm intractable. We therefore propose a novel approximate EM algorithm, in which we approximate the expectation step with the gradient descent method. Experimental results in both synthetic scenarios and two real-world scenarios with human participants demonstrate the superior performance of our proposed approach.
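To make the kind of uncertainty described in the abstract concrete, the following is a minimal toy sketch, not the authors' feedback model or their approximate EM algorithm. It assumes a small discrete action set, a hidden optimal action, and a fixed, hypothetical table `FEEDBACK_PROBS` giving how often a trainer responds with positive, negative, or no feedback; all names and numbers are illustrative assumptions. It shows how a posterior over the optimal action could still be recovered when positive feedback sometimes goes to sub-optimal actions and negative feedback sometimes goes to the optimal one.

```python
# Hypothetical feedback probabilities (illustrative assumptions, not values
# from the paper). The outer key says whether the rated action is optimal;
# the inner dict gives the chance of positive, negative, or no feedback.
FEEDBACK_PROBS = {
    True: {"pos": 0.6, "neg": 0.1, "none": 0.3},
    False: {"pos": 0.2, "neg": 0.5, "none": 0.3},
}


def posterior_over_optimal(num_actions, observations):
    """Return P(optimal action = k | observations) under a uniform prior.

    observations is a list of (action, feedback) pairs, where feedback is
    one of "pos", "neg", or "none".
    """
    weights = [1.0] * num_actions
    for action, feedback in observations:
        for k in range(num_actions):
            # Likelihood of this feedback if action k were the optimal one.
            weights[k] *= FEEDBACK_PROBS[action == k][feedback]
    total = sum(weights)
    return [w / total for w in weights]


# Noisy trainer: action 1 is mostly praised but also punished once,
# and a sub-optimal action (0) receives positive feedback once.
observations = [
    (0, "pos"), (1, "pos"), (1, "none"), (2, "neg"),
    (1, "neg"), (1, "pos"), (1, "pos"),
]
posterior = posterior_over_optimal(3, observations)
best = max(range(3), key=lambda k: posterior[k])
print(posterior, best)
```

In this toy table, "none" is equally likely for optimal and sub-optimal actions, so missing feedback carries no information. In the setting the abstract describes, the feedback model itself is unknown and must be estimated, which is where an EM-style procedure comes in.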

The talk and the corresponding paper were published at the UAI 2020 virtual conference.

Similar Papers

18/07/2021

MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning

Kevin Li, Abhishek Gupta, Ashwin D Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine

Keywords Paper

Reinforcement Learning and Planning, Deep RL

5:16

16/11/2020

Interactive Imitation Learning in State-Space

Snehal Jauhri, Carlos Celemin, Jens Kober

5:05

26/08/2020

Calibrated Prediction with Covariate Shift via Unsupervised Domain Adaptation

Sangdon Park, Osbert Bastani, James Weimer, Insup Lee

7:29

06/12/2021

IQ-Learn: Inverse soft-Q Learning for Imitation

Divyansh Garg, Shuvam Chakraborty, Chris Cundy, Jiaming Song, Stefano Ermon

Keywords Paper

optimization, reinforcement learning and planning, adversarial robustness and security

8:25

12/07/2020

Variational Imitation Learning with Diverse-quality Demonstrations

Voot Tangkaratt, Bo Han, Mohammad Emtiyaz Khan, Masashi Sugiyama

Keywords Paper

Reinforcement Learning - Deep RL

13:52

26/04/2020

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

Siddharth Reddy, Anca D. Dragan, Sergey Levine

Keywords Paper

Imitation Learning, Reinforcement Learning

4:38

18/07/2021

Interactive Learning from Activity Description

Khanh Nguyen, Dipendra Misra, Robert Schapire, Miro Dudik, Patrick Shafto

Keywords Paper

Reinforcement Learning and Planning

4:57

18/07/2021

PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training

Kimin Lee, Laura Smith, Pieter Abbeel

Keywords Paper

Reinforcement Learning and Planning, Deep RL

15:02

18/07/2021

PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees

Jonas Rothfuss, Vincent Fortuin, Martin Josifoski, Andreas Krause

Keywords Paper

Algorithms, Multitask, Transfer, and Meta Learning

5:46

26/10/2020

Utilising Uncertainty for Efficient Learning of Likely-Admissible Heuristics

Ofir Marom, Benjamin Rosman

Keywords Paper

Learning Heuristics, Uncertainty, Bayesian Neural Networks, Efficient Exploration, Likely-Admissible Heuristics

9:58

06/12/2020

One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL

Saurabh Kumar, Aviral Kumar, Sergey Levine, Chelsea Finn

3:24

06/12/2020

Self-Adaptive Training: beyond Empirical Risk Minimization

Lang Huang, Chao Zhang, Hongyang Zhang

Keywords Paper

Deep Learning -> Generative Models, Algorithms -> Semi-Supervised Learning

3:23

06/12/2020

From Predictions to Decisions: Using Lookahead Regularization

Nir Rosenfeld, Sophie Hilgard, Sai Ravindranath, David Parkes

3:10

06/12/2020

AvE: Assistance via Empowerment

Yuqing Du, Stas Tiomkin, Emre Kiciman, Daniel Polani, Pieter Abbeel, Anca Dragan

3:23

26/04/2020

State-only Imitation with Transition Dynamics Mismatch

Tanmay Gangwani, Jian Peng

Keywords Paper

Imitation learning, Reinforcement Learning, Inverse Reinforcement Learning

4:49

06/12/2020

What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation

Vitaly Feldman, Chiyuan Zhang

3:22

09/07/2020

Precise Tradeoffs in Adversarial Training for Linear Regression

Adel Javanmard, Mahdi Soltanolkotabi, Hamed Hassani

Keywords Paper

Adversarial learning and robustness, High-dimensional statistics, Regression

15:49

12/07/2020

More Data Can Expand The Generalization Gap Between Adversarially Robust and Standard Models

Lin Chen, Yifei Min, Mingrui Zhang, Amin Karbasi

Keywords Paper

Adversarial Examples

12:01

03/05/2021

Learning to Reach Goals via Iterated Supervised Learning

Dibya Ghosh, Abhishek Gupta, Ashwin D Reddy, Justin Fu, Coline M Devin, Ben Eysenbach, Sergey Levine

Keywords Paper

goal reaching, reinforcement learning, goal-conditioned RL, behavior cloning

15:19

18/07/2021

Targeted Data Acquisition for Evolving Negotiation Agents

Minae Kwon, Sidd Karamcheti, Mariano-Florentino Cuellar, Dorsa Sadigh

Keywords Paper

Reinforcement Learning and Planning, Multi-Agent RL

5:15

12/07/2020

Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models

Rares-Darius Buhai, Yoni Halpern, Yoon Kim, Andrej Risteski, David Sontag

Keywords Paper

Probabilistic Inference - Models and Probabilistic Programming

15:04

19/08/2021

Masked Contrastive Learning for Anomaly Detection

Hyunsoo Cho, Jinseok Seol, Sang-goo Lee

Keywords Paper

Data Mining, Anomaly/Outlier Detection, Clustering

14:12

18/07/2021

Policy Gradient Bayesian Robust Optimization for Imitation Learning

Zaynah Javed, Daniel Brown, Satvik Sharma, Jerry Zhu, Ashwin Balakrishna, Marek Petrik, Anca Dragan, Ken Goldberg

Keywords Paper

Social Aspects of Machine Learning, AI Safety

5:10

18/07/2021

Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning

Yue Wu, Shuangfei Zhai, Nitish Srivastava, Josh Susskind, Jian Zhang, Russ Salakhutdinov, Hanlin Goh

Keywords Paper

Reinforcement Learning and Planning, Deep RL

5:01

02/02/2021

Exploratory Machine Learning with Unknown Unknowns

Peng Zhao, Yu-Jie Zhang, Zhi-Hua Zhou

21:39

06/12/2021

The Utility of Explainable AI in Ad Hoc Human-Machine Teaming

Rohan Paleja, Muyleng Ghuy, Nadun Ranawaka Arachchige, Reed Jensen, Matthew Gombolay

Keywords Paper

machine learning, interpretability

12:32

18/07/2021

Demonstration-Conditioned Reinforcement Learning for Few-Shot Imitation

Christopher Dance, Julien Perez, Théo Cachet

Keywords Paper

Reinforcement Learning and Planning, Planning and Control

5:13

25/04/2020

"Why is 'Chicago' deceptive?" Towards Building Model-Driven Tutorials for Humans

Vivian Lai, Han Liu, Chenhao Tan

Keywords Paper

explanations, interpretable machine learning, tutorials, deception detection

15:14

14/06/2020

Single-Step Adversarial Training With Dropout Scheduling

Vivek B.S., R. Venkatesh Babu

Keywords Paper

adversarial training, robustness, efficient training, representation learning, generalization, supervised learning, recognition, classification, neural networks, deep learning

1:01

06/12/2020

Towards Interpretable Natural Language Understanding with Explanations as Latent Variables

Wangchunshu Zhou, Jinyi Hu, Hanlin Zhang, Xiaodan Liang, Maosong Sun, Chenyan Xiong, Jian Tang

3:22

03/05/2021

Learning Value Functions in Deep Policy Gradients using Residual Variance

Yannis Flet-Berliac, Reda Ouhamma, Odalric-Ambrym Maillard, Philippe Preux

4:49

14/09/2020

Active deep Q-learning with demonstration

Si-An Chen, Hsuan-Tien Lin, Voot Tangkaratt, Masashi Sugiyama

13:42

22/09/2020

Keeping dataset biases out of the simulation: A debiased simulator for reinforcement learning based recommender systems

Jin Huang, Harrie Oosterhuis, Maarten de Rijke, Herke van Hoof

Keywords Paper

Recommender systems, Simulation, Interaction bias, Reinforcement learning

2:45

06/12/2021

Visual Adversarial Imitation Learning using Variational Models

Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn

Keywords Paper

theory, reinforcement learning and planning, adversarial robustness and security, representation learning

7:25

13/04/2021

Robustness and scalability under heavy tails, without strong convexity

Matthew Holland

3:35

14/09/2020

Escaping Saddle Points of Empirical Risk Privately and Scalably via DP-Trust Region Method

Di Wang, Jinhui Xu

Keywords Paper

differential privacy, empirical risk minimization, private machine learning

15:13

13/04/2021

Sample elicitation

Jiaheng Wei, Zuyue Fu, Yang Liu, Xingyu Li, Zhuoran Yang, Zhaoran Wang

3:16

26/04/2020

Meta Dropout: Learning to Perturb Latent Features for Generalization

Hae Beom Lee, Taewook Nam, Eunho Yang, Sung Ju Hwang

4:46

26/08/2020

RelatIF: Identifying Explanatory Training Samples via Relative Influence

Elnaz Barshan, Marc-Etienne Brunet, Gintare Karolina Dziugaite

14:03

02/02/2021

Automatic Curriculum Learning With Over-repetition Penalty for Dialogue Policy Learning

Yangyang Zhao, Zhenyu Wang, Zhenhua Huang

15:41