Uncertain Decisions Facilitate Better Preference Learning

06/12/2021

Cassidy Laidlaw, Stuart Russell

Keywords: theory, reinforcement learning and planning

Abstract: Existing observational approaches for learning human preferences, such as inverse reinforcement learning, usually make strong assumptions about the observability of the human's environment. However, in reality, people make many important decisions under uncertainty. To better understand preference learning in these cases, we study the setting of inverse decision theory (IDT), a previously proposed framework where a human is observed making non-sequential binary decisions under uncertainty. In IDT, the human's preferences are conveyed through their loss function, which expresses a tradeoff between different types of mistakes. We give the first statistical analysis of IDT, providing conditions necessary to identify these preferences and characterizing the sample complexity—the number of decisions that must be observed to learn the tradeoff the human is making to a desired precision. Interestingly, we show that it is actually easier to identify preferences when the decision problem is more uncertain. Furthermore, uncertain decision problems allow us to relax the unrealistic assumption that the human is an optimal decision maker but still identify their exact preferences; we give sample complexities in this suboptimal case as well. Our analysis contradicts the intuition that partial observability should make preference learning more difficult. It also provides a first step towards understanding and improving preference learning methods for uncertain and suboptimal humans.
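As a rough illustration of the setting described in the abstract, the Python sketch below simulates an optimal decision maker whose loss is summarized by a single cost parameter c (a false positive costs c, a false negative costs 1 - c). Under that loss, the expected-loss-minimizing rule is to choose 1 whenever the posterior probability of the positive class is at least c. The sketch then recovers the interval of c values consistent with the observed decisions. The parameterization, the function names, the sample sizes, and the assumption that the observer knows each posterior are illustrative choices made here, not details taken from this page.

```python
import random


def optimal_decision(posterior, c):
    """Choose 1 iff P(Y=1 | x) >= c, which minimizes expected loss when a
    false positive costs c and a false negative costs 1 - c (assumed loss)."""
    return 1 if posterior >= c else 0


def threshold_interval(posteriors, decisions):
    """Return (lower, upper) such that any c in (lower, upper] is consistent
    with every observed decision."""
    lower = max((q for q, d in zip(posteriors, decisions) if d == 0), default=0.0)
    upper = min((q for q, d in zip(posteriors, decisions) if d == 1), default=1.0)
    return lower, upper


def observe(posteriors, c):
    """Simulate an optimal human and infer the consistent interval for c."""
    decisions = [optimal_decision(q, c) for q in posteriors]
    return threshold_interval(posteriors, decisions)


rng = random.Random(0)
true_c = 0.3  # hypothetical tradeoff the human is making

# Nearly certain problem: every posterior is close to 0 or close to 1.
certain = [
    rng.choice([rng.uniform(0.0, 0.05), rng.uniform(0.95, 1.0)])
    for _ in range(200)
]
# Uncertain problem: posteriors are spread over the whole unit interval.
uncertain = [rng.uniform(0.0, 1.0) for _ in range(200)]

lo_c, hi_c = observe(certain, true_c)
lo_u, hi_u = observe(uncertain, true_c)

print("certain problem:   width of consistent interval =", hi_c - lo_c)
print("uncertain problem: width of consistent interval =", hi_u - lo_u)
```

In this toy setup the consistent interval stays wide when every posterior is close to 0 or 1 and narrows when posteriors are spread across the unit interval, which mirrors the abstract's claim that more uncertain decision problems make preferences easier to identify.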

The talk and the corresponding paper were published at the NeurIPS 2021 virtual conference.

Similar Papers

06/12/2020

Axioms for Learning from Pairwise Comparisons

Ritesh Noothigattu, Dominik Peters, Ariel Procaccia

3:27

14/09/2020

Interactive Multi-Objective Reinforcement Learning in Multi-Armed Bandits with Gaussian Process Utility Models

Diederik M. Roijers, Luisa M Zintgraf, Pieter Libin, Mathieu Reymond, Eugenio Bargiacchi, Ann Nowé

Keywords: multiple objectives, multi-armed bandits, thompson sampling, reinforcement learning

13:28

06/12/2020

Part-dependent Label Noise: Towards Instance-dependent Label Noise

Xiaobo Xia, Tongliang Liu, Bo Han, Nannan Wang, Mingming Gong, Haifeng Liu, Gang Niu, Dacheng Tao, Masashi Sugiyama

3:00

25/07/2020

Asymmetric tri-training for debiasing missing-not-at-random explicit feedback

Yuta Saito

Keywords: recommender systems, unsupervised domain adaptation, missing-not-at-random, matrix factorization, selection bias, explicit feedback

18:03

12/07/2020

Choice Set Optimization Under Discrete Choice Models of Group Decisions

Kiran Tomlinson, Austin Benson

Keywords: Supervised Learning

15:05

02/02/2021

Model-Agnostic Fits for Understanding Information Seeking Patterns in Humans

Soumya Chatterjee, Pradeep Shenoy

14:04

26/04/2020

Input Complexity and Out-of-distribution Detection with Likelihood-based Generative Models

Joan Serrà, David Álvarez, Vicenç Gómez, Olga Slizovskaia, José F. Núñez, Jordi Luque

Keywords: OOD, generative models, likelihood

5:26

02/02/2021

GaussianPath: A Bayesian Multi-Hop Reasoning Framework for Knowledge Graph Reasoning

Guojia Wan, Bo Du

13:52

13/04/2021

Linear models are robust optimal under strategic behavior

Wei Tang, Chien-Ju Ho, Yang Liu

3:32

06/12/2021

Information Directed Reward Learning for Reinforcement Learning

David Lindner, Matteo Turchetta, Sebastian Tschiatschek, Kamil Ciosek, Andreas Krause

Keywords: reinforcement learning and planning, active learning

11:47

06/12/2020

Preference-based Reinforcement Learning with Finite-Time Guarantees

Yichong Xu, Ruosong Wang, Lin Yang, Aarti Singh, Artur Dubrawski

3:04

18/07/2021

Alternative Microfoundations for Strategic Classification

Meena Jagadeesan, Celestine Mendler-Dünner, Moritz Hardt

Keywords: Theory, Game Theory and Computational Economics

5:18

18/07/2021

Policy Gradient Bayesian Robust Optimization for Imitation Learning

Zaynah Javed, Daniel Brown, Satvik Sharma, Jerry Zhu, Ashwin Balakrishna, Marek Petrik, Anca Dragan, Ken Goldberg

Keywords: Social Aspects of Machine Learning, AI Safety

5:10

06/12/2020

Learning Deep Attribution Priors Based On Prior Knowledge

Ethan Weinberger, Joe Janizek, Su-In Lee

4:20

25/07/2020

Sampler design for implicit feedback data by noisy-label robust learning

Wenhui Yu, Zheng Qin

Keywords: collaborative filtering, bayesian point-wise optimization, noisy-label robust learning, negative sampling, item recommendation

12:25

03/08/2020

Dueling Posterior Sampling for Preference-Based Reinforcement Learning

Ellen Novoseller, Yibing Wei, Yanan Sui, Yisong Yue, Joel Burdick

7:57

26/08/2020

Calibrated Prediction with Covariate Shift via Unsupervised Domain Adaptation

Sangdon Park, Osbert Bastani, James Weimer, Insup Lee

7:29

06/12/2020

Dynamic allocation of limited memory resources in reinforcement learning

Nisheet Patel, Luigi Acerbi, Alexandre Pouget

3:19

06/12/2020

Generalized Hindsight for Reinforcement Learning

Alex Li, Lerrel Pinto, Pieter Abbeel

3:20

06/12/2021

Accommodating Picky Customers: Regret Bound and Exploration Complexity for Multi-Objective Reinforcement Learning

Jingfeng Wu, Vladimir Braverman, Lin Yang

Keywords: theory, reinforcement learning and planning

14:22

06/12/2021

Continuous Mean-Covariance Bandits

Yihan Du, Siwei Wang, Zhixuan Fang, Longbo Huang

Keywords: bandits

11:33

06/12/2021

RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents

Wei Qiu, Xinrun Wang, Runsheng Yu, Rundong Wang, Xu He, Bo An, Svetlana Obraztsova, Zinovi Rabinovich

Keywords: reinforcement learning and planning

3:17

06/12/2021

Learning Equilibria in Matching Markets from Bandit Feedback

Meena Jagadeesan, Alexander Wei, Yixin Wang, Michael Jordan, Jacob Steinhardt

Keywords: bandits

15:04

06/12/2021

Certifying Robustness to Programmable Data Bias in Decision Trees

Anna Meyer, Aws Albarghouthi, Loris D'Antoni

Keywords: robustness, fairness

13:03

02/02/2021

Advice-Guided Reinforcement Learning in a non-Markovian Environment

Daniel Neider, Jean-Raphael Gaglione, Ivan Gavran, Ufuk Topcu, Bo Wu, Zhe Xu

18:07

06/12/2021

Adaptive Ensemble Q-learning: Minimizing Estimation Bias via Error Feedback

Hang Wang, Sen Lin, Junshan Zhang

11:19

06/12/2020

Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning

Sebastian Curi, Felix Berkenkamp, Andreas Krause

3:23

18/07/2021

A Distribution-dependent Analysis of Meta Learning

Mikhail Konobeev, Ilja Kuzborskij, Csaba Szepesvari

Keywords: Theory, Statistical Learning Theory

5:06

18/07/2021

Learner-Private Convex Optimization

Jiaming Xu, Kuang Xu, Dana Yang

Keywords: Theory, Online Learning Theory

5:24

16/11/2020

DORB: Dynamically Optimizing Multiple Rewards with Bandits

Ramakanth Pasunuru, Han Guo, Mohit Bansal

Keywords: language tasks, optimization rewards, nlg tasks, question generation

11:34

12/07/2020

Learning Fair Policies in Multi-Objective (Deep) Reinforcement Learning with Average and Discounted Rewards

Umer Siddique, Paul Weng, Matthieu Zimmer

Keywords: Reinforcement Learning - Deep RL

15:17

02/02/2021

Apparently Irrational Choice as Optimal Sequential Decision Making

Haiyang Chen, Hyung Jin Chang, Andrew Howes

15:59

03/05/2021

Contrastive Explanations for Reinforcement Learning via Embedded Self Predictions

Zhengxian Lin, Kin-Ho Lam, Alan Fern

Keywords: Deep Reinforcement Learning, Explainable AI

14:19

06/12/2021

Learning to Generate Visual Questions with Noisy Supervision

Shen Kai, Lingfei Wu, Siliang Tang, Yueting Zhuang, Zhen He, Zhuoye Ding, Yun Xiao, Bo Long

Keywords: generative model

14:54

12/07/2020

Data preprocessing to mitigate bias: A maximum entropy based approach

Elisa Celis, Vijay Keswani, Nisheeth Vishnoi

Keywords: Fairness, Equity, Justice, and Safety

14:52

22/09/2020

Improving one-class recommendation with multi-tasking on various preference intensities

Chu-Jen Shao, Hao-Ming Fu, Pu-Jen Cheng

Keywords: implicit feedback, graph convolutional network, one-class recommendation, collaborative filtering

2:38

05/01/2021

Adversarial Reinforcement Learning for Unsupervised Domain Adaptation

Youshan Zhang, Hui Ye, Brian D. Davison

4:52

06/12/2020

Expert-Supervised Reinforcement Learning for Offline Policy Learning and Evaluation

Aaron Sonabend, Junwei Lu, Leo Anthony Celi, Tianxi Cai, Peter Szolovits

3:15

06/12/2021

Local policy search with Bayesian optimization

Sarah Müller, Alexander von Rohr, Sebastian Trimpe

Keywords: theory, optimization, reinforcement learning and planning, active learning

11:42

26/08/2020

Multi-attribute Bayesian optimization with interactive preference learning

Raul Astudillo, Peter Frazier

14:06