Learning Rewards From Linguistic Feedback

02/02/2021

Theodore R. Sumers, Mark K. Ho, Robert D. Hawkins, Karthik Narasimhan, Thomas L. Griffiths

Abstract: We explore unconstrained natural language feedback as a learning signal for artificial agents. Humans use rich and varied language to teach, yet most prior work on interactive learning from language assumes a particular form of input (e.g., commands). We propose a general framework which does not make this assumption, instead using aspect-based sentiment analysis to decompose feedback into sentiment over the features of a Markov decision process. We then infer the teacher's reward function by regressing the sentiment on the features, an analogue of inverse reinforcement learning. To evaluate our approach, we first collect a corpus of teaching behavior in a cooperative task where both teacher and learner are human. We implement three artificial learners: sentiment-based "literal" and "pragmatic" models, and an inference network trained end-to-end to predict rewards. We then re-run our initial experiment, pairing human teachers with these artificial learners. All three models successfully learn from interactive human feedback. The inference network approaches the performance of the "literal" sentiment model, while the "pragmatic" model nears human performance. Our work provides insight into the information structure of naturalistic linguistic feedback as well as methods to leverage it for reinforcement learning.
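The regression step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each piece of feedback has already been reduced to a feature-count vector plus a scalar sentiment score, and it uses closed-form ridge regression on synthetic data. The function name `infer_reward_weights`, the three-feature toy setup, and the `l2` parameter are all hypothetical.

```python
import numpy as np


def infer_reward_weights(features, sentiments, l2=1e-3):
    """Regress sentiment scores on feature counts (ridge, closed form).

    features:   (n_feedback, n_features) array of feature counts that each
                piece of feedback refers to (assumed to be given).
    sentiments: (n_feedback,) array of scalar sentiment scores (assumed given).
    Returns an (n_features,) estimate of the teacher's reward weights.
    """
    X = np.asarray(features, dtype=float)
    y = np.asarray(sentiments, dtype=float)
    n_features = X.shape[1]
    # Solve (X^T X + l2 * I) w = X^T y for w.
    return np.linalg.solve(X.T @ X + l2 * np.eye(n_features), X.T @ y)


# Synthetic stand-in data: three hypothetical MDP features with known weights.
true_w = np.array([1.0, -1.0, 0.5])
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(50, 3)).astype(float)
# Noisy sentiment that is linear in the features (an assumption of this sketch).
y = X @ true_w + rng.normal(scale=0.1, size=50)

w_hat = infer_reward_weights(X, y)
print("estimated reward weights:", np.round(w_hat, 2))
```

In this toy setting the recovered weights should land close to `true_w`. With real feedback, the sentiment scores would come from a sentiment analyzer rather than from a known linear model, so the fit would be noisier.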

The video of this talk is available at https://slideslive.com/38949245

The talk and the accompanying paper were published at the AAAI 2021 virtual conference.

Similar Papers

04/07/2020

Multi-agent Communication meets Natural Language: Synergies between Functional and Structural Language Learning

Angeliki Lazaridou, Anna Potapenko, Olivier Tieleman

Keywords: Multi-agent Communication, natural learning, visual task, Functional Learning

11:44

16/11/2020

Supervised Seeded Iterated Learning for Interactive Language Learning

Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron Courville

Keywords: language drift, language-drift game, language models, word-based agents

6:56

26/04/2020

On the interaction between supervision and self-play in emergent communication

Ryan Lowe, Abhinav Gupta, Jakob Foerster, Douwe Kiela, Joelle Pineau

Keywords: multi-agent communication, self-play, emergent languages

5:02

18/07/2021

PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training

Kimin Lee, Laura Smith, Pieter Abbeel

Keywords: Reinforcement Learning and Planning, Deep RL

15:02

02/02/2021

Improving Tree-Structured Decoder Training for Code Generation via Mutual Learning

Binbin Xie, Jinsong Su, Yubin Ge, Xiang Li, Jianwei Cui, Junfeng Yao, Bin Wang

15:57

03/05/2021

Contrastive Learning with Adversarial Perturbations for Conditional Text Generation

Seanie Lee, Dong Bok Lee, Sung Ju Hwang

Keywords: contrastive learning, conditional text generation

4:51

06/12/2021

Widening the Pipeline in Human-Guided Reinforcement Learning with Explanation and Context-Aware Data Augmentation

Lin Guan, Mudit Verma, Suna (Sihang) Guo, Ruohan Zhang, Subbarao Kambhampati

Keywords: reinforcement learning and planning, machine learning

13:41

18/07/2021

Interactive Learning from Activity Description

Khanh Nguyen, Dipendra Misra, Robert Schapire, Miro Dudik, Patrick Shafto

Keywords: Reinforcement Learning and Planning

4:57

16/11/2020

Amalgamating Knowledge from Two Teachers for Task-oriented Dialogue System with Adversarial Training

Wanwei He, Min Yang, Rui Yan, Chengming Li, Ying Shen, Ruifeng Xu

Keywords: task completion, generating responses, task-oriented dialogue, task-oriented systems

9:15

02/02/2021

Towards Semantics-Enhanced Pre-Training: Can Lexicon Definitions Help Learning Sentence Meanings?

Xuancheng Ren, Xu Sun, Houfeng Wang, Qun Liu

16:04

16/11/2020

Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models

Isabel Papadimitriou, Dan Jurafsky

Keywords: analyzing structure, encoding structure, natural acquisition, transfer learning

11:44

06/12/2020

Learning to summarize with human feedback

Nisan Stiennon, Long Ouyang, Jeffrey Wu, Daniel Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul Christiano

3:17

06/12/2020

Information-theoretic Task Selection for Meta-Reinforcement Learning

Ricardo Luna Gutierrez, Matteo Leonetti

2:57

06/12/2021

Teachable Reinforcement Learning via Advice Distillation

Olivia Watkins, Abhishek Gupta, Trevor Darrell, Pieter Abbeel, Jacob Andreas

Keywords: reinforcement learning and planning, active learning

12:45

06/12/2021

Iterative Teacher-Aware Learning

Luyao Yuan, Dongruo Zhou, Junhong Shen, Jingdong Gao, Jeffrey L Chen, Quanquan Gu, Ying Nian Wu, Song-Chun Zhu

Keywords: theory, optimization, reinforcement learning and planning, machine learning

6:40

16/11/2020

The EMPATHIC Framework for Task Learning from Implicit Human Feedback

Yuchen Cui, Qiping Zhang, Brad Knox, Alessandro Allievi, Peter Stone, Scott Niekum

5:11

16/11/2020

Interactive Imitation Learning in State-Space

Snehal Jauhri, Carlos Celemin, Jens Kober

5:05

16/11/2020

Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models

Ethan Wilcox, Peng Qian, Richard Futrell, Ryosuke Kohita, Roger Levy, Miguel Ballesteros

Keywords: learning outcomes, syntactic representations, neural models, n-gram baseline

11:29

02/02/2021

Learning to Reweight with Deep Interactions

Yang Fan, Yingce Xia, Lijun Wu, Shufang Xie, Weiqing Liu, Jiang Bian, Tao Qin, Xiang-Yang Li

14:06

12/07/2020

Self-PU: Self Boosted and Calibrated Positive-Unlabeled Training

Xuxi Chen, Wuyang Chen, Tianlong Chen, Ye Yuan, Chen Gong, Kewei Chen, Zhangyang Wang

Keywords: Supervised Learning

7:05

08/12/2020

Collective Wisdom: Improving Low-resource Neural Machine Translation using Adaptive Knowledge Distillation

Fahimeh Saleh, Wray Buntine, Gholamreza Haffari

9:03

22/11/2021

Class-Balanced Distillation for Long-Tailed Visual Recognition

Ahmet Iscen, Andre Araujo, Boqing Gong, Cordelia Schmid

Keywords: Long tailed recognition, dataset imbalance

3:02

16/11/2020

Assessing the Helpfulness of Learning Materials with Inference-Based Learner-Like Agent

Yun-Hsuan Jen, Chieh-Yang Huang, MeiHua Chen, Ting-Hao Huang, Lun-Wei Ku

Keywords: sentence tasks, classroom study, english-as-a-second learners, inference-based agent

11:48

04/07/2020

Learning Web-based Procedures by Reasoning over Explanations and Demonstrations in Context

Shashank Srivastava, Oleksandr Polozov, Nebojsa Jojic, Christopher Meek

Keywords: learning tasks, semantic parsing, mapping explanations, web-based tasks

12:12

06/12/2020

Learning Multi-Agent Communication through Structured Attentive Reasoning

Murtaza Rangwala, Ryan K Williams

3:21

02/02/2021

Automatic Curriculum Learning With Over-repetition Penalty for Dialogue Policy Learning

Yangyang Zhao, Zhenyu Wang, Zhenhua Huang

15:41

02/02/2021

The Sample Complexity of Teaching by Reinforcement on Q-Learning

Xuezhou Zhang, Shubham Bharti, Yuzhe Ma, Adish Singla, Xiaojin Zhu

14:48

06/12/2021

Comprehensive Knowledge Distillation with Causal Intervention

Xiang Deng, Zhongfei Zhang

Keywords: representation learning, causality

12:24

22/11/2021

Rich Semantics Improve Few-Shot Learning

Mohamed Afham Mohamed Aflal, Salman Khan, Muhammad Haris Khan, Muzammal Naseer, Fahad Shahbaz Khan

Keywords: few shot learning, multimodal learning, transformers in vision

2:47

06/12/2021

VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer

Zineng Tang, Jaemin Cho, Hao Tan, Mohit Bansal

Keywords: language

10:13

06/12/2021

Teaching an Active Learner with Contrastive Examples

Chaoqi Wang, Adish Singla, Yuxin Chen

Keywords: optimization, active learning

10:10

04/07/2020

Single-/Multi-Source Cross-Lingual NER via Teacher-Student Learning on Unlabeled Data in Target Language

Qianhui Wu, Zijia Lin, Börje Karlsson, Jian-Guang Lou, Biqing Huang

Keywords: Single-/Multi-Source NER, named problem, cross-lingual NER, single-source NER

10:54

14/06/2020

Distilling Cross-Task Knowledge via Relationship Matching

Han-Jia Ye, Su Lu, De-Chuan Zhan

Keywords: knowledge distillation, model reuse, knowledge transfer, cross-task learning, embedding learning

4:54

26/04/2020

Deep Symbolic Superoptimization Without Human Knowledge

Hui Shi, Yang Zhang, Xinyun Chen, Yuandong Tian, Jishen Zhao

5:01

16/11/2020

Contrastive Distillation on Intermediate Representations for Language Model Compression

Siqi Sun, Zhe Gan, Yuwei Fang, Yu Cheng, Shuohang Wang, Jingjing Liu

Keywords: contrastive distillation, compress models, pre-training stages, existing methods

8:19

14/09/2020

Partial Label Learning via Self-Paced Curriculum Strategy

Gengyu Lyu, Songhe Feng, Yi Jin, Yidong Li

Keywords: partial-label learning, self-paced learning strategy, curriculum learning strategy, instructor-student-collaborative

6:46

16/11/2020

Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision

Hao Tan, Mohit Bansal

Keywords: speaking, writing, text-only self-supervision, pure-language tasks

11:59

03/05/2021

Understanding and Improving Lexical Choice in Non-Autoregressive Translation

Liam Ding, Longyue Wang, Xuebo Liu, Derek Wong, Dacheng Tao, Zhaopeng Tu

11:37

08/12/2020

Exploring Question-Specific Rewards for Generating Deep Questions

Yuxi Xie, Liangming Pan, Dongzhe Wang, Min-Yen Kan, Yansong Feng

13:08

02/02/2021

An Adaptive Hybrid Framework for Cross-domain Aspect-based Sentiment Analysis

Yan Zhou, Fuqing Zhu, Pu Song, Jizhong Han, Tao Guo, Songlin Hu

17:23