How Much Knowledge Can You Pack Into the Parameters of a Language Model?

16/11/2020

Adam Roberts, Colin Raffel, Noam Shazeer

Keywords: fine-tuning models, neural models, open-domain systems, model size

Abstract: It has recently been observed that neural language models trained on unstructured text can implicitly store and retrieve knowledge using natural language queries. In this short paper, we measure the practical utility of this approach by fine-tuning pre-trained models to answer questions without access to any external context or knowledge. We show that this approach scales with model size and performs competitively with open-domain systems that explicitly retrieve answers from an external knowledge source when answering questions. To facilitate reproducibility and future work, we release our code and trained models.

The talk and the accompanying paper were published at the EMNLP 2020 virtual conference.

Similar Papers

12/07/2020

Retrieval Augmented Language Model Pre-Training

Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Mingwei Chang

Keywords: Applications - Language, Speech and Dialog

14:44

03/05/2021

Distilling Knowledge from Reader to Retriever for Question Answering

Gautier Izacard, Edouard Grave

Keywords: question answering, information retrieval

5:14

03/05/2021

Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning

Valerie Chen, Abhinav Gupta, Kenny Marino

5:04

06/12/2021

Neural Program Generation Modulo Static Analysis

Rohan Mukherjee, Yeming Wen, Dipak Chaudhari, Thomas Reps, Swarat Chaudhuri, Christopher Jermaine

Keywords: deep learning, transformers, generative model

14:58

04/07/2020

Learning Web-based Procedures by Reasoning over Explanations and Demonstrations in Context

Shashank Srivastava, Oleksandr Polozov, Nebojsa Jojic, Christopher Meek

Keywords: learning tasks, semantic parsing, mapping explanations, web-based tasks

12:12

16/11/2020

Few-Shot Complex Knowledge Base Question Answering via Meta Reinforcement Learning

Yuncheng Hua, Yuan-Fang Li, Gholamreza Haffari, Guilin Qi, Tongtong Wu

Keywords: program induction, meta-training, cqa, neural approach

12:41

04/07/2020

Language to Network: Conditional Parameter Adaptation with Natural Language Descriptions

Tian Jin, Zhun Liu, Shengjia Yan, Alexandre Eichenberger, Louis-Philippe Morency

Keywords: Transfer learning, computer tasks, fine-tuning, Conditional Adaptation

5:42

06/12/2020

Learning Sparse Prototypes for Text Generation

Junxian He, Taylor Berg-Kirkpatrick, Graham Neubig

3:22

16/11/2020

Few-shot Object Grounding and Mapping for Natural Language Robot Instruction Following

Valts Blukis, Ross Knepper, Yoav Artzi

5:06

03/05/2021

On the Dynamics of Training Attention Models

Haoye Lu, Yongyi Mao, Amiya Nayak

5:09

16/11/2020

Meta Fine-Tuning Neural Language Models for Multi-Domain Text Mining

Chengyu Wang, Minghui Qiu, Jun Huang, Xiaofeng He

Keywords: nlp tasks, fine-tuning, learning process, multi-domain tasks

9:58

04/07/2020

Programming in Natural Language with fuSE: Synthesizing Methods from Spoken Utterances Using Deep Natural Language Understanding

Sebastian Weigelt, Vanessa Steurer, Tobias Hey, Walter F. Tichy

Keywords: intelligent systems, information retrieval, Deep Understanding, end-user programming

11:41

06/12/2020

Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge

Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant

3:28

18/07/2021

LTL2Action: Generalizing LTL Instructions for Multi-Task RL

Pashootan Vaezipoor, Andrew C Li, Rodrigo A Toro Icarte, Sheila McIlraith

Keywords: Reinforcement Learning and Planning

5:07

08/12/2020

Designing Templates for Eliciting Commonsense Knowledge from Pretrained Sequence-to-Sequence Models

Jheng-Hong Yang, Sheng-Chieh Lin, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, Jimmy Lin

9:14

06/12/2021

A Framework to Learn with Interpretation

Jayneel Parekh, Pavlo Mozharovskyi, Florence d'Alché-Buc

Keywords: deep learning, interpretability

14:05

16/11/2020

Discern: Discourse-Aware Entailment Reasoning Network for Conversational Machine Reading

Yifan Gao, Chien-Sheng Wu, Jingjing Li, Shafiq Joty, Steven C.H. Hoi, Caiming Xiong, Irwin King, Michael Lyu

Keywords: document interpretation, dialog understanding, conversational reading, discern

11:47

15/11/2020

Interactive Synthesis of Temporal Specifications from Examples and Natural Language

Ivan Gavran, Eva Darulova, Rupak Majumdar

Keywords: robots, program synthesis, LTL, specification, natural language processing

15:41

02/02/2021

LRSC: Learning Representations for Subspace Clustering

Changsheng Li, Chen Yang, Bo Liu, Ye Yuan, Guoren Wang

15:09

04/07/2020

Tabula nearly Rasa: Probing the linguistic knowledge of character-level neural language models trained on unsegmented text

Michael Hahn, Marco Baroni

Keywords: natural tasks, morphological tasks, language usage, Tabula

14:40

06/12/2021

Sequence-to-Sequence Learning with Latent Neural Grammars

Yoon Kim

Keywords: deep learning

14:31

18/07/2021

Latent Space Energy-Based Model of Symbol-Vector Coupling for Text Generation and Classification

Bo Pang, Ying Nian Wu

Keywords: Algorithms, Unsupervised Learning

5:17

29/06/2020

Improved automatic summarization of subroutines via attention to file context

Sakib Haque, Alexander LeClair, Lingfei Wu, Collin McMillan

Keywords: neural networks, natural language processing, documentation generation, source code summarization, artificial intelligence

16:04

06/12/2020

Towards Neural Programming Interfaces

Zachary Brown, Nathaniel Robinson, David Wingate, Nancy Fulda

3:12

16/11/2020

An Imitation Game for Learning Semantic Parsers from User Interaction

Ziyu Yao, Yiqi Tang, Wen-tau Yih, Huan Sun, Yu Su

Keywords: bootstrapping, fine-tuning parsers, theoretical analysis, text-to-sql problem

11:49

16/11/2020

Coarse-to-Fine Pre-training for Named Entity Recognition

Xue Mengge, Bowen Yu, Zhenyu Zhang, Tingwen Liu, Yue Zhang, Bin Wang

Keywords: named recognition, bert, entity task, pre-training approaches

9:23

08/12/2020

On a Chatbot Navigating a User through a Concept-Based Knowledge Model

Boris Galitsky, Dmitry Ilvovsky, Elizaveta Goncharova

14:49

08/12/2020

Best Practices for Data-Efficient Modeling in NLG: How to Train Production-Ready Neural Models with Less Data

Ankit Arun, Soumya Batra, Vikas Bhardwaj, Ashwini Challa, Pinar Donmez, Peyman Heidari, Hakan Inan, Shashank Jain, Anuj Kumar, Shawn Mei, Karthik Mohan, Michael White

15:01

29/06/2020

Embedding Java classes with code2vec: Improvements from variable obfuscation

Rhys Compton, Eibe Frank, Panos Patros, Abigail Koay

Keywords: code2vec, machine learning, code obfuscation, source code, neural networks

14:20

16/11/2020

Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision

Hao Tan, Mohit Bansal

Keywords: speaking, writing, text-only self-supervision, pure-language tasks

11:59

04/07/2020

SenseBERT: Driving Some Sense into BERT

Yoav Levine, Barak Lenz, Or Dagan, Ori Ram, Dan Padnos, Or Sharir, Shai Shalev-Shwartz, Amnon Shashua, Yoav Shoham

Keywords: natural understanding, lexical understanding, SemEval Disambiguation, task

10:53

16/11/2020

PathQG: Neural Question Generation from Facts

Siyuan Wang, Zhongyu Wei, Zhihao Fan, Zengfeng Huang, Weijian Sun, Qi Zhang, Xuanjing Huang

Keywords: question generation, query learning, query-based generation, sequence problem

11:16

18/07/2021

Grey-box Extraction of Natural Language Models

Santiago Zanella-Beguelin, Shruti Tople, Andrew Paverd, Boris Köpf

Keywords: Algorithms, Unsupervised Learning, Probabilistic Methods; Probabilistic Methods, Graphical Models, Social Aspects of Machine Learning, Privacy, Anonymity, and Security

5:22

19/04/2021

BERTese: Learning to speak to BERT

Adi Haviv, Jonathan Berant, Amir Globerson

6:54

06/12/2020

Latent Template Induction with Gumbel-CRFs

Yao Fu, Chuanqi Tan, Bin Bi, Mosha Chen, Yansong Feng, Alexander Rush

3:14

02/02/2021

Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training

Peng Shi, Patrick Ng, Zhiguo Wang, Henghui Zhu, Alexander Hanbo Li, Jun Wang, Cicero Nogueira dos Santos, Bing Xiang

15:15

04/07/2020

Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks

Yiping Song, Zequn Liu, Wei Bi, Rui Yan, Ming Zhang

Keywords: Few-shot Tasks, open-domain systems, generative models, meta-learning framework

11:43

04/07/2020

IMoJIE: Iterative Memory-Based Joint Open Information Extraction

Keshav Kolluru, Samarth Aggarwal, Vipul Rathore, Mausam, Soumen Chakrabarti

Keywords: Iterative Extraction, Open Extraction, IMoJIE, Iterative

9:31

04/07/2020

Logical Natural Language Generation from Open-Domain Tables

Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang

Keywords: Logical Generation, neural NLG, surface-level realizations, logical inference

11:48

16/11/2020

Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning

Tao Shen, Yi Mao, Pengcheng He, Guodong Long, Adam Trischler, Weizhu Chen

Keywords: self-supervised tasks, pre-training, entity linking, finetuning

11:38