Text Classification Using Label Names Only: A Language Model Self-Training Approach

16/11/2020


Yu Meng, Yunyi Zhang, Jiaxin Huang, Chenyan Xiong, Heng Ji, Chao Zhang, Jiawei Han

Keywords: classification, category understanding, document classification, topic classification


Abstract: Current text classification methods typically require a good number of human-labeled documents as training data, which can be costly and difficult to obtain in real applications. Humans can perform classification without seeing any labeled examples but only based on a small set of words describing the categories to be classified. In this paper, we explore the potential of only using the label name of each class to train classification models on unlabeled data, without using any labeled documents. We use pre-trained neural language models both as general linguistic knowledge sources for category understanding and as representation learning models for document classification. Our method (1) associates semantically related words with the label names, (2) finds category-indicative words and trains the model to predict their implied categories, and (3) generalizes the model via self-training. We show that our model achieves around 90% accuracy on four benchmark datasets including topic and sentiment classification without using any labeled documents but learning from unlabeled data supervised by at most 3 words (1 in most cases) per class as the label name.
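Step (1) of the abstract, associating semantically related words with a label name, can be illustrated with a small sketch. It assumes that a pre-trained masked language model has already produced, for each occurrence of the label name in the unlabeled corpus, a list of likely replacement words; those lists are hard-coded here as hypothetical data. The function name `build_category_vocabulary`, its parameters, and the frequency-based ranking are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter


def build_category_vocabulary(predictions_per_occurrence, vocab_size=3,
                              stopwords=frozenset()):
    """Rank candidate words by how many label-name occurrences predicted them.

    predictions_per_occurrence: one list of candidate replacement words per
    occurrence of the label name (hypothetical language model output).
    """
    counts = Counter()
    for predicted_words in predictions_per_occurrence:
        # Count each word at most once per occurrence and skip stopwords.
        counts.update(set(predicted_words) - stopwords)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:vocab_size]]


# Hypothetical replacement words for three occurrences of the label "sports".
sports_predictions = [
    ["sports", "football", "athletic", "the"],
    ["sports", "basketball", "football"],
    ["sports", "football", "games", "the"],
]

sports_vocab = build_category_vocabulary(
    sports_predictions, vocab_size=3, stopwords=frozenset({"the"})
)
print(sports_vocab)  # ['football', 'sports', 'athletic']
```

Words that recur across many contexts of the label name end up at the top of the list, which is one simple way to turn a single label name into a small set of category-related words.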


The talk and the corresponding paper were published at the EMNLP 2020 virtual conference.


Similar Papers

16/11/2020

Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision

Hao Tan, Mohit Bansal

Keywords: speaking, writing, text-only self-supervision, pure-language tasks

16/11/2020

Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks

Trapit Bansal, Rishikesh Jha, Tsendsuren Munkhdalai, Andrew McCallum

Keywords: nlp applications, fine-tuning, meta-learning problem, supervised tasks

04/07/2020

Learning Web-based Procedures by Reasoning over Explanations and Demonstrations in Context

Shashank Srivastava, Oleksandr Polozov, Nebojsa Jojic, Christopher Meek

Keywords: learning tasks, semantic parsing, mapping explanations, web-based tasks

06/12/2020

Learning Compositional Rules via Neural Program Synthesis

Maxwell Nye, Armando Solar-Lezama, Josh Tenenbaum, Brenden Lake


14/06/2020

TransMatch: A Transfer-Learning Scheme for Semi-Supervised Few-Shot Learning

Zhongjie Yu, Lin Chen, Zhongwei Cheng, Jiebo Luo

Keywords: few-shot learning, semi-supervised learning, meta-learning

16/11/2020

Train No Evil: Selective Masking for Task-Guided Pre-Training

Yuxian Gu, Zhengyan Zhang, Xiaozhi Wang, Zhiyuan Liu, Maosong Sun

Keywords: pre-training stage, fine-tuning stage, general pre-training, sentiment tasks

06/12/2020

Uncertainty-aware Self-training for Few-shot Text Classification

Subhabrata Mukherjee, Ahmed Awadallah


08/12/2020

Task-Aware Representation of Sentences for Generic Text Classification

Kishaloy Halder, Alan Akbik, Josip Krapac, Roland Vollgraf


16/11/2020

Learning a natural-language to LTL executable semantic parser for grounded robotics

Christopher Wang, Candace Ross, Yen-Ling Kuo, Boris Katz, Andrei Barbu

04/07/2020

LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from Explanation

Dong-Ho Lee, Rahul Khanna, Bill Yuchen Lin, Seyeon Lee, Qinyuan Ye, Elizabeth Boschee, Leonardo Neves, Xiang Ren

Keywords: sequence tasks, NLP tasks, named recognition, relation extraction

16/11/2020

Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models

Ethan Wilcox, Peng Qian, Richard Futrell, Ryosuke Kohita, Roger Levy, Miguel Ballesteros

Keywords: learning outcomes, syntactic representations, neural models, n-gram baseline

02/02/2021

Towards Semantics-Enhanced Pre-Training: Can Lexicon Definitions Help Learning Sentence Meanings?

Xuancheng Ren, Xu Sun, Houfeng Wang, Qun Liu


03/05/2021

Learning Task-General Representations with Generative Neuro-Symbolic Modeling

Reuben Feinman, Brenden Lake

Keywords: probabilistic programs, neuro-symbolic models, few-shot concept learning, generative models

03/05/2021

Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning

Valerie Chen, Abhinav Gupta, Kenny Marino


02/02/2021

Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training

Peng Shi, Patrick Ng, Zhiguo Wang, Henghui Zhu, Alexander Hanbo Li, Jun Wang, Cicero Nogueira dos Santos, Bing Xiang

06/12/2020

Bongard-LOGO: A New Benchmark for Human-Level Concept Learning and Reasoning

Weili Nie, Zhiding Yu, Lei Mao, Ankit Patel, Yuke Zhu, Anima Anandkumar

16/11/2020

Learning Structured Representations of Entity Names using Active Learning and Weak Supervision

Kun Qian, Poornima Chozhiyath Raman, Yunyao Li, Lucian Popa

Keywords: entity-related tasks, entity normalization, variant generation, implicit names

16/11/2020

The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection

Zibo Lin, Deng Cai, Yan Wang, Xiaojiang Liu, Haitao Zheng, Shuming Shi

Keywords: response selection, retrieval-based systems, learning-to-rank problem, learning-to-rank

12/07/2020

Retrieval Augmented Language Model Pre-Training

Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Mingwei Chang

Keywords: Applications - Language, Speech and Dialog

16/11/2020

Named Entity Recognition Only from Word Embeddings

Ying Luo, Hai Zhao, Junlang Zhan

Keywords: named recognition, entity detection, type prediction, deep models

08/12/2020

TableGPT: Few-shot Table-to-Text Generation with Table Structure Reconstruction and Content Matching

Heng Gong, Yawei Sun, Xiaocheng Feng, Bing Qin, Wei Bi, Xiaojiang Liu, Ting Liu

18/07/2021

Latent Space Energy-Based Model of Symbol-Vector Coupling for Text Generation and Classification

Bo Pang, Ying Nian Wu

Keywords: Algorithms, Unsupervised Learning

12/07/2020

Learning Compound Tasks without Task-specific Knowledge via Imitation and Self-supervised Learning

Sang-Hyun Lee, Seung-Woo Seo

Keywords: Reinforcement Learning - Deep RL

22/11/2021

Simpler Does It: Generating Semantic Labels with Objectness Guidance

Md Amirul Islam, Matthew Kowal, Sen Jia, Konstantinos Derpanis, Neil Bruce

Keywords: Weakly supervised segmentation, semi supervised segmentation, Pseudo-label generation, Class Activation Maps, Objectness, Saliency

16/11/2020

Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models

Isabel Papadimitriou, Dan Jurafsky

Keywords: analyzing structure, encoding structure, natural acquisition, transfer learning

04/07/2020

Contextualized Weak Supervision for Text Classification

Dheeraj Mekala, Jingbo Shang

Keywords: Text Classification, Weakly classification, string matching, Contextualized Supervision

16/11/2020

Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning

Tao Shen, Yi Mao, Pengcheng He, Guodong Long, Adam Trischler, Weizhu Chen

Keywords: self-supervised tasks, pre-training, entity linking, finetuning

16/11/2020

Visually Grounded Continual Learning of Compositional Phrases

Xisen Jin, Junyi Du, Arka Sadhu, Ram Nevatia, Xiang Ren

Keywords: visually task, continual phrases, visually-grounded task, compositional generalization

05/12/2020

Systematic generalization on gSCAN with language conditioned embedding

Tong Gao, Qi Huang, Raymond Mooney


03/05/2021

Disambiguating Symbolic Expressions in Informal Documents

Dennis Müller, Cezary Kaliszyk


03/05/2021

SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing

Tao Yu, Rui Zhang, Alex Polozov, Christopher Meek, Ahmed H Awadallah

04/07/2020

Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning

Hongliang Fei, Ping Li

Keywords: Cross-Lingual Classification, sentiment classification, unsupervised system, classification

22/11/2021

Rich Semantics Improve Few-Shot Learning

Mohamed Afham Mohamed Aflal, Salman Khan, Muhammad Haris Khan, Muzammal Naseer, Fahad Shahbaz Khan

Keywords: few shot learning, multimodal learning, transformers in vision

16/11/2020

Ad-hoc Document Retrieval using Weak-Supervision with BERT and GPT2

Yosi Mass, Haggai Roitman

Keywords: ad-hoc retrieval, manually data, weakly-supervised method, deep models

16/11/2020

Coarse-to-Fine Pre-training for Named Entity Recognition

Xue Mengge, Bowen Yu, Zhenyu Zhang, Tingwen Liu, Yue Zhang, Bin Wang

Keywords: named recognition, bert, entity task, pre-training approaches

06/12/2020

Learning Sparse Prototypes for Text Generation

Junxian He, Taylor Berg-Kirkpatrick, Graham Neubig


04/07/2020

Tabula nearly Rasa: Probing the linguistic knowledge of character-level neural language models trained on unsegmented text

Michael Hahn, Marco Baroni

Keywords: natural tasks, morphological tasks, language usage, Tabula

05/01/2021

Utilizing Every Image Object for Semi-Supervised Phrase Grounding

Haidong Zhu, Arka Sadhu, Zhaoheng Zheng, Ram Nevatia


04/07/2020

MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices

Zhiqing Sun, Hongkun Yu, Xiaodan Song, Renjie Liu, Yiming Yang, Denny Zhou

Keywords: Natural NLP, NLP tasks, knowledge transfer, natural tasks

26/04/2020

Meta-Learning without Memorization

Mingzhang Yin, George Tucker, Mingyuan Zhou, Sergey Levine, Chelsea Finn

Keywords: meta-learning, memorization, regularization, overfitting, mutually-exclusive