Diversity Enhanced Active Learning with Strictly Proper Scoring Rules

06/12/2021

Wei Tan, Lan Du, Wray Buntine

Keywords: machine learning, active learning

Abstract: We study acquisition functions for active learning (AL) in text classification. The Expected Loss Reduction (ELR) method focuses on a Bayesian estimate of the reduction in classification error, recently updated with Mean Objective Cost of Uncertainty (MOCU). We convert the ELR framework to estimate the increase in (strictly proper) scores like log probability or negative mean square error, which we call Bayesian Estimate of Mean Proper Scores (BEMPS). We also prove convergence results borrowing techniques used with MOCU. To allow better experimentation with the new acquisition functions, we develop a complementary batch AL algorithm, which encourages diversity in the vector of expected changes in scores for unlabelled data. To allow high-performance text classifiers, we combine ensembling and dynamic validation set construction on pretrained language models. Extensive experimental evaluation then explores how these different acquisition functions perform. The results show that the use of mean square error and log probability with BEMPS yields robust acquisition functions, which consistently outperform the others tested.
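The two scores named in the abstract, log probability and negative mean square error, are strictly proper: the expected score is highest only when the forecast equals the true label distribution. The sketch below illustrates that property on a toy three-class problem. It is not the authors' BEMPS implementation; the function names, the distribution `q`, and the candidate forecasts are assumptions made up for illustration.

```python
import math

def log_score(p, y):
    """Log probability that forecast p assigns to the observed class y."""
    return math.log(p[y])

def neg_squared_error(p, y):
    """Negative squared error between p and the one-hot vector for class y."""
    return -sum((p_k - (1.0 if k == y else 0.0)) ** 2 for k, p_k in enumerate(p))

def expected_score(score, p, q):
    """Expected score of forecast p when labels are drawn from distribution q."""
    return sum(q_y * score(p, y) for y, q_y in enumerate(q))

# Hypothetical true label distribution and candidate forecasts (assumed values).
q = [0.7, 0.2, 0.1]
candidates = [
    [0.7, 0.2, 0.1],      # the true distribution itself
    [0.9, 0.05, 0.05],    # overconfident
    [1 / 3, 1 / 3, 1 / 3],  # uninformative
    [0.5, 0.3, 0.2],      # underconfident
]

# For a strictly proper score, the truthful forecast has the highest expected score.
best_log = max(candidates, key=lambda p: expected_score(log_score, p, q))
best_sq = max(candidates, key=lambda p: expected_score(neg_squared_error, p, q))
print(best_log, best_sq)
```

Under both scores the forecast that matches `q` wins among these candidates, which is the property that makes such scores usable as a target for estimating expected improvement.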

The talk and the accompanying paper were published at the NeurIPS 2021 virtual conference.

Similar Papers

02/02/2021

A Simple and Effective Self-Supervised Contrastive Learning Framework for Aspect Detection

Tian Shi, Liuqing Li, Ping Wang, Chandan K. Reddy

19:21

16/11/2020

Dynamic Data Selection and Weighting for Iterative Back-Translation

Zi-Yi Dou, Antonios Anastasopoulos, Graham Neubig

Keywords: neural translation, neural nmt, nmt, domain adaptation

11:30

26/08/2020

Better Long-Range Dependency By Bootstrapping A Mutual Information Regularizer

Yanshuai Cao, Peng Xu

15:00

19/04/2021

Modelling context emotions using multi-task learning for emotion controlled dialog generation

Deeksha Varshney, Asif Ekbal, Pushpak Bhattacharyya

9:50

02/02/2021

Harmonized Dense Knowledge Distillation Training for Multi-Exit Architectures

Xinglu Wang, Yingming Li

15:12

19/04/2021

Does the order of training samples matter? Improving neural data-to-text generation with curriculum learning

Ernie Chang, Hui-Syuan Yeh, Vera Demberg

5:42

04/07/2020

Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks

Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, Noah A. Smith

Keywords: NLP, classification tasks, pretraining, domain-adaptive pretraining

11:10

04/07/2020

Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction

Raphael Schumann, Lili Mou, Yao Lu, Olga Vechtomova, Katja Markert

Keywords: Unsupervised Summarization, Word-Level Extraction, Automatic summarization, Discrete Optimization

10:39

04/07/2020

Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation

Xuanli He, Gholamreza Haffari, Mohammad Norouzi

Keywords: Subword Segmentation, Neural Translation, learning, inference

10:49

08/12/2020

SpanAlign: Sentence Alignment Method based on Cross-Language Span Prediction and ILP

Katsuki Chousa, Masaaki Nagata, Masaaki Nishino

14:39

04/07/2020

Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions

Hannah Craighead, Andrew Caines, Paula Buttery, Helen Yannakoudakis

Keywords: automated transcriptions, automatically speech, multi-task learning, inductive transfer

11:37

06/12/2021

FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition

Yichong Leng, Xu Tan, Linchen Zhu, Jin Xu, Renqian Luo, Linquan Liu, Tao Qin, Xiangyang Li, Edward Lin, Tie-Yan Liu

13:44

06/12/2020

Unsupervised Text Generation by Learning from Search

Jingjing Li, Zichao Li, Lili Mou, Xin Jiang, Michael Lyu, Irwin King

3:24

18/07/2021

Putting the "Learning" into Learning-Augmented Algorithms for Frequency Estimation

Elbert Du, Franklyn Wang, Michael Mitzenmacher

Keywords: Applications, Hardware and Systems

5:17

04/07/2020

Automatic Machine Translation Evaluation using Source Language Inputs and Cross-lingual Language Model

Kosuke Takahashi, Katsuhito Sudoh, Satoshi Nakamura

Keywords: Automatic Evaluation, machine translation, Cross-lingual Model, regression model

7:17

26/04/2020

Residual Energy-Based Models for Text Generation

Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato

Keywords: energy-based models, text generation

4:59

08/12/2020

Bayesian Methods for Semi-supervised Text Annotation

Kristian Miok, Gregor Pirs, Marko Robnik-Sikonja

11:18

19/04/2021

El volumen louder por favor: Code-switching in task-oriented semantic parsing

Arash Einolghozati, Abhinav Arora, Lorena Sainz-Maza Lecanda, Anuj Kumar, Sonal Gupta

11:39

03/05/2021

Representation Learning for Sequence Data with Deep Autoencoding Predictive Components

Junwen Bai, Weiran Wang, Yingbo Zhou, Caiming Xiong

Keywords: Unsupervised Learning, Mutual Information, Masked Reconstruction, Sequence Data

5:08

03/05/2021

On Learning Universal Representations Across Languages

Xiangpeng Wei, Rongxiang Weng, Yue Hu, Luxi Xing, Heng Yu, Weihua Luo

Keywords: hierarchical contrastive learning, cross-lingual pretraining, universal representation learning

3:51

19/04/2021

Cognition-aware cognate detection

Diptesh Kanojia, Prashant Sharma, Sayali Ghodekar, Pushpak Bhattacharyya, Gholamreza Haffari, Malhar Kulkarni

8:53

16/11/2020

PatchBERT: Just-in-Time, Out-of-Vocabulary Patching

Sangwhan Moon, Naoaki Okazaki

Keywords: natural processing, downstream tasks, mitigation, large models

7:02

16/11/2020

X-SRL: A Parallel Cross-Lingual Semantic Role Labeling Dataset

Angel Daza, Anette Frank

Keywords: generalization learning, multilingual learning, high-quality translation, srl

9:24

16/11/2020

Precise Task Formalization Matters in Winograd Schema Evaluations

Haokun Liu, William Huang, Dhara Mungra, Samuel R. Bowman

Keywords: task formalization, input specification, ablation, formalization decisions

4:43

03/05/2021

A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks

Nikunj Saunshi, Sadhika Malladi, Sanjeev Arora

Keywords: representation learning, self-supervised learning, language models, theory, transfer learning, natural language processing, unsupervised learning

5:16

04/07/2020

Self-Attention with Cross-Lingual Position Representation

Liang Ding, Longyue Wang, Dacheng Tao

Keywords: natural tasks, WMT'17 tasks, Cross-Lingual Representation, Position encoding

7:46

16/11/2020

Top-Rank-Focused Adaptive Vote Collection for the Evaluation of Domain-Specific Semantic Models

Pierangelo Lombardo, Alessio Boiardi, Luca Colombo, Angelo Schiavone, Nicolò Tamagnone

Keywords: content-based recommenders, construction, top-rank evaluation, semantic models

12:03

08/12/2020

Exploring diachronic syntactic shifts with dependency length: the case of scientific English

Tom S Juzek, Marie-Pauline Krielke, Elke Teich

15:27

16/11/2020

The Secret is in the Spectra: Predicting Cross-lingual Task Performance with Spectral Similarity Measures

Haim Dubossarsky, Ivan Vulić, Roi Reichart, Anna Korhonen

Keywords: cross-lingual tasks, large-scale study, bli, parsing

12:18

16/11/2020

Interactive Refinement of Cross-Lingual Word Embeddings

Michelle Yuan, Mozhi Zhang, Benjamin Van Durme, Leah Findlater, Jordan Boyd-Graber

Keywords: classification problem, cross-lingual embeddings, clime, interactive system

9:56

12/07/2020

Educating Text Autoencoders: Latent Representation Guidance via Denoising

Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola

Keywords: Deep Learning - Generative Models and Autoencoders

17:06

08/12/2020

Mitigating Silence in Compliance Terminology during Parsing of Utterances

Esme Manandise, Conrad de Peuter

17:48

08/12/2020

Exploring Cross-sentence Contexts for Named Entity Recognition with BERT

Jouni Luoma, Sampo Pyysalo

14:39

16/11/2020

Improving Text Generation with Student-Forcing Optimal Transport

Jianqiao Li, Chunyuan Li, Guoyin Wang, Hao Fu, Yuhchen Lin, Liqun Chen, Yizhe Zhang, Chenyang Tao, Ruiyi Zhang, Wenlin Wang, Dinghan Shen, Qian Yang, Lawrence Carin

Keywords: testing, ot learning, machine translation, text summarization

11:51

02/02/2021

A Unified Pretraining Framework for Passage Ranking and Expansion

Ming Yan, Chenliang Li, Bin Bi, Wei Wang, Songfang Huang

16:33

26/04/2020

Multilingual Alignment of Contextual Word Representations

Steven Cao, Nikita Kitaev, Dan Klein

Keywords: multilingual, natural language processing, embedding alignment, BERT, word embeddings, transfer

4:55

30/11/2020

Show, Conceive and Tell: Image Captioning with Prospective Linguistic Information

Yiqing Huang, Jiansheng Chen

7:08

05/12/2020

English intermediate-task training improves zero-shot cross-lingual transfer too

Jason Phang, Iacer Calixto, Phu Mon Htut, Yada Pruksachatkun, Haokun Liu, Clara Vania, Katharina Kann, Samuel R. Bowman

14:13

19/04/2021

Interpretability for morphological inflection: From character-level predictions to subword-level rules

Tatyana Ruzsics, Olga Sozinova, Ximena Gutierrez-Vasques, Tanja Samardzic

10:53

03/05/2021

Rethinking Positional Encoding in Language Pre-training

Guolin Ke, Di He, Tie-Yan Liu

Keywords: Natural Language Processing, Pre-training

4:49