To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging

16/11/2020

Kasturi Bhattacharjee, Miguel Ballesteros, Rishita Anubhai, Smaranda Muresan, Jie Ma, Faisal Ladhak, Yaser Al-Onaizan

Keywords: learning representations, downstream tasks, cross-view cvt, sequence tasks

Abstract: Leveraging large amounts of unlabeled data with Transformer-like architectures such as BERT has gained popularity in recent times, owing to their effectiveness in learning general representations that can then be fine-tuned for downstream tasks to much success. However, training these models can be costly from both an economic and an environmental standpoint. In this work, we investigate how to use unlabeled data effectively by exploring the task-specific semi-supervised approach Cross-View Training (CVT) and comparing it with task-agnostic BERT in multiple settings that include domain- and task-relevant English data. CVT uses a much lighter model architecture, and we show that it achieves performance similar to BERT on a set of sequence tagging tasks, with lower financial and environmental impact.

The talk and the corresponding paper were published at the EMNLP 2020 virtual conference.

Similar Papers

04/07/2020

DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference

Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin

Keywords: Accelerating Inference, NLP applications, inference, real-time applications

6:56

16/11/2020

TernaryBERT: Distillation-aware Ultra-low Bit BERT

Wei Zhang, Lu Hou, Yichun Yin, Lifeng Shang, Xiao Chen, Xin Jiang, Qun Liu

Keywords: natural tasks, training process, transformer-based models, bert

8:41

26/04/2020

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut

Keywords: Natural Language Processing, BERT, Representation Learning

4:59

19/10/2020

TwinBERT: Distilling knowledge to twin-structured compressed BERT models for large-scale retrieval

Wenhao Lu, Jian Jiao, Ruofei Zhang

Keywords: knowledge distillation, semantic embedding, sponsored search, bert, information retrieval, deep neural network, deep learning

10:20

02/02/2021

IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization

Wenxuan Zhou, Bill Yuchen Lin, Xiang Ren

16:25

05/12/2020

Towards non-task-specific distillation of BERT via sentence representation approximation

Bowen Wu, Huan Zhang, MengYuan Li, Zongsheng Wang, Qihang Feng, Junhong Huang, Baoxun Wang

10:51

02/02/2021

Noninvasive Self-attention for Side Information Fusion in Sequential Recommendation

Chang Liu, Xiaoguang Li, Guohao Cai, Zhenhua Dong, Hong Zhu, Lifeng Shang

18:26

06/12/2020

ConvBERT: Improving BERT with Span-based Dynamic Convolution

Zi-Hang Jiang, Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

3:20

06/12/2021

CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings

Tatiana Likhomanenko, Qiantong Xu, Gabriel Synnaeve, Ronan Collobert, Alex Rogozhnikov

Keywords: deep learning, transformers

13:30

06/12/2020

MPNet: Masked and Permuted Pre-training for Language Understanding

Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu

3:23

04/07/2020

Relation Extraction with Explanation

Hamed Shahbazi, Xiaoli Fern, Reza Ghaeini, Prasad Tadepalli

Keywords: relation extraction, Explanation, neural models, relation models

6:40

03/05/2021

FairBatch: Batch Selection for Model Fairness

Yuji Roh, Kangwook Lee, Steven Whang, Changho Suh

Keywords: bilevel optimization, batch selection, model fairness

5:04

18/07/2021

Improved Denoising Diffusion Probabilistic Models

Alexander Nichol, Prafulla Dhariwal

Keywords: Deep Learning, Generative Models, Theory, Game Theory and Computational Economics, Reinforcement Learning and Planning, Multi-Agent RL

4:25

06/12/2021

CAM-GAN: Continual Adaptation Modules for Generative Adversarial Networks

Sakshi Varshney, Vinay Kumar Verma, P. K. Srijith, Lawrence Carin, Piyush Rai

Keywords: generative model, representation learning, continual learning

14:50

22/09/2020

DRecPy: A python framework for developing deep learning-based recommenders

Fábio Colaço, Márcia Barros, Francisco M. Couto

Keywords: extensibility, reproducibility, evaluation, implementation, deep learning

2:43

16/11/2020

An Unsupervised Sentence Embedding Method by Mutual Information Maximization

Yan Zhang, Ruidan He, Zuozhu Liu, Kwan Hui Lim, Lidong Bing

Keywords: sentence-pair tasks, clustering, semantic search, downstream tasks

12:22

03/05/2021

Neural Topic Model via Optimal Transport

He Zhao, Dinh Phung, Viet Huynh, Trung Le, Wray Buntine

Keywords: optimal transport, document analysis, topic modelling

9:29

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

11:27

22/09/2020

Closed-form models for collaborative filtering with side-information

Olivier Jeunen, Jan Van Balen, Bart Goethals

Keywords: side-information, Ridge regression, item metadata

2:30

01/07/2020

Exploring the Limits of Simple Learners in Knowledge Distillation for Document Classification with DocBERT

Ashutosh Adhikari, Achyudh Ram, Raphael Tang, William L. Hamilton, Jimmy Lin

4:55

06/12/2020

Unsupervised Data Augmentation for Consistency Training

Qizhe Xie, Zihang Dai, Eduard Hovy, Thang Luong, Quoc V Le

3:29

18/07/2021

Active Testing: Sample-Efficient Model Evaluation

Jannik Kossen, Sebastian Farquhar, Yarin Gal, Tom Rainforth

Keywords: Algorithms, Active Learning

5:19

02/02/2021

Adversarial Training Reduces Information and Improves Transferability

Matteo Terzi, Alessandro Achille, Marco Maggipinto, Gian Antonio Susto

19:54

22/11/2021

Elsa: Energy-based Learning for Semi-supervised Anomaly Detection

Sungwon Han, HyeonHo Song, Seung Eon Lee, Sungwon Park, Meeyoung Cha

Keywords: contrastive learning, energy-based learning, semi-supervised learning, anomaly detection

2:48

06/12/2020

Incorporating BERT into Parallel Sequence Decoding with Adapters

Junliang Guo, Zhirui Zhang, Linli Xu, Hao-Ran Wei, Boxing Chen, Enhong Chen

3:17

19/08/2021

Modelling General Properties of Nouns by Selectively Averaging Contextualised Embeddings

Na Li, Zied Bouraoui, Jose Camacho-Collados, Luis Espinosa-Anke, Qing Gu, Steven Schockaert

Keywords: Natural Language Processing, Natural Language Semantics

14:09

26/08/2020

Post-Estimation Smoothing: A Simple Baseline for Learning with Side Information

Esther Rolf, Michael Jordan, Benjamin Recht

14:27

04/07/2020

GAN-BERT: Generative Adversarial Learning for Robust Text Classification with a Bunch of Labeled Examples

Danilo Croce, Giuseppe Castellucci, Roberto Basili

Keywords: Robust Classification, Natural tasks, image processing, generative setting

6:48

12/07/2020

Enhancing Simple Models by Exploiting What They Already Know

Amit Dhurandhar, Karthikeyan Shanmugam, Ronny Luss

Keywords: Supervised Learning

13:57

02/02/2021

DecAug: Augmenting HOI Detection via Decomposition

Hao-Shu Fang, Yichen Xie, Dian Shao, Yong-Lu Li, Cewu Lu

9:02

22/11/2021

DRT: Detection Refinement for Multiple Object Tracking

Bisheng Wang, Christian Fruhwirth-Reisinger, Horst Possegger, Horst Bischof, Guo Cao

Keywords: Multiple Object Tracking, Tracking by Detection, Detection Refinement

2:57

16/11/2020

Unsupervised Adaptation of Question Answering Systems via Generative Self-training

Steven Rennie, Etienne Marcheret, Neil Mallinar, David Nahamoo, Vaibhava Goel

Keywords: question-answering tasks, self-supervised tasks, word masking, sentence entailment

13:14

19/08/2021

Fast Multi-label Learning

Xiuwen Gong, Dong Yuan, Wei Bao

Keywords: Machine Learning, Multi-instance; Multi-label; Multi-view learning

15:18

26/04/2020

Improving Neural Language Generation with Spectrum Control

Lingxiao Wang, Jing Huang, Kevin Huang, Ziniu Hu, Guangtao Wang, Quanquan Gu

4:58

18/07/2021

Straight to the Gradient: Learning to Use Novel Tokens for Neural Text Generation

Xiang Lin, Simeng Han, Shafiq Joty

Keywords: Applications, Natural Language Processing

16:00

06/12/2021

On Calibration and Out-of-Domain Generalization

Yoav Wald, Amir Feder, Daniel Greenfeld, Uri Shalit

Keywords: machine learning, domain adaptation, causality

11:00

03/05/2021

Neural Pruning via Growing Regularization

Huan Wang, Can Qin, Yulun Zhang, Yun Fu

Keywords: deep neural network pruning, regularization, Hessian matrix, model compression

6:15

14/06/2020

Inflated Episodic Memory With Region Self-Attention for Long-Tailed Visual Recognition

Linchao Zhu, Yi Yang

Keywords: long-tailed visual recognition, region self-attention, inflated episodic memory, long-tailed video classification

1:00

18/07/2021

Accuracy, Interpretability, and Differential Privacy via Explainable Boosting

Harsha Nori, Rich Caruana, Woody Bu, Judy Hanwen Shen, Janardhan Kulkarni

Keywords: Social Aspects of Machine Learning, Privacy, Anonymity, and Security

5:24

26/04/2020

Reducing Transformer Depth on Demand with Structured Dropout

Angela Fan, Edouard Grave, Armand Joulin

Keywords: reduction, regularization, pruning, dropout, transformer

5:01